Comment by koakuma-chan
1 day ago
Is anyone working on, or does anyone know of, a library for evaluating LLMs for application features and/or evaluating application features that use LLMs? I am wondering what people use, or whether anyone has built their own solution.
1 day ago
There would be so much subjectivity to this. I like the idea, but executing it in a reliable, repeatable way would be very challenging imo.