Comment by rishramanathan

3 years ago

We’ve actually been building a testing and evaluation platform from the start, but we began with discriminative ML tasks like classification and regression. We held off on doing a Launch HN because we were mostly focused on enterprise and mid-market customers.

These past few months, however, we’ve prioritized building out features for testing and monitoring LLMs.

LLMs certainly have their own unique challenges, but the evaluation problem in general is not new, and much of what we’ve built over the years applies directly to this new crop of ML use cases!
