Comment by rishramanathan
2 years ago
We’ve actually been building a testing and evaluation platform from the start, but began with discriminative ML tasks like classification and regression. We waited to do a Launch HN because we were mostly focused on enterprise / mid-market.
These past few months, however, we’ve prioritized building out features for testing and monitoring LLMs.
LLMs certainly have their unique challenges, but the evaluation problem in general is not new, and much of what we’ve built historically is very much applicable to this new crop of ML use cases!