Comment by aydyn
1 month ago
Yes, but I am just giving an example of something recent. I could also point to pure logic errors if I went back and searched my discussions.
Maybe you are on to something with "classifying" issues; the types of problems LLMs have are hard to categorize and hence hard to benchmark. Maybe it is just a long tail of many different categories of problems.