Comment by aydyn

1 month ago

Yes, but I am just giving an example of something recent; I could also point to pure logic errors if I go back and search my discussions.

Maybe you are on to something with "classifying" issues: the types of problems LLMs have are hard to categorize, and hence hard to benchmark. Maybe it is just a long tail of many different categories of problems.