Comment by throwaway0123_5
8 hours ago
> I have yet to see a "error" that modern frontier models make that I could not imagine a human making
I mostly agree if "a human" is just any person we pluck off the street. What I still see with some regularity is the models (right now, primarily Opus 4.6 through Claude Code) making mistakes that humans:
- working in the same field/area as me (nothing particularly exotic, subfield of CS, not theory)
- with even a fraction of the declarative knowledge about the field as the LLM
- with even a fraction of the abilities frontier LLMs demonstrate in mathematical/informatics Olympiads
would never make. Basically, errors I'd never expect to see from a human coworker (or myself). I don't yet consider myself an expert in my subfield, and I'll almost certainly never be a top expert in it. Often the errors present to me as just "really atrocious intuition." If the LLM ran with some of them, they would cause huge problems.
In many regards the models are clearly superhuman already.