Comment by ErikBjare

1 day ago

A junior dev is not a good approximation of the strengths and weaknesses of these models.

Agreed! The comparison is great for estimating the scope of the tasks they're capable of--they do very well with bite-sized tasks that can be individually verified. But their world knowledge is that of a principal engineer!

I think this is why people struggle so much with agents--they see the agent perform magic on a small task, then assume it can be trusted with a larger one, where it completely falls down.

The post I originally commented on literally made that comparison when describing the models as a massive productivity boost.