Comment by fragmede
15 hours ago
The canonical example I use: how good are you (the philosophical "you") at programming on a whiteboard, given one shot and no tools, vs. at your computer with access to everything? Judging LLMs on that rubric seems as dumb as judging humans by it.