Comment by Eridrus
6 hours ago
Yeah, this is less of a benchmark and more "I like this one, guys!".
The grading criteria are totally subjective and applied to a single poorly defined example, with no end use case in mind to guide how to even do the evaluation.
It's still interesting in a similar way to Simon Willison's Pelicans on a bicycle.
The Pelicans are mostly just entertainment.