Comment by smlacy
3 hours ago
Post actual results, make a blog post. Don't just say "this sucks" without tangible evidence.
Otherwise you're doomed to "sample size of one" level of relevance.
Then your internal benchmarks will be in the post-training set and you’ll have to make new ones.
I have the opposite experience: random HN/Reddit comments saying “this sucks” or “whoa this is a huge improvement” are the only benchmark that means anything. Standard benchmarks are all gamed and don’t capture the complexity of the real world.