Comment by johnfn

10 hours ago

If you can point me to any "random text generator" that scores 76.8 on SWEBench after any number of iterations, or that is competitive on any benchmark at all, I'll happily switch to it. Until then, I don't think that analogy will lead to a particularly productive conversation. Many engineers use a similar harness with LLMs today. No one uses a random text generator to generate code, which is why it isn't a real suggestion.

> The problem is it's devouring your tokens as it does so. While you're on a subsidized plan that seems like a non-issue, but once the providers start charging you actual costs for usage.. yeah, the hallucinations will be a showstopper for you.

The discussion, and your original post, was about whether hallucinations are a meaningful issue today, not in some hypothetical future.