Comment by dust42
19 hours ago
Personally, I'd prefer it if AI models started with a proof of their own statements. Time and again, SOTA frontier models have told me: "Now you have 100% correct code, ready for production, in enterprise quality." Then I run it and it crashes. Or maybe the AI is just being tongue-in-cheek?
Case in point: I just wanted to give z.ai a try and buy some credits. I used Firefox with uBlock and the payment didn't go through. I tried again with Chrome and no adblocker, but then got an error: "Payment Failed: p.confirmCardPayment is not a function." The irony is that this was almost certainly vibe-coded with z.ai, which is trying to sell me on how good it is yet can't close the sale.
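(For what it's worth, `confirmCardPayment` is a real Stripe.js v3 method, so my guess is their checkout grabs the Stripe global without checking that the script actually loaded; when a blocker stops js.stripe.com, the minified alias `p` ends up without that method. A defensive sketch of what the page could do instead; the key and messages are placeholders, not z.ai's code:)

```ts
// Hedged guess at the failure mode: Stripe.js v3 exposes a global
// Stripe() factory, and confirmCardPayment lives on the object it
// returns. If https://js.stripe.com/v3 is blocked, the global is
// missing and any minified alias ("p") has no such method.
const stripeFactory = (window as any).Stripe;

const stripe =
  typeof stripeFactory === "function"
    ? stripeFactory("pk_test_placeholder") // hypothetical publishable key
    : null;

if (!stripe || typeof stripe.confirmCardPayment !== "function") {
  // Fail loudly and usefully instead of "p.confirmCardPayment is not a function"
  throw new Error("Stripe.js did not load; disable blockers or retry");
}

// Only now is the real call safe:
// await stripe.confirmCardPayment(clientSecret, { payment_method: { card } });
```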
And we will get lots more of this in the future. LLMs are a fantastic new technology, but even more fantastically over-hyped.
You get AIs to prove their code is correct in precisely the same ways you get humans to prove their code is correct. You make them demonstrate it through tests or evidence (screenshots, logs of successful runs).
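Concretely, "prove it" can be as small as a runnable test the agent has to pass before claiming victory. A minimal sketch; the `slugify` function and its spec are made up for illustration:

```ts
// Demand a runnable test instead of a claim of correctness.
// Runs with `node --test` (Node 18+); compile or use tsx for TS.
import { test } from "node:test";
import assert from "node:assert/strict";

// Hypothetical function under test.
function slugify(s: string): string {
  return s
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

test("slugify normalizes punctuation and whitespace", () => {
  assert.equal(slugify("  Hello, World!  "), "hello-world");
});
```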
Yes! Also, make sure to check those results yourself, dear reader, rather than ask the agent to summarize the results for you! ^^;
We should differentiate AI models from AI apps.
Models just generate text. Apps are supposed to make that text useful.
An app can run various kinds of verification; see the sketch below. But would you pay extra for that?
Nobody can make a text generator whose output is 100% correct. That's just not something anyone can do today.
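A minimal sketch of that model/app split; the function names and retry policy here are invented for illustration, not any particular product's API:

```ts
// The model just produces text; the app wraps generation with a
// verification step (compile, run tests, etc.) and retries on failure.
type GenerateFn = (prompt: string) => Promise<string>; // the model
type VerifyFn = (code: string) => Promise<boolean>;    // the app's check

async function generateVerified(
  generate: GenerateFn,
  verify: VerifyFn,
  prompt: string,
  maxAttempts = 3,
): Promise<string> {
  for (let i = 0; i < maxAttempts; i++) {
    const candidate = await generate(prompt);
    if (await verify(candidate)) return candidate; // evidence, not promises
  }
  throw new Error(`no candidate passed verification in ${maxAttempts} attempts`);
}
```

Note the loop never guarantees correctness; it only refuses to hand back output that failed whatever checks the app actually ran, which is exactly the gap between a model and an app.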