Comment by giancarlostoro
3 days ago
Are you using Claude Code? Because that might be the secret sauce you're missing. With Claude Code I can instruct it to validate things after it's done with code, and usually it finds that it goofed. I can also tell it to work on, say, five different things and go "hey, spin up some agents to work on this," and it will spawn five agents in parallel to work on them.
I've basically ditched Grok et al., and I refuse to give Sam Altman a penny.
For the schema design phase I used the web UI for all three.
A logical bug like using BIGSERIAL for tracking updates (values are generated at insert time, not commit time, so rows can become visible out of order) wouldn't be caught by any number of iterations of Claude Code; it would be found in production after weeks of debugging.
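For anyone who hasn't hit this one, here's a rough sketch of the failure mode. The `events` table and the polling pattern are made up for illustration; the point is only that sequence values are handed out at INSERT time and are not transactional:

```sql
-- Hypothetical table that tracks updates by an ever-increasing id.
CREATE TABLE events (
    id      BIGSERIAL PRIMARY KEY,
    payload TEXT NOT NULL
);

-- Session A: starts first, grabs id = 1 at INSERT time, but commits late.
BEGIN;
INSERT INTO events (payload) VALUES ('slow transaction');
-- ...still doing other work, not yet committed...

-- Session B: starts second, grabs id = 2, commits immediately.
BEGIN;
INSERT INTO events (payload) VALUES ('fast transaction');
COMMIT;

-- A consumer polling for new rows with
--   SELECT * FROM events WHERE id > :last_seen ORDER BY id;
-- now sees id = 2, advances last_seen to 2, and silently skips
-- id = 1 when Session A finally commits below.

COMMIT;  -- Session A: the row with id = 1 only becomes visible now.
```

Nothing in a single-session test run exposes this; it only shows up under concurrent load, which is why it tends to surface in production.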
At this point having any LLM write code without giving it an environment that allows it to execute that code itself is like rolling a heavily-biased random number generator and hoping you get a useful result.
Things get so much more interesting when they're able to execute the code they are writing to see if it actually works.
So much this. Do we program by writing reams of code, never running the compiler until it's all written, and then judge the programmer as terrible when it doesn't compile? Or do we write code incrementally, compiling and testing as we go? So why do we think having the AI do the former, and fail, is setting it up for success? If I had written code on a whiteboard and been judged on my syntax errors, I'd never have gotten a job. Give the AI the tools it needs to succeed, just like you would for a human.