Comment by CivBase
12 days ago
> Programming with AI is like tutoring a child. You teach the child, tell it where it made mistakes and you keep iterating and monitoring the child until it makes what you want.
Who are you people who spend so much time writing code that this is a significant productivity boost?
I'm imagining doing this with an actual child and how long it would take for me to get a real return on investment at my job. Never mind that the limited amount of time I get to spend writing code is probably the highlight of my job, and I'd effectively be replacing it with more code reviews.
Here's an example:
I recently inherited a web project over a decade old, full of EOL'd libraries and OS packages, that desperately needed to be modernized.
Within 3 hours I had a working test suite with 80% code coverage on core business functionality (~300 tests). Now, maybe the tests aren't the best designs, given that there is no way I could review that many tests in 3 hours, but I know empirically that they cover a majority of the core logic. We can now incrementally upgrade the project and have at least some kind of basic check along the way.
There's no way I could have pieced together as large a working test suite using tech of that era in even double that time.
> maybe the tests aren't the best designs given there is no way I could review that many tests in 3 hours,
If you haven't reviewed and signed off then you have to assume that the stuff is garbage.
This is the crux of using AI to create anything and it has been a core rule of development for many years that you don't use wizards unless you understand what they are doing.
I used a static analysis code coverage tool to guarantee it was exercising the logic, but I did not verify the logic checks myself. The biggest risk is that I have no way of knowing whether I codified actual bugs with tests, but if that's true, those bugs were already there anyway.
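The risk being described is that of characterization tests: they lock in whatever the code does today, correct or not. A minimal sketch (hypothetical function, in Python for brevity; the project in question is PHP):

```python
def apply_discount(price, percent):
    # Imagined existing production logic. Suppose it truncates via
    # integer division -- arguably a bug, but it's the behavior
    # users already get today.
    return price - price * percent // 100

def test_apply_discount_characterization():
    # These assertions codify whatever the code does right now.
    # If apply_discount is buggy, the bug is now "specified" by the
    # test -- but the test still catches regressions during upgrades.
    assert apply_discount(100, 15) == 85
    assert apply_discount(99, 10) == 90  # 99 - (990 // 100) = 90
```

For an upgrade-in-place project, that trade-off is often acceptable: the tests don't prove correctness, only that behavior didn't change.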
I'd say for what I'm trying to do, which is upgrading a very old version of PHP to something that is supported, this is completely acceptable. These tests are basically acting as smoke tests.
1 reply →
I code firmware for a heavily regulated medical device (where mistakes mean life and death), and I try to have AI write unit tests for me all the time. I would say I spend about 3 days correcting and polishing what the AI gives me in 30 minutes. The first pass the AI gives me likely saves a day of work, but you would have to be crazy to trust it blindly. I guarantee it is not giving you what you think it is or what you need. Writing the tests is when I usually find and fix issues in the code; if AI is writing tests that all pass without updating the code, then it's likely telling you, falsely, that the code is perfect when it isn't.
If you're using a code coverage tool to identify the branches it's hitting, you at least have a guarantee that it is testing the code it's writing tests for, as long as you check the assertions. I could be codifying bugs with tests, and probably did (but they were already there anyway). For the purpose of upgrading OS libraries and surrounding software, this is a good approach: I can incrementally upgrade the software, run all the tests, and see if anything falls over.
I'm not having AI write tests for life-or-death software, nor did I claim that AI wrote tests that all pass without updating any code.
You only know they cause a majority of the core logic to execute, right? Are you sure the tests actually check that those bits of logic are doing the right thing? I've had Claude et al. write me plenty of tests that exercise things and then explicitly swallow errors and pass.
Yes, the first hour or so was spent fiddling with test creation. It started out with its usual wacky behavior, like checking for the existence of a method and calling that a "pass", creating a mock object that mocked the return result of the very logic it was supposed to be testing, and (my favorite) copying the logic out of the code and pasting it directly into the test. Lots of course correction, but once I had one well-written test that I had fully proofed myself, I provided it as an example and it did a pretty good job following those patterns for the remainder. I still sniffed all the output for LLM wackiness, though. Using a code coverage tool also helps a lot.
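Two of the failure modes described above can be sketched concretely (hypothetical `total` function, in Python for brevity; the thread's project is PHP):

```python
from unittest import mock

def total(items):
    # The logic supposedly under test.
    return sum(i["price"] for i in items)

# Anti-pattern 1: a "test" that only checks the function exists.
# It passes even if total() is completely wrong.
def test_total_exists():
    assert callable(total)

# Anti-pattern 2: mocking out the very logic under test.
# This only proves that the mock returns 42.
def test_total_mocked():
    with mock.patch(__name__ + ".total", return_value=42):
        assert total([]) == 42

# A real test exercises the actual logic against a known answer.
def test_total_real():
    assert total([{"price": 3}, {"price": 4}]) == 7
```

Both anti-pattern tests pass and register line coverage, which is why coverage numbers alone can't tell you the assertions are meaningful.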
... Yeah, those tests are probably garbage. The models probably covered the 80% that consists of boilerplate and mocked out the important 20% that was critical business logic. That's how it was in my experience.
For God's sake, that's complete slop.
You should read my other comment: I did check that the tests were actually checking the logic, so I guess I did some level of review.
It's not just writing code.
And maybe a child is too simplistic an analogy. It's more like working with a savant.
The type of thing you can tell AI to do is like this: you tell it to code a website... it does it, but you don't like the pattern.
Say "use functional programming", "use camelCase", "don't use this pattern", "don't use that". And then it does it. You can leave those instructions in the agent file and they become burned in forever.
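A sketch of what such an agent instructions file might look like (hypothetical contents, in the style of a CLAUDE.md/AGENTS.md file; the exact filename depends on the tool):

```
# Project coding conventions

- Use functional programming style; prefer pure functions over classes.
- Use camelCase for variable and function names.
- Do not introduce global mutable state.
- Follow the structure of existing tests when writing new ones.
```

The agent reads this file at the start of each session, so corrections made once don't have to be repeated.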
A better way to put it is with this example: I put my symptoms into ChatGPT and it gives some generic info with a massive "not-medical-advice" boilerplate and refuses to give specific recommendations. My wife (an NP) puts in anonymous medical questions and gets highly specific, med-terminology-heavy guidance.
That's all to say: the learning curve with LLMs is learning how to say things a specific way to reliably get an outcome.
These people are just the same charlatans and scammers you saw in the web3 sphere. Invoking Ryan Dahl as some sort of authority figure and not a tragic figure that sold his soul to VC companies is even more pathetic.
I don't appreciate this comment. Calling me a charlatan is rude. He's not an authority, but he has more credibility than you and most people on HN.
There is an obvious division of ideas here. But calling one side stupid or referring to them as charlatans is outright wrong and biased.
No one called YOU a charlatan; grow thicker skin, because you are going to run into more and more people who absolutely hate these tools.
There is a reason why they struggle to sell them and executives are force-feeding them to their workers.
Charlatan is the perfect term for those who stand to make money selling half-baked goods and forcing more mass misery upon society.