Comment by robhlt
19 days ago
It's nice that AI can fix bugs fast, but it's better to not even have bugs in the first place. By using someone else's battle tested code (like a framework) you can at least avoid the bugs they've already encountered and fixed.
I spent Dry January working on a new coding project and since all my nerd friends have been telling me to try to code with LLMs I gave it a shot and signed up to Google Gemini...
All I can say is "holy shit, I'm a believer." I've probably got close to a year's worth of coding done in a month and a half.
Busy work that would have taken me a day to look up, figure out, and write -- boring shit like matplotlib illustrations -- they are trivial now.
Then there are the ideas I'm not sure how to implement, the "what are some different ways to do this weird thing" questions where I would have spent a week trying to figure out a reasonable approach. Now it's basically got two or three decent ideas right away, even if they're not perfect. There was one vectorization approach I would have never thought of that I'm now using.
Is the LLM wrong? Yes, all the damn time! Do I need to, you know, actually do a code review when I'm implementing ideas? Very much yes! Do I get into a back and forth battle with the LLM when it starts spitting out nonsense, shut the chat down, and start over with a newly primed window? Yes, about once every couple of days.
It's still absolutely incredible. I've been a skeptic for a very long time. I studied philosophy, and the conceptions people have of language and Truth get completely garbled by an LLM that isn't really a mind that can think in the way we do. That said, holy shit it can do an absolute ton of busy work.
What kind of project / prompts - what’s working for you? I spent a good 20 years in the software world but have been away doing other things professionally for a couple of years. Recently I was in the same place as you, with a new project and wanting to try it out. So I start with a generic Django project in VSCode, use the agent mode, and… what a waste of time. The auto-complete suggestions it makes are frequently wrong, and the actions it takes in response to my prompts tend to make a mess on the order of a junior developer. I keep trying to figure out what I’m doing wrong, as I’m prompting pretty simple concepts at it - if you know Django, imagine concepts like “add the foo module to settings.py” or “Run the check command and diagnose why the foo app isn’t registered correctly”. Before you know it, it’s spiraling out of control with changes it thinks it is making, all of which are hallucinations.
I'm just using Gemini in the browser. I'm not ready to let it touch my code. Here are my last two prompts, for context the project is about golf course architecture:
Me, including the architecture_diff.py file: I would like to add another map to architecture_diff. I want the map to show the level of divergence of the angle of the two shots to the two different holes from each point. That is, when you are right in between the two holes, it should be a 180 degree difference, and should be very dark, but when you're on the tee, and the shot is almost identical, it should be very light. Does this make sense? I realize this might require more calculations, but I think it's important.
Gemini output was some garbage about a simple naive angle to two hole locations, rather than using the sophisticated expected value formula I'm using to calculate strokes-to-hole... thus worthless.
Follow up from me, including the course.py and the player.py files: I don't just want the angle, I want the angle between the optimal shot, given the dispersion pattern. We may need to update get_smart_aim in the player to return the vector it uses, and we may need to cache that info. We may need to update generate_strokes_gained_map in course to also return the vectors used. I'm really not sure. Take as much time as you need. I'd like a good idea to consider before actually implementing this.
Gemini output now has a helpful response about saving the vector field as we generate the different maps I'm trying to create as they are created. This is exactly the type of code I was looking for.
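For readers who want to picture the divergence map described in these prompts, here is a minimal sketch in Python with NumPy. It is not the poster's code: `angle_between_fields` and `straight_line_aim` are hypothetical helpers, the grid size and hole positions are made up, and the straight-line aim field is only a stand-in for whatever vectors `get_smart_aim` would return once dispersion is taken into account (in other words, it is the naive version). The angle calculation itself would stay the same if the real aim vectors were cached and passed in instead.

```python
import numpy as np


def angle_between_fields(u, v):
    """Unsigned angle in degrees (0 to 180) between two 2-D vector fields.

    Both inputs have shape (rows, cols, 2): one (dx, dy) aim vector per grid point.
    """
    dot = np.sum(u * v, axis=-1)
    # z component of the 2-D cross product: |u||v| sin(theta), signed
    cross = u[..., 0] * v[..., 1] - u[..., 1] * v[..., 0]
    # arctan2 of (|sin|, cos) is stable near 0 and 180 degrees and needs no normalisation
    return np.degrees(np.arctan2(np.abs(cross), dot))


def straight_line_aim(xs, ys, hole):
    """Placeholder aim field: the vector from every grid point straight at the hole.

    Assumption: a real model would substitute its dispersion-aware aim vectors here.
    """
    gx, gy = np.meshgrid(xs, ys)  # both have shape (len(ys), len(xs))
    return np.stack([hole[0] - gx, hole[1] - gy], axis=-1)


# Made-up layout: a 100 x 400 grid with the tee near y = 0 and two holes at y = 380
xs = np.linspace(0.0, 100.0, 101)
ys = np.linspace(0.0, 400.0, 401)
hole_a = (30.0, 380.0)
hole_b = (70.0, 380.0)

aim_a = straight_line_aim(xs, ys, hole_a)
aim_b = straight_line_aim(xs, ys, hole_b)
divergence = angle_between_fields(aim_a, aim_b)  # shape (401, 101), in degrees
```

With this layout the point midway between the two holes comes out at 180 degrees and a point on the tee line comes out at only a few degrees. Passing `divergence` to matplotlib's `imshow` with `cmap="Greys"` and `origin="lower"` would draw high divergence dark and low divergence light, which matches the shading asked for in the first prompt.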
I recently started building a POC for an app idea. As a framework I chose Django, and I did not once write code myself. The whole thing was done in a GitHub codespace with Copilot in agentic mode, using mostly Sonnet and Opus models. For prompting, I did not give it specific instructions like "add x to settings". I told it "We are now working on feature X. X should be able to do a, b and c. B has the following constraints. C should work like this." I also have some instructions in the agents.md file which tell the model to, before starting to code, ask me about anything unclear and then make a comprehensive plan of what to implement. I would then go over this plan, clarify or change it if needed - and then let it run for 5-15 minutes. And every time it just did it. The whole thing, with debugging, with tests. Sure, sometimes there were minor bugs when I tested - but then I prompted it directly with the problem, and sure enough it got fixed in seconds...
Not sure why we had such different experiences. Maybe you are using other models? Maybe you are missing something in your prompts? Letting it start with a plan which I can then check definitely helped a lot. A summary of the app's workings and technical decisions (also produced by the model) may also have helped in the long run.
I don't use VSCode, but I've heard that the default model isn't that great. I'd make sure you're using something like Opus 4.5/4.6. I'm not familiar enough with VSCode to know if it's somehow worse than Claude Code, even with the same models, but you can test Claude Code to rule that out. It could also be you've stumbled upon a problem that the AI isn't that good at. For example, I was diagnosing a C++ build issue, and I could tell the AI was off track.
Most of the people that get wowed use an AI on a somewhat difficult task that they're unfamiliar with. For me, that was basically a duplicate of Apple's Live Captions that could also translate. Other examples I've seen are repairing a video file, or building a viewer for a proprietary medical imaging format. For my captions example, I don't think I would have put in the time to work on it without AI, and I was able to get a working prototype within minutes and then it took maybe a couple more hours to get it running smoother.
Also >20 years in software. The VSCode/autocomplete, regardless of the model, never worked well for me. But Claude Code is something else - it doesn't do autocomplete per se - it will make modifications, test, debug if it fails, and iterate until it gets it right.
Try Claude as others have said.
For Django try generating tests and test data. This works reasonably well for me even with fairly small local LLMs on my laptop.
I'm (mostly) a believer too, and I think AI makes using and improving these existing frameworks and libraries even easier.
You mentioned matplotlib: why does it make sense to pay for a bunch of AI agents to re-invent what matplotlib does and fix bugs that matplotlib has already fixed, instead of just having AI agents write code that uses it?
I mean, the thesis of the post is odd. I'll grant you that.
I work mostly with python (the vast majority is pure python), flask, and htmx, with a bit of vanilla js thrown in.
In a sense, I can understand the thesis. On the one hand, Flask is a fantastic tool, with a reasonable abstraction given the high complexity. I wouldn't want to replace Flask. On the other hand, HTMX is a great tool, but often imperfect for exactly what I'm trying to do. Most people would say "well, just use React!" except that I honestly loathe working with js, and unless someone is paying me, I'll do it in python. I could see working with an LLM to build a custom tool to make a version of HTMX that better interacts with Flask in the way I want it to.
In fact, in the project I'm working on now I'm building complex heatmap illustrations that require a ton of data processing, so I've been building a model to reduce the NP-hard aspects of that process. However, the illustrations are the point, and I've already had a back and forth with the LLM about porting the project into HTML, or some web based version of illustration at least, simply because I'd have much more control over the illustrations. Right now, matplotlib still suits me just fine, but if I had to port it, I could see just building my own tool instead of finding an existing framework and learning it.
Frameworks are mostly useful because of group knowledge. I learn Flask because I don't want to build all these tools from scratch, and because it makes me literate in a very common language. The author is suggesting that these barriers -- at least for your own code -- functionally don't exist anymore. Learning a new framework is about as labor intensive as learning one you're creating as you go. I think it's short-sighted, yes, but depending on the project, yeah, when it's trivial to build the tool you want, it's tempting to do that instead of learning to use a similar tool that needs two adapters attached to it to work well on the job you're trying to do.
At the same time, this is about scope. Anyone throwing out React because they want to just "invent their own entire web framework" is just being an idiot.
Because frameworks don’t have bugs? Or unpredictable dependency interactions?
This is generous, to say the least.
Well maintained, popular frameworks have github issues that frequently get resolved with newly patched versions of the framework. Sometimes bugs get fixed that you didn't even run into yet so everybody benefits.
Will your bespoke LLM code have that? Every issue will actually be an issue in production experienced by your customers, that will have to be identified (better have good logging and instrumentation), and fixed in your codebase.
Frameworks that are (relatively) buggy and slow to address bugs lose popularity, to the point that people will spontaneously create alternatives. This has happened many times.
> better to not have bugs in the first place
you must have never worked on any software project ever
Have you? Then you know that the amount of defects scales linearly with the amount of code. As things stand models write a lot more code than a skilled human for a given requirement.
In practice using someone else’s framework means you’re accepting the risk of the thousands of bugs in the framework that have no relevance to your business use case and will never be fixed.
Yet people still use frameworks, before and after the age of LLMs. Frameworks must have done something right, I guess. Otherwise everyone will vibe their own little React in the codebase.