Comment by postalcoder
6 hours ago
One of the nice things about the "dumber" models (like GPT-4) was that they were good enough to get you really far, but never enough to complete the loop. They gave you maybe 90%, 20% of which you had to retrace, so you still had to do 30% of the tough work yourself, which meant manually learning things from scratch.
The models are too good now. One thing I've noticed recently is that I've stopped dreaming about tough problems, be it code or math. The greatest feeling in the world is pounding your head against a problem for a couple of days and waking up the next morning with the solution sketched out in your mind.
I don't think the solution is to go full natty (no AI at all), but to work more alongside the code in an editor rather than doing everything in the CLI.
The big issue I see coming is that leadership will care less and less about people, and more about shipping features faster and faster. In other words, those who are still learning their craft are fucked.
The amount of context switching in my day-to-day work has become insane. There's this culture of “everyone should be able to do everything” (within reason, sure), but in practice it means a data scientist is expected to touch infra code if needed.
Underneath it all is an unspoken assumption that people will just lean on LLMs to make this work.
I think this is sadly going to be the case.
I also used to get great pleasure from banging my head against a problem and then the sudden revelation.
But that takes time. I was valuable when there was no other option. Now? Why would someone wait when an answer is just a prompt away?
You still have the system design skills, and so far, LLMs are not that good in this field.
They can give you a plausible architecture, but most of the time it's not usable if you're starting from scratch.
When you design the system, you're an architect, not a coder, so I see no difference between handing the design to agents or to other developers; either way, you've done the heavy lifting.
From that perspective, I find LLMs quite useful for learning. But instead of coding, I find myself in long back-and-forth sessions asking questions, requesting examples, sequence diagrams, etc., to visualise the final product.
I see this argument all the time, and while it sounds great on paper (you're an architect now, not a developer), people forget (or omit?) that a product needs far fewer architects than developers, meaning the workforce does in fact get trimmed down by AI advancements.
I would also point out that a lot of real-world problems don't need a complex architecture. They just need to follow some well-established patterns.
It is a pattern-matching problem, and that seems to me to be something AI is, or will be, particularly good at.
Maybe it won’t be the perfect architecture, or the most efficient implementation. But that doesn’t seem to have stopped many companies before.
You can now access similar models at much lower prices. Grok 4.1 Fast is around 10x cheaper and performs slightly better.
Grok? You're OK giving money to Elon Musk?
Better than Palantir.
Idk, I very much feel like Claude Code only ever gets me really far, but never all the way there. I do use it a fair bit, but I still write a lot myself and almost never use its output unedited.
For hobby projects though, it's awesome. It just really struggles to do things right in the big codebase at work.
This is what I am thinking about this morning. I just woke up, made a cup of coffee, read the financial news, and started exploring the code I wrote yesterday.
My first thought was that I can abstract what I wrote yesterday, which was a variation of what I built over the previous week. My second thought was a physiological response of fear that today is going to be a hard, hyper-focused day full of frustration, and that the coding agents that built this will not be able to build a modular, clean abstraction. That was followed by weighing whether it is better to have multiple one-off solutions, or to manually create the abstraction myself.
I agree with you 100 percent that the poor performance of models like GPT-4 introduced a kind of regularization into the human-in-the-loop coding process.
Nonetheless, we live in a world of competition, and the people who develop techniques that give them an edge will succeed. There is a video about the evolution of technique in the high jump: the Western Roll, the Straddle Technique, and finally the Fosbury Flop. Using coding agents will be like this too.
I am working with 150 GB of time series data. There are certain pain points that need to be mitigated. For example, a different LLM has to be coerced into analyzing the data with a completely different approach in order to validate the results. That means each iteration is 4x faster but has to be done twice, so the net speedup is only 2x rather than 4x. I burned $400 in tokens in January. This cannot be good for the environment.
Timezone handling always has to be validated manually. Every exploration of the data is a train/test split. Here is the thing that hurts the most: the AI coding agents always show the top test results, not the test results of the top train results. Rather than tell me a model has no significant results, they will hide that and only present the winning outliers, which is misleading and, as the OP's research suggests, very dangerous.
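To make that concrete, here is a minimal sketch (toy NumPy data and made-up candidate "strategies", not the commenter's actual pipeline or data) of the difference between reporting the best test result across many candidates and reporting the test result of the one candidate selected on the training window:

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_strategies = 500, 50

# Toy setup: daily "returns" are pure noise and every candidate "strategy"
# is just a random signal, so none of them has any real predictive edge.
returns = rng.normal(size=n_days)
signals = rng.normal(size=(n_strategies, n_days))

# Chronological split: tune on the past, evaluate on the future.
train, test = slice(0, 400), slice(400, None)

def score(signal, rets):
    # Correlation between yesterday's signal and today's return.
    return float(np.corrcoef(signal[:-1], rets[1:])[0, 1])

train_scores = [score(s[train], returns[train]) for s in signals]
test_scores = [score(s[test], returns[test]) for s in signals]

# Misleading: report whichever candidate happened to score best on the test window.
print("best test score across all candidates:", max(test_scores))

# Honest: pick the candidate on the training window, then report its test score once.
chosen = int(np.argmax(train_scores))
print("test score of the candidate chosen on train data:", test_scores[chosen])
```

On pure noise the honest number hovers around zero, while the maximum across fifty candidate test scores routinely looks like a real edge; that cherry-picked maximum is exactly the kind of "winning outlier" described above.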
A lot of people are going to get burned before the techniques to mitigate this are developed.
Overfitting has always been a problem when working with data. Just because the barrier to entry for time series work is much lower does not mean that people developing the skill, whether using old-school tools like ARIMA manually or having AI do the work, escape the problem of overfitting. The models will always show the happy, successful-looking results.
Just as calculators are used when teaching higher math at the secondary level so that basic arithmetic does not slow the learning of math skills, AI will be used in teaching too. What we are doing is confusing techniques that have not been developed yet with an inability to acquire skills. I rack and challenge my brain every day solving these problems, and so do millions of other software engineers; the patterns will emerge and later become the skills taught in schools.
> The greatest feeling in the world is pounding your head against a problem for a couple of days and waking up the next morning with the solution sketched out in your mind.
And then you find out someone else had already solved it. So you might as well use Google 2.0, aka ChatGPT.
Well, this is exactly the problem. This tactic works until you get to a problem that nobody has solved before, even if it's just a relatively minor one that nobody has tried because it's so specific. If you haven't built up the skills and knowledge to solve problems yourself, then you're stuck.
But to understand a solution from someone else, you still have to apply your mind to the problem yourself. Offloading the hard work of thinking to GPT robs you of the attention you need to understand the subject matter fully. You will miss insights that would be applicable to your problem. This is the biggest danger of brain rot.