Comment by wpietri

5 days ago

One of the things I think is going on here is a sort of stone soup effect. [1]

Core to Ptacek's point is that everything has changed in the last 6 months. As you and, I presume, he would agree, the use of off-the-shelf LLMs for code was kinda garbage back then. And I expect the skepticism he's knocking here ("stochastic parrots") was in fact accurate at the time.

But it did get a lot of people (and money) to rush in and start trying to make something useful. As in the stone soup story, a lot of other technology has been added to the pot, and now we're moving in the direction of something solid, a proper meal. But given the excitement and investment, it'll be at least a few years before things stabilize. Only at that point can we be sure how much the stone really added to the soup.

Another counterfactual that we'll never know is what kinds of tooling we would have gotten if people had dumped a few billion dollars into code tool improvement without LLMs, but with, say, a lot of more conventional ML tooling. Would the tools we get be much better? Much worse? About the same but different in strengths and weaknesses? Impossible to say.

So I'm still skeptical of the hype. After all, the hype is basically the same as 6 months ago, even though now the boosters can admit the products of 6 months ago sucked. But I can believe we're in the middle of a revolution in developer tooling. Even so, I'm content to wait. We don't know the long-term effects on a code base. We don't know what these tools will look like in 6 months. I'm happy to check in again then, at which point I fully expect to be told again: "If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing." At least until then, I'm renewing my membership in the Boring Technology Club: https://boringtechnology.club/

[1] https://en.wikipedia.org/wiki/Stone_Soup

> Core to Ptacek's point is that everything has changed in the last 6 months.

This was actually the only point in the essay with which I disagree, and it weakens the overall argument. Even 2 years ago, before agents or reasoning models, these LLMs were extremely powerful. The catch was, you needed to figure out what worked for you.

I wrote this comment elsewhere: https://news.ycombinator.com/item?id=44164846 -- Upshot: It took me months to figure out what worked for me, but AI enabled me to produce innovative (probably cutting edge) work in domains I had little prior background in. Yes, the hype should trigger your suspicions, but if respectable people with no stake in selling AI like @tptacek or @kentonv in the other AI thread are saying similar things, you should probably take a closer look.

  • > if respectable people with no stake in selling AI like @tptacek or @kentonv in the other AI thread are saying similar things, you should probably take a closer look.

    Maybe? Social proof doesn't mean much to me during a hype cycle. You could say the same thing about tulip bulbs or any other famous bubble. Lots of smart people with no stake get sucked in. People are extremely good at fooling themselves. There are a lot of extremely smart people following all of the world's major religions, for example, and they can't all be right. And whatever else is going on here, there are a lot of very talented people whose fortunes and futures depend on convincing everybody that something extraordinary is happening here.

    I'm glad you have found something that works for you. But I talk with a lot of people who are totally convinced they've found something that makes a huge difference, from essential oils to functional programming. Maybe it does for them. But personally, what works for me is waiting out the hype cycle until we get to the plateau of productivity. Those months that you spent figuring out what worked are months I'd rather spend on using what I've already found to work.

    • The problem with this argument is that if I'm right, the hype cycle will continue for a long time before it settles (because this is a particularly big problem to have made a dent in), and for that entire span of time skepticism will have been the wrong position.

    • > You could say the same thing about tulip bulbs or any other famous bubble. Lots of smart people with no stake get sucked in.

      While I agree with the skepticism, what specifically is the stake here? Most code assistants have usable plans in the $10-$20 range. The investors are apparently taking a much bigger risk than the consumer would be in a case like this.

      Aside from the horror stories about people spending $100 in one day of API tokens for at best meh results, of course.

    • Dude. Claude Code has zero learning curve. You just open the terminal app in your code directory and you tell it what you want, in English. In the time you have spent writing these comments about how you don't care to try it now because it's probably just hype, you could have actually tried it and found out if it's just hype.

  • > "Even 2 years ago, before agents or reasoning models, these LLMs were extremely powerful. The catch was, you needed to figure out what worked for you."

    Sure, but I would argue that the UX is the product, and that has radically improved in the past 6-12 months.

    Yes, you could have produced similar results before, manually prompting the model each time, copy and pasting code, re-prompting the model as needed. I would strenuously argue that the structuring and automation of these tasks is what has made these models broadly usable and powerful.

    In the same way that Apple didn't invent mobile phones, touchscreens, or OSes, but the specific combination of these things resulted in a product that was different in kind from what came before, and took over the world.

    Likewise, the "putting the LLM into a structured box of validation and automated re-prompting" is huge! It changed the product radically, even if its constituent pieces existed already.
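
    To make that concrete, here's a minimal sketch of that "structured box" loop, in Python, with generate_patch and apply_patch as purely hypothetical stand-ins for whatever model call and file-editing step a given tool actually uses (this is not any particular product's API):

        import subprocess

        def run_checks():
            # Run the project's test suite; return (passed, combined output).
            result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
            return result.returncode == 0, result.stdout + result.stderr

        def agent_loop(task, generate_patch, apply_patch, max_attempts=5):
            # Generate a change, validate it, and feed failures back into the next prompt.
            prompt = task
            for _ in range(max_attempts):
                patch = generate_patch(prompt)   # hypothetical LLM call
                apply_patch(patch)               # hypothetical: write the edit to disk
                passed, output = run_checks()
                if passed:
                    return patch
                prompt = task + "\n\nThe last attempt failed these checks:\n" + output + "\nPlease fix it."
            raise RuntimeError("no passing change after max_attempts")

    None of the pieces are new; the loop wrapped around them is what changed the product.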

    [edit] More generally I would argue that 95% of the useful applications of LLMs aren't about advancing SOTA model capabilities but about what kind of structured interaction environment we shove them into.

    • For sure! I mainly meant to say that people should not dismiss the "6 more months until it's really good" point as just another symptom of unfounded hype. It may have taken effort to use AI effectively earlier, which somewhat justified the caution, but now it's significantly easier and caution is counter-productive.

      But I think my other point still stands: people will need to figure out for themselves how to fully exploit this technology. What worked for me, for instance, was structuring my code to be essentially functional in nature. This allows for tightly focused contexts, which drastically reduces error rates. This is probably orthogonal to the better UX of current AI tooling. Unfortunately, the vast majority of existing code is not functional, and people will have to figure out how to make AI work with that.
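
      As a toy illustration of what I mean by that structure (the names here are invented for the example, not from any real codebase), a pure function carries its entire context in its own definition:

          from dataclasses import dataclass

          @dataclass(frozen=True)
          class LineItem:
              quantity: int
              unit_price_cents: int

          def invoice_total_cents(items: list[LineItem], tax_rate: float) -> int:
              # Pure function: everything it depends on arrives as an argument, so this
              # definition plus its types is the whole context an assistant needs to
              # modify or test it -- no hidden state elsewhere in the codebase.
              subtotal = sum(i.quantity * i.unit_price_cents for i in items)
              return round(subtotal * (1 + tax_rate))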

      A lot of that likely plays into your point about the work required to make useful LLM-based applications. To expand a bit more:

      * AI is technology that behaves like people. This makes it confusing to reason about and work with. Products will need to solve for this cognitive dissonance to be successful, which will entail a combination of UX and guardrails.

      * Context still seems to be king. My (possibly outdated) experience has been that the "right" context trumps larger context windows. With code, for instance, this probably entails standard techniques like static analysis to find relevant bits of code, which some tools have been attempting (see the sketch after this list). For data, this might require eliminating overfetching.

      * Data engineering will be critical. Not only does the data need to be very clean for good results, but giving models unfettered access to it requires the right access controls, which, despite regulations like GDPR, are largely non-existent.

      * Security in general will need to be upleveled everywhere. Not only can models be tricked, they can trick you into getting compromised, and so there need to be even more guardrails.
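
      On the context point above, a rough sketch of what "static analysis to find relevant bits of code" can look like at its simplest (the path and function name are placeholders): extract just the one definition you care about instead of pasting the whole module into the prompt.

          import ast

          def function_source(path, name):
              # Return the source of a single function from a file, so a prompt can
              # include the relevant definition rather than the entire module.
              source = open(path).read()
              tree = ast.parse(source)
              for node in ast.walk(tree):
                  if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
                      return ast.get_source_segment(source, node)
              return None

      Real tools go much further than this, but the principle is the same: a smaller, sharper context.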

      A lot of this is regular engineering work that is being done even today. Only it often isn't prioritized because there are always higher priorities... like increasing shareholder value ;-) But if folks want to leverage the capabilities of AI in their businesses, they'll have to solve all these problems for themselves. This is a ton of work. Good thing we have AI to help out!

  • I don't think it's possible to understand what people mean by force multiplier re AI until you use it to teach yourself a new domain and then build something with that knowledge.

    Building a mental model of a new domain by creating a logical model that interfaces with a domain I'm familiar with lets me test my assumptions and understanding in real time. I can apply previous experience by analogy and verify usefulness/accuracy instantly.

    > Upshot: It took me months to figure out what worked for me, but AI enabled me to produce innovative (probably cutting edge) work in domains I had little prior background in. Yes, the hype should trigger your suspicions[...]

    Part of the hype problem is that describing my experience sounds like bullshit to anyone who hasn't gone through the same process. The rate at which I pick up concepts well enough to do verifiable work with them is literally unbelievable.

Almost by definition, one should be skeptical about hype. So we’re all trying to sort out what is being sold to us.

Different people have different weird tendencies in different directions. Some people irrationally assume that things aren’t going to change much. Others see a trend and irrationally assume that it will continue on a trend line.

Synthesis is hard.

Understanding causality is even harder.

Savvy people know that we’re just operating with a bag of models and trying to choose the right combination for the right situation.

This misunderstanding is one reason why doomers, accelerationists, and “normies” talk past each other or (worse) look down on each other. (I’m not trying to claim epistemic equivalence here; some perspectives are based on better information, and some are better calibrated than others! I’m just not laying out my personal claims at this point. Instead, I’m focusing on how we talk to each other.)

Another big source of misunderstanding is about differing loci of control. People in positions of influence are naturally inclined to think about what they can do, who they know, and where they want to be. People farther removed feel relatively powerless and tend to hold onto their notions of stability, such as the status quo or their deepest values.

Historically, programmers have been quite willing to learn new technologies, but now we’re seeing widespread examples where people’s plasticity has limits. Many developers cannot (or are unwilling to) wrap their minds around the changing world. So instead of confronting that reality, they find ways to deny it, consciously or subconsciously. Our perception itself is shaped by our beliefs, and some people won’t even perceive the threat because it is too strange or disconcerting. Such is human nature: we all do it. Sometimes we’re lucky enough to admit it.

  • I think "the reality", at least as something involving a new paradigm, has yet to be established. I'll note that six months or more ago I heard plenty of similar talk about how developers just couldn't adapt. Promoters now can admit those tools were in fact pretty bad, because they now have something else to promote, but at the time those not rawdogging LLMs were dinosaurs under a big meteor.

    I do of course agree that some people are just refusing to "wrap their minds around the changing world". But anybody with enough experience in tech can count a lot more instances of "the world is about to change" than "the world really changed". The most recent obvious example being cryptocurrencies, but there are plenty of others. [1] So I think there's plenty of room here for legitimate skepticism. And for just waiting until things settle down to see where we ended up.

    [1] E.g. https://www.youtube.com/watch?v=b2F-DItXtZs

    • Fair points.

      Generally speaking, I find it suspect when someone points to failed predictions of disruptive changes without acknowledging successful predictions. That is selection bias. Many predicted disruptive changes do occur.

      Most importantly, if one wants to be intellectually honest, one has to engage against a set of plausible arguments and scenarios. Debunking one particular company’s hyperbolic vision for the future might be easy, but it probably doesn’t generalize.

      It is telling to see how many predictions can seem obvious in retrospect from the right frame of reference. In a sense (or more than that under certain views of physics), the future already exists, the patterns already exist. We just have to find the patterns — find the lens or model that will help the messy world make sense to us.

      I do my best to put the hype to the side. I try to pay attention to the fundamentals such as scaling laws, performance over time, etc while noting how people keep moving the goalposts.

      Also wrt the cognitive bias aspect: Cryptocurrencies didn’t threaten to apply significant (if any) downward pressure on the software development labor market.

      Also, even cryptocurrency proponents knew deep down that it was a chicken-and-egg problem: boosters might have said adoption was happening and maybe even inevitable, but the assumption was right out there in the open. It also had the warning signs of obvious financial fraud, money laundering, currency speculation, and Ponzi scheming.

      Adoption of artificial intelligence is different in many notable ways. Most saliently, it is not a chicken and egg problem: it does not require collective action. Anyone who does it well has a competitive advantage. It is a race.

      (Like Max Tegmark and others, I view racing towards superintelligence as a suicide race, not an arms race. This is a predictive claim that can be debated by assessing scenarios, understanding human nature, and assigning probabilities.)

    • > Promoters now can admit those tools were in fact pretty bad

      Relative to what came after, which no one could have predicted was guaranteed?

      The Model T was in fact pretty bad relative to what came after...

      > because they now have something else to promote

      something else which is better?

      I don't understand the inherent cynicism here.

I’m an amateur coder, and I used to rely on Cursor a lot for coding when I was actively working on hobby apps about 6 months ago.

I picked up coding again a couple of days back, and I’m blown away by how much things have changed.

It was all manual work until a few months back. Suddenly, it's all agents.

  • > You'll not only never know this, it's IMHO not very useful to think about at all, except as an intellectual exercise.

    I think it's very useful if one wants to properly weigh the value of LLMs in a way that gets beyond the hype. Which I do.

    • My 80-year-old dad tells me that when he bought his first car, he could pop open the hood and fiddle with things and maybe get it to work again after a breakdown

      Now he can't - it's too closed and complicated

      Yet, modern cars are way better and almost never break down

      Don't see how LLMs are any different than any other tech advancement that obfuscates and abstracts the "fundamentals".

    • Oops, this was a reply to somebody else put in the wrong place. Sorry for the confusion.

"nother counterfactual that we'll never know is what kinds of tooling we would have gotten if people had dumped a few billion dollars into code tool improvement without LLMs, but with, say, a lot of more conventional ML tooling. Would the tools we get be much better? Much worse? About the same but different in strengths and weaknesses? Impossible to say."

You'll not only never know this, it's IMHO not very useful to think about at all, except as an intellectual exercise.

I wish I could impress this upon more people.

A friend similarly used to lament/complain that Kotlin sucked in part because we could probably have accomplished its major features in Java, and maybe without tons of work or migration cost.

This is maybe even true!

As an intellectual exercise, both are interesting to think about. But outside of that, people get caught up in this as if it matters, but it doesn't.

Basically nothing is driven by pure technical merit alone, not just in CS but in any field. So my point to him was that the lesson to take away from this is not "we could have been more effective or done it cheaper or whatever" but "my definition of effectiveness doesn't match how reality decides effectiveness, so I should adjust my definition".

As much as people want it to be a meritocracy, it just isn't, and honestly, it seems unlikely ever to be.

So while it's 100% true that billions of dollars dumped into other tools or approaches might have generated good, better, maybe even amazing results, those dollars weren't spent that way, and more importantly, never would have been. Unknown but maybe infinite ROI is often much more likely to see investment than better-understood but maybe only 2x ROI.

And like I said, this is not just true in CS but in lots of fields.

That is arguably quite bad, but also seems unlikely to change.

  • > You'll not only never know this, it's IMHO not very useful to think about at all, except as an intellectual exercise.

    I think it's very useful if one wants to properly weigh the value of LLMs in a way that gets beyond the hype. Which I do.

    • Sure, and that works in the abstract (i.e., "what investment would theoretically have made the most sense"), but if you are trying to make the comparison in the real world, you have to be careful, because it assumes the alternative would ever have happened. I doubt it would have.