Comment by Falimonda

2 months ago

I fear this will not age well.

Which models have you tried to date? Can you come up with a top 3 ranking among popular models based on your definition of value?

What can be said about the ability of an LLM to translate your thinking represented in natural language to working code at rates exceeding 5-10x your typing speed?

Mark my words: Every single business that has a need for SWEs will obligate their SWEs to use AI coding assistants by the end of 2026, if not by the end of 2025. It will not be optional like it is today. Now is the time you should be exploring which models are better at "thinking" than others, and discerning which thinking you should be doing vs. which thinking you can leave up to ever-advancing LLMs.

I've had to yank tokens out of the mouths of too many thinking models stuck in loops of (internally, within their own chain of thought) rephrasing the same broken function over and over, realizing each time that it doesn't meet the constraints, and trying the same thing again. Meanwhile, I sat staring at an opaque spinner, wondering whether it would have been easier to just write it myself. This was with Gemini 2.5 Pro, for reference.

Drop me a message on New Year's Day 2027. I'm betting I'll still be using them optionally.

  • I've experienced Gemini getting stuck as you describe a handful of times. That said, my prediction is based on the observation that these tools are already force multipliers, and they're only getting better with each passing quarter.

    You'll of course be free to use them optionally in your free time and on personal projects; that won't be the case at your place of employment.

    I will mark my calendar!

Every single business that has a need for SWEs will obligate their SWEs to use AI coding assistants by the end of 2026, if not by the end of 2025.

If businesses mandated speed like that, then we'd all have been forced to use Emacs decades ago. Businesses mandate correctness, and AI doesn't help as clearly toward that end.

  • Using Emacs is nothing like having an LLM convert natural language into working code, and the productivity gains aren't of the same kind.

    For better or worse, you won't find correctness on any business's income statement. Sure, it's a latent variable, but so is efficiency.

This reminds me of the story a few days ago about "what is your best prompt to stump LLMs", and many of the second level replies were links to current chat transcripts where the LLM handled the prompt without issue.

I think there are a few problems at play: 1) people who don't want the tools to have value, for various reasons, and have therefore decided the tools don't have value; 2) people who tried the tools six months or a year ago, had a bad experience, and gave up; and 3) people who haven't figured out how to use the tools to improve their productivity (this one seems heavily influenced by various grifters who overstate what the coding assistants can do, and by people underestimating the effort it takes to get good output from the models).

  • 4) People who like having reliable tools, which free them from "reviewing" the output of these tools to check whether they made an error.

    Using AI is like driving a car that decides to turn even when you keep the steering wheel straight. Randomly, and to varying degrees. If you like that because it sometimes lets you take a curve without steering, you do you. But some people prefer a car that turns when, and only when, they turn the wheel.

    • That's covered under point #1. I'm not claiming these tools are perfect. Neither are most people, but from the standpoint of an employer, the question is going to be: does the tool, after accounting for errors, make my employees more or less productive? A lot of people are finding that the answer, today, is that the tools offer a productivity advantage.