
Comment by johnfn

3 days ago

The burden you are placing here is too high. Do you demand controlled trials for everything you do, and otherwise refuse to use it or to accept that other people might see productivity gains? Do you demand studies showing that static typing is productive? Syntax highlighting? IDEs or Vim? Unit testing? Whatever language you use?

Obviously not? It would be absurd to walk into a thread about Rust and say “Rust doesn’t increase your productivity and unless you can produce a study proving it does then your own personal anecdotes are worthless.”

Why the increased demand for rigor when it comes to AI specifically?

Typically I hear how other people are doing things and I test it out for myself, just like I'm doing with AI.

Actually, IDEs vs Vim are a perfect analogy, because both have the ability to feel like they're helping a tonne, and at the end of the work day neither group outperforms the other.

I'm not standing on the sidelines criticizing this stuff. I'm using it. I'm growing more and more skeptical because it's not noticeably helping me deliver features faster.

At this point I'm at "okay, record a video and show me these 3x gains you're seeing, because I'm not experiencing the same thing".

The increased demand for rigor is because my experience isn't matching what others say.

I can see a 25% bump in productivity being realistic if I learn where it works well. There are people claiming 3-10x. That sounds ridiculous.

  • I can't see a 25% jump in productivity because writing code isn't even 25% of what I do. Even if it were infinitely fast I still couldn't get that high.

    • Given a hypothetical 25% boost: there are categories of errors that vibe-testing vibed code will introduce, and we know humans suck at critical reading. Over the support timeline of an Enterprise product that's gonna lead to one or more real issues.

      At what point is an ‘extra’ 25% of coding overhead worth it to ensure that a sane human, reasonably concerned about criminal consequences for impropriety, read all the code when writing it, and every change around it? To prevent public embarrassment that can and will chase off customers? To have someone to fire and sue if need be?

      [Anecdotally, the inflection point was finding tests updated to short-circuit through mildly obfuscated code (introduced after several reviews). Paired with a working system developed with TDD, that mistake only becomes obvious when the system stops working but the tests don’t. I wrote it, I ran the agents, I read it, I approved it, but I was looking for code quality, not intentional sabotage/trickery… lesson learned.]

      From a team-lead perspective in an Enterprise space, using 25% more time on coding to save insane amounts of aggressive, easy-to-flub review and whole categories of errors sounds like a smart play. CYA up front, take the pain up front.


  • Anecdotally, the people who seem to be most adamant about the efficiency of things like Vim or Python are some of the slowest engineers I've worked with when it comes to getting shit done. Even compared to people who don't really care for their preferred tech much, lol.

    I wonder how many 10x AI bros were 1/10th engineers slacking off most of the week before the fun new tech got them to actually work on stuff.

    Obviously not all, and clearly there are huge wins to be had with AI. But I wonder sometimes...

I honestly wish we had studies that truly answered these Qs. Modern programming has been a cargo cult for a good 20 years now.

Do you just believe everything everybody says? No quantifiable data required; as long as someone somewhere says it, it must be true?

One of the reasons software is in decline is that it's all vibes; nobody has much interest in conducting research to find anything out. It doesn't have to be some double-blind, peer-reviewed meta-analysis. The bar can still be low, it just should be higher than "I feel like"...