
Comment by afro88

1 day ago

> What this interaction shows is how much knowledge you need to bring when you interact with an LLM. The “one big flaw” Claude produced in the middle would probably not have been spotted by someone less experienced with crypto code than this engineer obviously is. And likewise, many people would probably not have questioned the weird choice to move to PBKDF2 as a response

For me this is the key takeaway. You gain proper efficiency using LLMs when you are a competent reviewer, and for lack of a better word, leader. If you don't know the subject matter as well as the LLM, you better be doing something non-critical, or have the time to not trust it and verify everything.

My question is: in this brave new world, where do the domain experts come from? Who's going to know this stuff?

  • LLMs make learning new material easier than ever. I use them a lot and I am learning new things at an insane pace in different domains.

    The maximalists and skeptics both are confusing the debate by setting up this straw man that people will be delegating to LLMs blindly.

    The idea that someone clueless about OAuth should develop an OAuth lib with LLM support without learning a lot about the topic is... Just wrong. Don't do that.

    But if you're willing to learn, this is rocket fuel.

    • On the flip side, I wanted to see what common 8 layer PCB stackups were yesterday. ChatGPT wasn't giving me an answer that really made sense. After googling a bit, I realized almost all of the top results were AI generated, and also had very little in the way of real experience or advice.

      It was extremely frustrating.

      14 replies →

    • >> LLMs make learning new material easier than ever.

      Feels like there's a logical flaw here, when the issue is that LLMs are presenting the wrong information or missing it altogether. The person trying to learn from it will experience Donald Rumsfeld's “unknown unknowns”.

      I would not be surprised if we experience an even more dramatic "COBOL moment" a generation from now, but unlike that one, thankfully I won't be around to experience it.

    • Learning from LLMs is akin to learning from Joe Rogan.

      You are getting a stylised view of a topic from an entity that lacks the deep understanding needed to fully distill the information. But it gives you enough knowledge to feel confident, which is still valuable but also dangerous.

      And I assure you that many, many people are delegating to LLMs blindly; e.g., it's a huge problem in the UK legal system right now because of all the invented case law references.

      3 replies →

    • How do you gain anything useful from a sycophantic tutor that agrees with everything you say, having been trained to behave as if the sun shines out of your rear end?

      Making mistakes is how we learn, and if they are never pointed out...

      4 replies →

    • I think what’s missing here is you should start by reading the RFCs. RFCs tend to be pretty succinct so I’m not really sure what a summarization is buying you there except leaving out important details.

      (One thing that might be useful is use the LLM as a search engine to find the relevant RFCs since sometimes it’s hard to find all of the applicable ones if you don’t know the names of them already.)

      I really can’t stress this enough: read the RFCs from end to end. Then read through the code of some reference implementations. Draw a sequence diagram. Don’t have the LLM generate one for you, the point is to internalize the design you’re trying to implement against.

      By this time you should start spotting bugs or discrepancies between the specs and implementations in the wild. That’s a good sign. It means you’re learning.

    • > LLMs make learning new material easier than ever. I use them a lot and I am learning new things at an insane pace in different domains.

      With learning, aren’t you exposed to the same risks? Such that if there was a typical blind spot for the LLM, it would show up in the learning assistance and in the development assistance alike, thus canceling out (i.e., unknown unknowns)?

      Or am I thinking about it wrongly?

      8 replies →

      The value of LLMs is that they do things for you, so yeah the incentive is to have them take over more and more of the process. I can also see a future not far off where those who grew up with nothing but AI are much less discerning and capable, and so the AI becomes more and more a crutch, as human capability withers from extended disuse.

    • > Just wrong. Don’t do that

      I’d personally qualify this: don’t ship that code, but absolutely do it yourself to grow if you’re interested.

      I’ve grown the most when I start with things I sort of know and I work to expand my understanding.

    • If the hypothesis is that we still need knowledgeable people to run LLMs, but the way you become knowledgeable is by talking to LLMs, then I don’t think the hypothesis will be correct for long..

      2 replies →

    • > But if you're willing to learn, this is rocket fuel.

      LLMs will tell you 1 or 2 lies for every 20 facts. It's a hard way to learn. They can't even get their URLs right...

      10 replies →

    • Another limitation of LLMs lies in their inability to stay in sync with novel topics or recently introduced methods, especially when these are not yet part of their training data or can't be inferred from existing patterns.

      It's important to remember that these models depend not only on ML breakthroughs but also on the breadth and freshness of the data used to train them.

      That said, the "next-door" model could very well incorporate lessons from the recent Cloudflare OAuth Library issues, thanks to the ongoing discussions and community problem-solving efforts.

      And yet, human coders may do that exact type of thing daily, producing far worse code. I find it humorous how much higher a standard is applied to LLMs in every discussion, when I can guarantee those exact same coders likely produce their own bug-riddled software.

      We’ve gone from skeptics saying LLMs can’t code, to they can’t code well, to they can’t produce human-level code, to they are riddled with hallucinations, to now “but they can’t one-shot code a library without any bugs or flaws” and “but they can only one-shot code, they can’t edit well”, even though recent coding utilities have been proving that wrong as well. And still they say they are useless.

      Some people just don’t hear themselves or see how AI is constantly moving their bar.

      1 reply →

  • This, for me, has been the question since the beginning. I’ve yet to see anyone talk or think about the issue head-on, either. And whenever I’ve asked someone about it, they’ve not had any substantial thoughts.

    • Engineers will still exist and people will vibe code all kinds of things into existence. Some will break in spectacular ways, some of those projects will die, some will hire a real engineer to fix things.

      I cannot see us living in a world of ignorance where there are literally zero engineers and no one on the planet understands what's been generated. Weirdly we could end up in a place where engineering skills are niche and extremely lucrative.

      1 reply →

  • Most important question on this entire topic.

    Fast forward 30 years and modern civilisation is entirely dependent on our AIs.

    Will deep insight and innovation from a human perspective perhaps come to a stop?

    • Did musical creativity end with synths and sequencers?

      Tools will only amplify human skills. Sure, not everyone will choose to use tools for anything meaningful, but those people are driving human insight and innovation today anyway.

    • No. Even with power tools, construction and joinery are physical work and require strength and skill.

      What is new is that you'll need the wisdom to figure out when the tool can do the whole job, and where you need to intervene and supervise it closely.

      So humans won't be doing any less thinking, rather they'll be putting their thinking to work in better ways.

      1 reply →

    • No, but it'll become a hobby or artistic pursuit, just like running, playing chess, or blacksmithing. But I personally think it's going to take longer than 30 years.

  • The implication is that they are hoping to bridge the gap between current AI capabilities and something more like AGI in the time it takes the senior engineers to leave the industry. At least, that's the best I can come up with, because they are kicking out all of the bottom rings of the ladder here in what otherwise seems like a very shortsighted move.

  • Use it or lose it.

    Experts will become those who use LLMs to learn, not to have them write code or solve tasks for them, so they can build that skill.

  • In a few years hopefully the AI reviewers will be far more reliable than even the best human experts. This is generally how competency progresses in AI...

    For example, at one point a human + computer would have been the strongest combo in chess; now you'd be insane to allow a human to critique a chess bot because they're so unlikely to add value, and statistically a human in the loop would be far more likely to introduce error. Similar things can be said in fields like machine vision, etc.

    Software is about to become much higher quality and be written at much, much lower cost.

    • My prediction is that for that to happen we’ll need to figure out a way to measure software quality in the way we can measure a chess game, so that we can use synthetic data to continue improving the models.

      I don’t think we are anywhere close to doing that.

      2 replies →

I’m puzzled when I hear people say ‘oh, I only use LLMs for things I don’t understand well. If I’m an expert, I’d rather do it myself.’

In addition to the ability to review output effectively, I find the more closely I’m able to describe what I want in the way another expert in that domain would, the better the LLM output. Which isn’t really that surprising for a statistical text generation engine.

  • I guess it depends. In some cases, you don't have to understand the black box code it gives you, just that it works within your requirements.

    For example, I'm horrible at math, always have been, so writing math-heavy code is difficult for me; I'll confess to not understanding math well enough. If I'm coding with an LLM and making it write math-heavy code, I write a bunch of unit tests to describe what I expect the function to return, write a short description, and give it to the LLM (rough sketch at the end of this comment). Once the function is written, I run the tests and if they pass, great.

    I might not 100% understand what the function does internally, and it's not used for any life-preserving stuff either (typically end up having to deal with math for games), but I do understand what it outputs, and what I need to input, and in many cases that's good enough. Working in a company/with people smarter than you tends to make you end up in this situation anyways, LLMs or not.

    Though if in the future I end up needing to change the math-heavy stuff in the function, I'm kind of locked into using LLMs for understanding and changing it, which obviously feels less good. But the alternative is not doing it at all, so another tradeoff I suppose.

    I still wouldn't use this approach for essential/"important" stuff, but more like utility functions.
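
    A rough sketch of what that looks like (purely illustrative: it uses Node's built-in test runner, and the clampToRect function and ./geometry module are made-up names):

    ```typescript
    // Tests written *before* asking the LLM for the implementation.
    import { test } from "node:test";
    import assert from "node:assert/strict";

    // What I'd ask the LLM for: "clamp a 2D point so it stays inside a rectangle".
    import { clampToRect } from "./geometry";

    test("points inside the rect are unchanged", () => {
      assert.deepEqual(clampToRect({ x: 5, y: 5 }, { w: 10, h: 10 }), { x: 5, y: 5 });
    });

    test("points outside the rect are clamped to its edges", () => {
      assert.deepEqual(clampToRect({ x: -3, y: 12 }, { w: 10, h: 10 }), { x: 0, y: 10 });
    });
    ```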

  • That's why we outsource most other things in our lives though; why would it be different with LLMs?

    People don't learn how a car works before buying one, they just take it to a mechanic when it breaks. Most people don't know how to build a house, they have someone else build it and assume it was done well.

    I fully expect people to similarly have LLMs do what they don't know how to do themselves and assume the machine knew what to do.

    • > why would it be different with LLMs?

      Because LLMs are not competent professionals to whom you might outsource tasks in your life. LLMs are statistical engines that make up answers all the time, even when the LLM “knows” the correct answer (i.e., has the correct answer hidden away in its weights.)

      I don’t know about you, but I’m able to validate something is true much more quickly and efficiently if it is a subject I know well.

      3 replies →

I've found LLMs are very quick to add defaults, fallbacks, and rescues, which all make it very easy for code to look like it is working when it is not or will not. I call this out in three different places in my CLAUDE.md trying to adjust for this, and still occasionally get it.

I've been using an llm to do much of a k8s deployment for me. It's quick to get something working but I've had to constantly remind it to use secrets instead of committing credentials in clear text. A dangerous way to fail. I wonder if in my case this is caused by the training data having lots of examples from online tutorials that omit security concerns to focus on the basics.
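
To make the clear-text failure concrete, here's a minimal sketch of the pattern I keep having to push it toward (all names here are hypothetical): read credentials from environment variables that a Kubernetes Secret populates, rather than baking them into the code or manifests.

```typescript
// Hypothetical sketch. What the LLM kept generating (credentials in clear text):
//   const db = connect({ user: "admin", password: "hunter2" });

// What I actually want: the password comes from an env var, which the Deployment
// wires up from a Kubernetes Secret (secretKeyRef), so nothing sensitive is committed.
const password = process.env.DB_PASSWORD;
if (!password) {
  // Fail loudly instead of silently falling back to a default.
  throw new Error("DB_PASSWORD is not set; refusing to start");
}
// connect({ user: process.env.DB_USER, password });  // connect() is a placeholder
```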

  • > my case this is caused by the training data having

    I think it's caused by you not having a strong enough system prompt. Once you've built up a reusable system prompt for coding or for infra work, refining it bit by bit while using a specific model (since different models respond differently to prompts), you end up getting better and better responses.

    So if you notice it putting plaintext credentials in the code, add to the system prompt to not do that (rough sketch below). With LLMs you really get what you ask for, and if you fail to specify something, the LLM will do whatever the probabilities tell it to, but you can steer this by being more specific.

    Imagine you're talking to a very literal and pedantic engineer who argues a lot on HN and having to be very precise with your words, and you're like 80% of the way there :)
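
    For what it's worth, a minimal sketch of how that looks in practice (assuming the @anthropic-ai/sdk messages API; the model id and the rules themselves are only illustrative):

    ```typescript
    import Anthropic from "@anthropic-ai/sdk";

    const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

    // The reusable system prompt, grown rule by rule as failure modes show up.
    const systemPrompt = [
      "You are helping with Kubernetes deployments.",
      "Never hard-code credentials; reference Kubernetes Secrets via env vars instead.",
      "Prefer explicit failures over silent fallbacks or default values.",
    ].join("\n");

    const response = await client.messages.create({
      model: "claude-sonnet-4-20250514", // illustrative model id
      max_tokens: 1024,
      system: systemPrompt,
      messages: [{ role: "user", content: "Write a Deployment for the billing service." }],
    });
    ```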

    • Yes, you are definitely right on that. I still find it a concerning failure mode. That said, maybe it's no worse than a junior copying from online examples without reading all the text around the code, which of course has been very common also.

  • > It's quick to get something working but I've had to constantly remind it to use secrets instead of committing credentials in clear text.

    This is going to be a powerful feedback loop which you might call regression to the intellectual mean.

    On any task, most training data is going to represent the middle (or beginning) of knowledge about a topic. Most k8s examples will skip best practices, most react apps will be from people just learning react, etc.

    If you want the LLM to do best practices in every knowledge domain (assuming best practices can be consistently well defined), then you have to push it away from the mean of every knowledge domain simultaneously (or else work with specialized fine tuned models).

    As you continue to add training data it will tend to regress toward the middle because that's where most people are on most topics.

Over time, AI coding tools will be able to research domain knowledge. Current "AI Research" tools are already very good at it, but they are not integrated with coding tools yet. The research could look at both the public Internet and company documents that contain internal domain knowledge. Some of the domain knowledge is only in people's heads; that would need to be provided by the user.

    I'd like to add a practical observation, even assuming much more capable AI in the future: not all failures are due to model limitations; sometimes it's about external [world] changes.

    For instance, I used Next.js to build a simple login page with Google auth. It worked great, even though I only had basic knowledge of Node.js and a bit of React.

    Then I tried adding a database layer using Prisma to persist users. That's where things broke. The integration didn't work, seemingly due to recent Prisma versions or subtle breaking updates. I found similar issues discussed on GitHub and Reddit, but solving them required shifting into full manual debugging mode.

    My takeaway: even with improved models, fast-moving frameworks and toolchains can break workflows in ways that LLMs/ML (at least today) can't reason through or fix reliably. It's not always about missing domain knowledge, it's that the moving parts aren't in sync with the model yet.

    • Just close the loop and give it direct access to your console logs in Chrome and Node, then it can do the "full manual debugging" on its own.

      It's not perfect, and it's not exactly cheap, but it works.

You will always trust domain experts at some juncture; you can't build a company otherwise. The question is: Can LLMs provide that domain expertise? I would argue yes, clearly, given the development of the past 2 years, but obviously not in a straight line.