
Comment by saulpw

1 day ago

> But you can't talk to them about the flow of the code. You can't ask them for their thinking as to why certain things are.

You can absolutely do this. It's even right most of the time.

Let's be real. Most of the time you ask an LLM "Why did you do it like this?", it responds with something along the lines of "Oops. My bad. You're right to point this out."

You even have a fair chance of getting a response like that when there isn't anything wrong and the question wasn't rhetorical, which perfectly illustrates the level of genuine understanding LLMs operate at.

  • When you criticize AI, always remember that the alternative is the average employee. Today's models are pretty good.

    • A lot of people think they're above average. A lot of them are wrong.

      A lot of average people are producing gigantic messes. At least before this they were gated by their mediocrity.

    • > the alternative is the average employee. Today's models are pretty good.

      I have never seen people anywhere in the world who hate the working class as much as people in the USA do.

      In my country the average employee is competent, they do their work and create wealth for the nation.

      Again, only in the USA do people think that billionaires are the ones creating value. Total nonsense indoctrination.


    • To adequately validate work you must be at least at the same level as whoever produced it. So if you were right (which Dunning-Kruger suggests is unlikely), that would mean your "terrible" average employee is given a tool that will 10x their output, which they cannot even check for correctness. And correctness will be low if the average employee is as bad as you say, because they will give badly specified tasks, and even with the best of us it's garbage in, garbage out. I am sure there is no way this can backfire.


    • when you criticize the average employee, always remember that the alternative is the average employee with AI.

    • And have they totally gotten rid of the average employees? Can they already blame the models for the production outages?

  • I remember hearing (perhaps last year?) that the model companies have specifically tried to obfuscate the "thinking/reasoning" behind the decisions the models make, so as to prevent cheaper models from training on the reasoning logs. So asking one "why did you do it like this" might not be fruitful.

    Not sure if that's true or if it might be influencing what you're seeing, but it's a thought.

    • I think that has to do more with the thinking "train of thought" that some models show as what the model is processing before making the response. There shouldn't be a distillation risk with actually asking the model to explain why it made a decision and getting the response.

  • This has happened to me, so I put this in my global CLAUDE.md, and it seems to help (I don't remember getting the response you mentioned for a while now):

        **Lead with the answer when asked how/which/whether.** Name the command/mechanism first; a question seeking understanding isn't a go-ahead to execute. Answer, then offer to act.

  • That's because of a fundamental misunderstanding of what an LLM is. The only correct answer to "Why did you do it like this?" is that the specific combination of input text and RNG state caused this particular output. There's no reasoning to be had.

    * EDIT * What's with the downvoting? That's a correct description of what happened. You can't ask an LLM why it did something and expect a coherent response, because there's no thinking chain, and no stored thinking state... At best, you can get a reconstruction of how the context relates to the output (basically a summarization of the context).

  • Can't remember the last time that happened.

    • Happened to me at least three times in the past 14 days. I point out where it made a design decision that causes data loss. «Oops my mistake»

    • I encounter it constantly with the latest models. Claude is particularly prone to it.

      > I shouldn’t have said that with confidence

      > I got ahead of myself there

      > I overstepped, allow me to correct that

      It’s wild seeing how often it’s wrong, and I only know it’s wrong because I am an SME or am actually reading the sources. Most of my coworkers are not SMEs in what they are asking about and do not read the sources.

      A huge part of my job now is fixing fuck ups and failures resulting from these slop jockeys who have already moved on to slop up the next task.

And you can certainly tell it the flow you want (and any other constraints) in the prompt.
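
For anyone who wants to poke at the claim made in one reply above, that a given output is just a function of the input text plus the RNG state, here is a minimal toy sketch. Everything in it is made up for illustration: `VOCAB`, `toy_next_token_weights`, and `generate` are hypothetical names, and the "model" is a trivial stand-in for a real forward pass, not how any actual LLM computes its weights. It only shows the sampling loop: same prompt and same seed give the same tokens, and nothing besides the growing context is carried from one step to the next.

```python
import random

# Hypothetical toy vocabulary, invented for this sketch.
VOCAB = ["oops", "because", "the", "context", "said", "so"]


def toy_next_token_weights(context):
    """Stand-in for a model forward pass: weights depend only on the context."""
    base = sum(ord(ch) for ch in " ".join(context))
    # Arbitrary deterministic scoring; every weight is at least 1.
    return [1 + (base * (i + 3)) % 7 for i in range(len(VOCAB))]


def generate(prompt, seed, n_tokens=5):
    """Sample n_tokens one at a time from the toy model with a seeded RNG."""
    rng = random.Random(seed)  # the only source of randomness
    context = prompt.split()
    output = []
    for _ in range(n_tokens):
        weights = toy_next_token_weights(context)
        token = rng.choices(VOCAB, weights=weights, k=1)[0]
        context.append(token)  # the sampled token becomes part of the input
        output.append(token)
    return output


if __name__ == "__main__":
    question = "Why did you do it like this?"
    print(generate(question, seed=42))
    print(generate(question, seed=42))  # identical to the line above
    print(generate(question, seed=7))   # may or may not differ
```

Whether a later "explanation" sampled the same way counts as reasoning or as a reconstruction from context is exactly what the thread is arguing about; the sketch doesn't settle that.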