Comment by belter
1 day ago
>> From where I’m standing, it’s scary.
You are being fooled by randomness [1]
Not because the models are random, but because you are mistaking a massive combinatorial search over seen patterns for genuine reasoning. Taleb's point was about confusing luck for skill. Don't confuse interpolation for understanding.
You can read a Rust book after years of Java, then go build software for an industry that did not exist when you started. Ask any LLM to write a driver for hardware that shipped last month, or model a regulatory framework that just passed... It will confidently hallucinate. You will figure it out. That is the difference between pattern matching and understanding.
I've worked with a lot of interns, fresh college graduates, overseas lowest bidders, and mediocre engineers who gave up years ago, all over the course of a ~20-year career.
Not once in all that time has anyone PRed and merged my completely unrelated and unfinished branch into main. Except a few weeks ago. By someone who was using the LLM to make PRs.
He didn't understand when I asked him about it and was baffled as to how it happened.
Really annoying, but I got significantly less concerned about the future of human software engineering after that.
Have you used an LLM specifically trained for tool calling, in Claude Code, Cursor or Aider?
They’re capable of looking up documentation and correcting their errors by compiling and running tests, and when coupled with a linter, hallucinations are a non-issue.
I don’t really think it’s possible to dismiss a model that’s been trained with reinforcement learning for both reasoning and tool usage as only doing pattern matching. They’re not at all the same beasts as the old style of LLMs based purely on next token prediction of massive scrapes of web data (with some fine tuning on Q&A pairs and RLHF to pick the best answers).
I'm using Claude code to help me learn Godot game programming.
One interesting thing is that Claude will not tell me if I'm following the wrong path. It will just make the requested change to the best of its ability.
For example, in a Tower Defence game I'm making, I wanted to keep turret position state in an AStarGrid2D. It produced code to do this, but the code became harder and harder to follow as I went on. It was only after watching more tutorials that I figured out I was asking for the wrong thing (TileMapLayer is a much better choice).
LLMs still suffer from garbage in, garbage out.
Don't use LLMs for Godot game programming.
edit: Major engine changes have occurred after the models were trained, so you will often be given code that refers to nonexistent constants and functions and which is not aware of useful new features.
Before coding, I just ask the model: "What are the best practices in this industry to solve this problem? What tools/libraries/approaches do people use?"
After coding, I ask it: "Review the code. Do you see anything here that common libraries already implement? Are there ways to make it more idiomatic?"
You can also ask it: "This is an idea somebody told me for how to solve it. What do you think of it? Are there better ways?"
Ask a model to
"Write a chess engine where pawns move backward and kings can jump like nights"
It will keep slipping back into real chess rules. It learned chess, it did not understand the concept of "rules"
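To make the ask concrete, here is a rough sketch of just the two modified rules (Python, with an assumed 8x8 board layout and orientation, not any real chess library). The failure mode is that models drift back to the standard moves instead of sticking to something like this:

    # Sketch of only the two modified rules, not a full engine.
    # Assumes an 8x8 board indexed [row][col], and that white pawns normally
    # advance toward row 0 in this layout, so "backward" means toward row 7.
    # Both the board representation and the orientation are illustrative assumptions.

    KNIGHT_JUMPS = [(1, 2), (2, 1), (2, -1), (1, -2),
                    (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

    def on_board(r, c):
        return 0 <= r < 8 and 0 <= c < 8

    def pawn_moves(r, c, is_white):
        """Pawns move backward: the step direction is inverted from standard chess."""
        step = 1 if is_white else -1
        return [(r + step, c)] if on_board(r + step, c) else []

    def king_moves(r, c):
        """Kings jump like knights instead of stepping one square at a time."""
        return [(r + dr, c + dc) for dr, dc in KNIGHT_JUMPS if on_board(r + dr, c + dc)]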
Or
Ask it to reverse a made-up word like
"Reverse the string 'glorbix'"
It will get it wrong on the first try. You would not fail.
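For reference, the ground truth is a trivial one-liner (Python here), so there is nothing to "know":

    word = "glorbix"
    print(word[::-1])  # xibrolg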
Or even better, ask it to...
"Use the dxastgraphx library to build a DAG scheduler."
dxastgraphx is a nonexistent library...
Marvel at the results... I tried it in both Claude and ChatGPT.
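Since dxastgraphx does not exist, any API the model produces for it is invented. For contrast, a minimal sketch that only uses Python's real standard library (graphlib, 3.9+) might look like this; the task graph itself is made up for illustration:

    from graphlib import TopologicalSorter

    # node -> set of nodes it depends on (illustrative tasks, not from the thread)
    tasks = {
        "fetch": set(),
        "build": {"fetch"},
        "test": {"build"},
        "package": {"build"},
        "publish": {"test", "package"},
    }

    ts = TopologicalSorter(tasks)
    ts.prepare()
    while ts.is_active():
        for name in ts.get_ready():   # tasks whose dependencies have all finished
            print(f"running {name}")  # real work, or dispatch to a thread pool
            ts.done(name)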
I've just tried the dxastgraphx one in pi with Opus 4.5. This was its response:
Why would I ask the model to reverse the string 'glorbix,' especially in the context of software engineering?
Just tried to reverse the string you provided using Gemini. It worked fine on the first try.
You’re trying to interrogate a machine as you would a human and presenting this as evidence that machines aren’t humans. Yes, you’re absolutely right! And also completely missing the point.
Why would you expect an LLM or even a human to succeed in these cases? “Write a piece of code for a specification that you can’t possibly know about?” That’s why you have to do context engineering, just like you’d provide a reference to a new document to an engineer writing code.
This is exactly what happened to me: anything novel or uncommon and it hallucinates or invents the wrong thing.
It is OK for getting snippets, for example saying (as I did): "Please make this MVVM style." It is not perfect, but it saves time.
For very broad or novel reasoning, as of today... forget it.