Comment by fragmede
2 days ago
Feel free to run your own tests and see if the magic phrases do or do not influence the output. Have it make a Todo webapp with and without those phrases and see what happens!
2 days ago
Feel free to run your own tests and see if the magic phrases do or do not influence the output. Have it make a Todo webapp with and without those phrases and see what happens!
That's not how it works. It's not on everyone else to prove claims false, it's on you (or the people who argue any of this had a measurable impact) to prove it actually works. I've seen a bunch of articles like this, and more comments. Nobody I've ever seen has produced any kind of measurable metrics of quality based on one approach vs another. It's all just vibes.
Without something quantifiable it's not much better then someone who always wears the same jersey when their favorite team plays, and swears they play better because of it.
These coding agents are literally Language Models. The way you structure your prompting language affect the actual output.
If you read the transformer paper, or get any book on NLP, you will see that this is not magic incantation; it's purely the attention mechanism at work. Or you can just ask Gemini or Claude why these prompts work.
But I get the impression from your comment that you have a fixed idea, and you're not really interested in understanding how or why it works.
If you think like a hammer, everything will look like a nail.
I know why it works, to varying and unmeasurable degrees of success. Just like if I poke a bull with a sharp stick, I know it's gonna get it's attention. It might choose to run away from me in one of any number of directions, or it might decide to turn around and gore me to death. I can't answer that question with any certainty then you can.
The system is inherently non-deterministic. Just because you can guide it a bit, doesn't mean you can predict outcomes.
5 replies →
Do you actively use LLMs to do semi-complex coding work? Because if not, it will sound mumbo-jumbo to you. Everyone else can nod along and read on, as they’ve experienced all of it first hand.
You've missed the point. This isn't engineering, it's gambling.
You could take the exact same documents, prompts, and whatever other bullshit, run it on the exact same agent backed by the exact same model, and get different results every single time. Just like you can roll dice the exact same way on the exact same table and you'll get two totally different results. People are doing their best to constrain that behavior by layering stuff on top, but the foundational tech is flawed (or at least ill suited for this use case).
That's not to say that AI isn't helpful. It certainly is. But when you are basically begging your tools to please do what you want with magic incantations, we've lost the fucking plot somewhere.
2 replies →