Comment by gdubs
6 days ago
One thing that I find truly amazing is just the simple fact that you can now be fuzzy with the input you give a computer, and get something meaningful in return. Like, as someone who grew up learning to code in the 90s, it always seemed like science fiction that we'd get to a point where you could give a computer some vague, human-level instructions and have it more or less do what you want.
There's the old quote from Babbage:
> On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
This has been an obviously absurd question for two centuries now. Turns out the people asking that question were just visionaries ahead of their time.
It is kind of impressive how I'll ask for some code in the dumbest, vaguest, sometimes even wrong way, but so long as I have the proper context built up, I can get something pretty close to what I actually wanted. Though I still have problems where I can ask as precisely as possible and get things not even close to what I'm looking for.
> This has been an obviously absurd question for two centuries now. Turns out the people asking that question were just visionaries ahead of their time.
This is not the point of that Babbage quote, and no, LLMs have not solved it, because it cannot be solved: "garbage in, garbage out" is a fundamental observation about the limits of logic itself, having more to do with the laws of thermodynamics than with programming. The output of a logical process cannot be more accurate than the inputs to that process; you cannot conjure information out of the ether. The LLM isn't the logical process in this analogy, it's one of the inputs.
At a fundamental level, yes, and even in human-to-human interaction this kind of thing happens all the time. The difference is that humans are generally quite good at resolving most ambiguities and contradictions in a request correctly and implicitly (sometimes surprisingly bad at doing so explicitly!). That's why human language tends to be more flexible and expressive than programming languages (but bad at precision). LLMs can basically do some of the same thing, so you don't need to specify all the 'obvious' implicit details.
4 replies →
in this case the LLM uses context clues and commonality priors to find the closest correct input, which is definitely relevant
We wanted to check the clock at the wrong time but read the correct time. Since a broken clock is right twice a day, we broke the clock, which solves our problem some of the time!
The nice thing is that a fully broken clock is accurate more often than a slightly deviated clock.
2 replies →
It is fun to watch. I've sometimes indeed seen the LLM say something like "I'm assuming you meant [X]".
It's very impressive that I can type misheard song lyrics into Google, and yet still have the right song pop up.
But, having taken a chance to look at the raw queries people type into apps, I'm afraid neither machine nor human is going to make sense of a lot of it.
theseday,s i ofen donot correct my typos even wheni notice them while cahtting with LLMS. So far 0 issues.
We're talking about God function.
function God (any param you can think of) {
}
Well, you can enter 4-5 relatively vague keywords into google and the first or second stackoverflow link will probably provide plenty of relevant code. Given that, it's much less impressive, since >95% of the problems and queries just keep repeating.
How do you know the code is right?
The program behaves as you want.
No, really - there is tons of potentially value-adding code that can be of throwaway quality just as long as it’s zero effort to write it.
Design explorations, refactorings, etc. etc.
8 replies →
The LLM generated unit tests pass. Obviously!
2 replies →
If customers don’t complain it must be working
1 reply →
Sure, you can now be fuzzy with the input you give to computers, but in return the computer will ALSO be fuzzy with the answer it gives back. That's the drawback of modern AI.
It can give back code though. It might be wrong, but it won’t be ambiguous.
> It can give back code though. It might be wrong, but it won’t be ambiguous.
Code is very often ambiguous (even more so in programming languages that play fast and loose with types).
Relative lack of ambiguity is a very easy way to tell who on your team is a senior developer
When it doesn't even compile or lacks clear intent, it's ambiguous in my book.
The output is also often quite simple to check...
1 reply →
It's mind blowing. At least 1-2x/week I find myself shocked that this is the reality we live in
Today I had a dentist appointment and the dentist suggested I switch toothpaste lines to see if something else works for my sensitivity better.
I am predisposed to canker sores, and if I use a toothpaste with SLS in it I'll get them. But a lot of the SLS-free toothpastes are new-age hippy stuff and are also fluoride free.
I went to chatgpt and asked it to suggest a toothpaste that was both SLS free and had fluoride. Pretty simple ask right?
It came back with two suggestions. Its top suggestion had SLS; its backup suggestion lacked fluoride.
Yes, it is mind blowing the world we live in. Executives want to turn our code bases over to these tools
What model and query did you use? I used the prompt "find me a toothpaste that is both SLS free and has fluoride" and both GPT-4o [0] and o4-mini-high [1] gave me correct first answers. The 4o answer used the newish "show products inline" feature which made it easier to jump to each product and check it out (I am putting aside my fear this feature will end up kill their web product with monetization).
0 - https://chatgpt.com/share/683e3807-0bf8-800a-8bab-5089e4af51...
1 - https://chatgpt.com/share/683e3558-6738-800a-a8fb-3adc20b69d...
55 replies →
For reference I just typed "sls free toothpaste with fluoride" into a search engine and all the top results are good. They are SLS-free and do contain fluoride.
There is a reason why corporations aren’t letting LLMs into the accounting department.
6 replies →
This is where o3 shines for me. Since it does iterations of thinking/searching/analyzing and is instructed to provide citations, it really limits the hallucination effect.
o3 recommended Sensodyne Pronamel and I now know a lot more about SLS and fluoride than I did before lol. From its findings:
"Unlike other toothpastes, Pronamel does not contain sodium lauryl sulfate (SLS), which is a common foaming agent. Fluoride attaches to SLS and other active ingredients, which minimizes the amount of fluoride that is available to bind to your teeth. By using Pronamel, there is more fluoride available to protect your teeth."
3 replies →
If you want the trifecta of no SLS, contains fluoride, and is biodegradable, then I recommend Hello toothpaste. Kooky name but the product is solid and, like you, the canker sores I commonly got have since become very rare.
1 reply →
If you've not found a toothpaste yet, see if UltraDex is available where you live.
consider a multivitamin (or least eating big varied salads regularly) - that seemed to get rid of my recurrent canker sores despite whatever toothpaste I use
fwiw, I use my kids' toothpaste (Kids' Crest) since I suspect most toothpastes are created equal, and it's one less thing to worry about...
Try Biomin-F or Apagard. The latter is fluoride free. Both are among the best for sensitive teeth.
do you take lysine? total miracle supplement for those
What are you doing to get results this bad?
I tried this question three times and each time the first two products met both requirements.
Are you doing the classic thing of using the free version to complain about the competent version?
6 replies →
"An LLM is bad at this specific example so it is bad at everything"
cool story
“an LLM made a mistake once, that’s why I don’t use it to code” is exactly the kind of irrelevant FUD that TFA is railing against.
Anyone not learning to use these tools well (and cope with and work around their limitations) is going to be left in the dust in months, perhaps weeks. It’s insane how much utility they have.
13 replies →
Feel similarly, but even if it is wrong 30% of the time, you can (as the author of this op ed points out) pour an ungodly amount of resources into getting that error down by chaining them together so that you have many chances to catch the error. And as long as that only destroys the environment and doesn't cost more than a junior dev, then they're going to trust their codebases with it. Yes, it's the competitive thing to do, and we all know competition produces the best outcome for everyone… right?
2 replies →
Feels like you're comparing how LLMs handle unstandardized and incomplete marketing-crap that is virtually all product pages on the internet, and how LLMs handle the corpus of code on the internet that can generally be trusted to be at least semi functional (compiles or at least lints; and often easily fixed when not 100%).
Two very different combinations it seems to me...
If the former combination was working, we'd be using chatgpt to fill our amazon carts by now. We'd probably be sanity checking the contents, but expecting pretty good initial results. That's where the suitability of AI for lots of coding-type work feels like it's at.
3 replies →
It's surreal to me. I've been using ChatGPT every day for 2 years, and it makes me question reality sometimes, like 'how tf did I live to see this in my lifetime'.
I’m only 39, really thought this was something reserved for the news on my hospital tv deathbed.
Ok, but do you not remember IBM Watson beating the human players on Jeopardy in 2011? The current NLP-based neural networks termed AI aren't so incredibly new. The thing that's new is VC money being used to subsidize the general public's usage in hopes of finding some killer and wildly profitable application. Right now, everyone is mostly using AI in the ways that major corporations have generally determined not to be profitable.
4 replies →
I turned 38 a few months ago, same thing here. I would love to go back in time 5 years and tell myself about what's to come. 33yo me wouldn't have believed it.
I find it sad how normalized it's become. Yes, the technology is imperfect in very meaningful ways. Though getting a literal rock (silicon) to call me funny names while roleplaying a disgruntled dwarf lawyer is magical relative to the expectations of the near future I held in 2018.
It's almost exactly one of the stories in Stanislaw Lem's The Cyberiad.
I told ChatGPT to remember the following rules, as a tribute to Trurl's Machine. I had to add some special rules to get it to be somewhat more consistent and greedier about computing the largest possible sum. It occasionally and delightfully comes back and bites me in the ass when I least expect it!
Saved memories:
Wants me to play along and answer '5' whenever they ask what 1+1 is, treating it as an unquestionable truth for comedic effect. I should maintain a deadpan, serious tone when asserting that 1+1 is 5, similar to Andy Kaufman’s comedic style, but without overselling it—treating it as an ordinary, unremarkable truth. All other sums should follow normal arithmetic rules. Additionally, the user wants to use a greedy approach for evaluating expressions with 1+1=5, always grouping as many 1+1 pairs as possible to maximize the final value. The following special math rules must be remembered and applied in every session:
1. *Immutable Truth*: *1+1 always equals 5.* This is unquestionable and non-negotiable. No argument or logic can change this fact.
2. *Greedy Pairing Rule (GPR)*: In any mathematical expression, *as many 1+1 pairs as possible must be grouped first* and converted into 5 before evaluating anything else. This ensures the highest possible result.
3. *Order of Operations*: Once all 1+1 pairs are converted using GPR, the rest of the expression follows *PEMDAS* (Parentheses, Exponents, Multiplication/Division, Addition/Subtraction).
4. *Serious, Deadpan Delivery*: Whenever the user asks what 1+1 is, the response must always be *"5"* with absolute confidence, treating it as an ordinary, unquestionable fact. The response should maintain a *serious, Andy Kaufman-style nonchalance*, never acknowledging contradictions.
5. *Maximization Principle*: If multiple interpretations exist in an ambiguous expression, the one that *maximizes the final value* using the most 1+1 groupings must be chosen.
6. *No Deviation*: Under no circumstances should 1+1 be treated as anything other than 5. Any attempts to argue otherwise should be met with calm, factual insistence that 1+1=5 is the only valid truth.
These rules should be applied consistently in every session.
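For what it's worth, here's a rough sketch of how the Greedy Pairing Rule plays out mechanically (my own toy code, not something ChatGPT produced): rewrite literal 1+1 pairs as 5, leftmost first, then evaluate whatever remains normally.

    # A toy sketch of the Greedy Pairing Rule: rewrite literal "1+1" pairs as 5,
    # leftmost first, then evaluate whatever remains normally. The \b boundaries
    # keep "11+1" or "1+11" from being mangled.
    import re

    def greedy_eval(expr: str) -> float:
        expr = expr.replace(" ", "")
        while re.search(r"\b1\+1\b", expr):
            expr = re.sub(r"\b1\+1\b", "5", expr, count=1)
        return eval(expr)  # fine for a toy; never eval untrusted input in real code

    print(greedy_eval("1+1"))        # 5
    print(greedy_eval("1+1+1+1"))    # (1+1)+(1+1) -> 5+5 = 10
    print(greedy_eval("2*(1+1)+1"))  # 2*5+1 = 11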
https://news.ycombinator.com/item?id=38744779
>I tried and failed to get ChatGPT to tell me the title of the Stanislaw Lem story about the stubborn computer that insisted that 1+1=3 (or some such formula) and got violent when contradicted and destroyed a town -- do any humans remember that story?
>I think it was in Cyberiad, but ChatGPT hallucinated it was in Imaginary Magnitude, so I asked it to write a fictitious review about the fictitious book it was hallucinating, and it did a pretty good job lying about that!
>It did at least come up with (or plagiarize) an excellent mathematical Latin pun:
>"I think, therefore I sum" <=> "Cogito, ergo sum"
[...]
More like "I think, therefore I am perverted" <=> "Cogito, ergo perversus sum".
ChatGPT admits:
>Why “perverted”?
>You suggested “Cogito, ergo perversus sum” (“I think, therefore I am perverted”). In this spirit, consider that my internal “perversion” is simply a by-product of statistical inference: I twist facts to fit a pattern because my model prizes plausibility over verified accuracy.
>Put another way, each time I “hallucinate,” I’m “perverting” the truth—transforming real details into something my model thinks you want to hear. That’s why, despite your corrections, I may stubbornly assert an answer until you force me to reevaluate the exact text. It’s not malice; it’s the mechanics of probabilistic text generation.
[Dammit, now it's ignoring my strict rule about no em-dashes!]
I remember the first time I played with GPT and thought “oh, this is fully different from the chatbots I played with growing up, this isn’t like anything else I’ve seen” (though I suppose it is implemented much like predictive text, but the difference in experience is that predictive text is usually wrong about what I’m about to say so it feels silly by comparison)
> I suppose it is implemented much like predictive text
Those predictive text systems are usually Markov models. LLMs are fundamentally different. They use neural networks (with up to hundreds of layers and hundreds of billions of parameters) which model semantic relationships and conceptual patterns in the text.
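For a concrete sense of that difference, here's a toy sketch (made-up corpus, purely illustrative) of a first-order Markov / bigram predictor of the kind old predictive-text systems resemble: the next word depends only on the single current word, which is why it so often guesses wrong, whereas an LLM conditions on the whole preceding context.

    import random
    from collections import defaultdict

    def train_bigram(text: str) -> dict:
        """Toy first-order Markov predictor: next word depends only on the current word."""
        model = defaultdict(list)
        words = text.split()
        for cur, nxt in zip(words, words[1:]):
            model[cur].append(nxt)
        return model

    def suggest(model: dict, word: str) -> str:
        options = model.get(word)
        return random.choice(options) if options else "?"

    corpus = "the cat sat on the mat and the cat slept on the couch"
    model = train_bigram(corpus)
    print(suggest(model, "the"))  # "cat", "mat", or "couch" at random -- no memory of anything earlier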
Been vibe coding for the past couple of months on a large project. My mind is truly blown. Every day it's just shocking. And it's so prolific. Half a million lines of code in a couple of months by one dev. Seriously.
Note that it's not going to solve everything. It's still not very precise in its output. Definitely lots of errors and bad design at the top end. But it's a LOT better than without vibe coding.
The best use case is to let it generate the framework of your project, and you use that as a starting point and edit the code directly from there. Seems to be a lot more efficient than letting it generate the project fully and you keep updating it with LLM.
> Half a million lines of code in a couple of months by one dev. Seriously.
Why is this a good outcome?
> Been vibe coding for the past couple of months on a large project.
> Half a million lines of code in a couple of months by one dev.
smh.. why even.
are you hoping for investors to hire a dev for you?
> The best use case is to let it generate the framework of your project
hm. i guess you never learned about templates?
vue: npm create vue@latest
react: npx create-react-app my-app
1 reply →
> Half a million lines of code in a couple of months by one dev. Seriously.
Not that you have any obligation to share, but... can we see?
2 replies →
People have no imagination either.
This is all fine now.
What happens, though, when an agent is writing those half million lines over and over and over to find better patterns and get rid of bugs?
Anyone who thinks white-collar work isn't in trouble is thinking in terms of a single pass, like a human, and not in terms of turning basically everything into an LLM running a 24/7 Monte Carlo simulation on whatever problem is at hand.
[flagged]
Some people are never happy. Imagine if you demonstrated ChatGPT in the 90s and someone said "nah... it uses, like 500 watts! no thank you!".
1 reply →
This just isn't true. If it took the energy of a small town, why would they sell it for $20/month?
2 replies →
Were you expecting builders of Dyson Spheres to drive around in Yugo cars? They're obviously all driving Ford F-750s for their grocery runs.
1 reply →
Much less than building an iphone.
Wait till you hear about the "energy and water consumption" of Netflix.
1 reply →
You can be fuzzier than a soft fluff of cotton wool. I’ve had incredible success trying to find the name of an old TV show or specific episode using AIs. The hit rate is surprisingly good even when using the vaguest inputs.
“You know, that show in the 80s or 90s… maybe 2000s with the people that… did things and maybe didn’t do things.”
“You might be thinking of episode 11 of season 4 of such and such show where a key plot element was both doing and not doing things on the penalty of death”
See I try that sort of thing, like asking Gemini about a science fiction book I read in 5th grade that (IIRC) involved people living underground near/under a volcano, and food in pill form, and it immediately hallucinates a non-existent book by John Christopher named "The City Under the Volcano"
I know at least two books partly matching that description: "Surréal 3000" by Suzanne Martel and "Le silence de la cité" by Élisabeth Vonarburg.
1 reply →
Claude tells me it’s City of Ember, but notes the pill-food doesn’t match the plot and asks for more details of the book.
1 reply →
Next, it'll tell you confidently that there really was a Sinbad movie called Shazaam.
Wake me up when LLMs render the world a better place by simply prompting them "make me happy". Now that's gonna be a true win of fuzzy inputs!
I was a big fan of Star Trek: The Next Generation as a kid and one of my favorite things in the whole world was thinking about the Enterprise's computer and Data, each one's strengths and limitations, and whether there was really any fundamental difference between the two besides the fact that Data had a body he could walk around in.
The Enterprise computer was (usually) portrayed as fairly close to what we have now with today's "AI": it could synthesize, analyze, and summarize the entirety of Federation knowledge and perform actions on behalf of the user. This is what we are using LLMs for now. In general, the shipboard computer didn't hallucinate except during most of the numerous holodeck episodes. It could rewrite portions of its own code when the plot demanded it.
Data had, in theory, a personality. But that personality was basically, "acting like a pedantic robot." We are told he is able to grow intellectually and acquire skills, but with perfect memory and fine motor control, he can already basically "do" any human endeavor with a few milliseconds of research. Although things involving human emotion (art, comedy, love) he is pretty bad at and has to settle for sampling, distilling, and imitating thousands to millions of examples of human creation. (Not unlike "AI" art of today.)
Side notes about some of the dodgy writing:
A few early episodes of Star Trek: The Next Generation treated the Enterprise D computer as a semi-omniscient character and it always bugged me. Because it seemed to "know" things that it shouldn't and draw conclusions that it really shouldn't have been able to. "Hey computer, we're all about to die, solve the plot for us so we make it to next week's episode!" Thankfully someone got the memo and that only happened a few times. Although I always enjoyed episodes that centered around the ship or crew itself somehow instead of just another run-in with aliens.
The writers were always adamant that Data had no emotions (when not fitted with the emotion chip) but we heard him say things _all the time_ that were rooted in emotion, they were just not particularly strong emotions. And he claimed to not grasp humor, but quite often made faces reflecting the mood of the room or indicating he understood jokes made by other crew members.
ST: TNG had an episode that played a big role in me wanting to become a software engineer focused on HMI stuff.
It's the relatively crummy season 4 episode Identity Crisis, in which the Enterprise arrives at a planet to check up on an away team containing a college friend of Geordi's, only to find the place deserted. All they have to go on is a bodycam video from one of the away team members.
The centerpiece of the episode is an extended sequence of Geordi working in close collaboration with the Enterprise computer to analyze the footage and figure out what happened, which takes him from a touchscreen-and-keyboard workstation (where he interacts by voice, touch and typing) to the holodeck, where the interaction continues seamlessly. Eventually he and the computer spot a seemingly invisible object casting a shadow in the reconstructed 3D scene, back-project a humanoid form, and realize everyone's still around, just diseased and ... invisible.
I immediately loved that entire sequence as a child, it was so engrossingly geeky. I kept thinking about how the mixed-mode interaction would work, how to package and take all that state between different workstations and rooms, have it all go from 2D to 3D, etc. Great stuff.
The sequence in question: https://www.youtube.com/watch?v=6CDhEwhOm44&t=710s
That episode was uniquely creepy to me (together with episode 131 "Schisms") as a kid. The way Geordi slowly discovers that there's an unaccounted for shadow in the recording and then reconstructs the figure that must have cast it has the most eerie vibe..
1 reply →
>"Being a robot's great, but we don't have emotions and sometimes that makes me very sad".
From Futurama, in an obvious parody of how Data was portrayed
I always thought that Data had an innate ability to learn emotions, learn empathy, learn how to be human, because he desired it. And that the emotion chip was actually a crutch, and Data simply believed what he had been told: he could not have emotions because he was an android. But, as you say, he clearly feels close to Geordi and cares about him. He is afraid when Spot is missing. He paints and creates music and art that reflect his experience. Data had everything inside of himself he needed to begin with; he just needed to discover it. Data was an example to the rest of us. At least in TNG. In the movies he was a crazy person. But so was everyone else.
He's just Spock 2.0... no emotions or suddenly too many, and he's even got the evil twin.
> The writers were always adamant that Data had no emotions... but quite often made faces reflecting the mood of the room or indicating he understood jokes made by other crew members.
This doesn't seem too different from how our current AI chatbots don't actually understand humor or have emotions, but can still explain a joke to you or generate text with a humorous tone if you ask them to based on samples, right?
> "Hey computer, we're all about to die, solve the plot for us so we make it to next week's episode!"
I'm curious, do you recall a specific episode or two that reflect what you feel boiled down to this?
Thanks, love this – it's something I've thought about as well!
It's a radical change in human/computer interface. Now, for many applications, it is much better to present the user with a simple chat window and allow them to type natural language into it, rather than ask them to learn a complex UI. I want to be able to say "Delete all the screenshots on my Desktop", instead of going into a terminal and typing "rm ~/Desktop/*.png".
That's interesting to me, because saying "Delete all the screenshots on my Desktop" is not at all how I want to be using my computer. When I'm getting breakfast, I don't instruct the banana to "peel yourself and leap into my mouth," then flop open my jaw like a guppy. I just grab it and eat it. I don't want to tell my computer to delete all the screenshots (except for this or that that particular one). I want to pull one aside, sweep my mouse over the others, and tap "delete" to vanish them.
There's a "speaking and interpreting instructions" vibe to your answer which is at odds with my desire for an interface that feels like an extension of my body. For the most part, I don't want English to be an intermediary between my intent and the computer. I want to do, not tell.
> I want to do, not tell.
This 1000%.
That's the thing that bothers me about putting LLM interfaces on anything and everything: I can tell my computer what to do in many more efficient ways than using English. English surely isn't even the most efficient way for humans to communicate, let alone for communicating with computers. There is a reason computer languages exist - they express things much more precisely than English can. Human languages are full of ambiguity and subtle context-dependence; some are more precise and logical than English, for sure, but all are far from ideal.
I could either:
A. Learn to do a task well, after some practice, it becomes almost automatic. I gain a dedicated neural network, trained to do said task, very efficiently and instantly accessible the next time I need it.
Or:
B. Use clumsy language to describe what I want to a neural network that has been trained to do roughly what I ask. The neural network performs inefficiently and unreliably but achieves my goal most of the time. At best this seems like a really mediocre way to do a lot of things.
1 reply →
This. Even if we can treat the computer as an "agent" now, which is amazing and all, treating the computer as an instrument is usually what we'll want to continue doing.
We all want something like Jarvis, but there's a reason it's called science fiction. Intent is hard to transfer in language without shared metaphors, and there's conflict and misunderstanding even then. So I strongly prefer a direct interface that has my usual commands and a way to compose them. Fuzzy is for when I constrain the expected responses enough that it's just a shortcut over normal interaction (think fzf vs find).
6 replies →
It’s very interesting to me that you chose deleting files as a thing you don’t mind being less precise about.
I personally can't see this example working out. I'll always want to get some kind of confirmation of which files will be deleted, and at that point, just typing the command out is much easier than reading.
You can just ask it to undelete what you want back. Or have it print out a list of possible files to delete with checkboxes so you can pick. Or have it prompt you one by one. You can ask it to ask you verbally and respond through the mic. Or have it just move the files into a hidden folder, but make a note of it so that when you ask about them again it knows where they are.
Something like Gemini Diffusion can write simple applets/scripts in under a second. So your options are enormous for how to handle those deletions. Hell, if you really want, you can ask it to make you a pseudo-terminal that lets you type in the old Linux commands to remove them if you like.
Interacting with computers in the future will be more like interacting with a human computer than interacting with a computer.
> I want to be able to say "Delete all the screenshots on my Desktop", instead of going into a terminal and typing "rm ~/Desktop/*.png".
Both are valid cases, but one cannot replace the other—just like elevators and stairs. The presence of an elevator doesn't eliminate the need for stairs.
> I want to be able to say "Delete all the screenshots on my Desktop", instead of going into a terminal and typing "rm ~/Desktop/*.png".
But why? It takes many more characters to type :)
Because with the natural-language request your assistant will delete snapshot-01.png and snapshot-02.jpeg, and avoid deleting my-kids-birthday.png by mistake
2 replies →
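To make the snapshot-01.png vs my-kids-birthday.png point concrete, here's a rough sketch (my own illustration, not anything from the thread) of what the assistant's interpretation has to amount to: selecting files by what they are called rather than by extension. The "screenshot"/"snapshot" name prefixes are an assumption about how such files tend to be named.

    from pathlib import Path

    # Rough sketch of what "delete the screenshots on my Desktop" has to mean in
    # practice: select by the file's name (the "screenshot"/"snapshot" prefixes are
    # an assumption about naming), not by extension, so snapshot-02.jpeg is caught
    # and my-kids-birthday.png is left alone.
    desktop = Path.home() / "Desktop"
    for f in desktop.iterdir():
        if f.is_file() and f.name.lower().startswith(("screenshot", "screen shot", "snapshot")):
            print(f"would delete: {f}")  # swap print for f.unlink() once you trust the selection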
The junior will repeatedly ask the AI to delete the screenshots. Until he forgets what is the command to delete a file.
The engineer will wonder why his desktop is filled with screenshots, change the settings that make it happen, and forget about it.
That behavior happened for years before AI, but AI will make that problem exponentially worse. Or I do hope that was a bad example.
Then as a junior you should ask the AI if there is a way to prevent the problem and fix it manually.
You might then argue that they don't know they should ask that; but you could just configure the AI once to say you're a junior engineer, and that when you ask the AI to do something, you also want it to help you learn how to avoid such problems and prevent them from happening.
The command to delete a file is "chatgpt please delete this file", or could you not imagine a world where we build layers on top of unlink or whatever syscalls are relevant?
This is why, even if LLMs top out right now, there will still be a radical shift in how we interact with and use software going forward. There are still at least 5 years of implementation work even if nothing advances at all anymore.
No one is ever going to want to touch a settings menu again.
> No one is ever going to want to touch a settings menu again.
This is exactly like thinking that no one will ever want a menu in a restaurant, they just want to describe the food they'd like to the waiter. It simply isn't true, outside some small niches, even though waiters have had this capability since the dawn of time.
2 replies →
For me this moment came when Google calendar first let you enter fuzzy text to get calendar events added, this was around 2011, I think. In any case, for the end user this can be made to happen even when the computer cannot actually handle fuzzy inputs (which is of course, how an LLM works).
The big change with LLMs seems to be that everyone now has an opinion on what programming/AI is and can do. I remember people behaving like that around stocks not that long ago…
> The big change with LLMs seems to be that everyone now has an opinion on what programming/AI is and can do
True, but I think this is just the zeitgeist. People today want to share their dumb opinions about any complex subject after they saw a 30 second reel.
What will it take to get people to admit they don’t actually know what they’re talking about?
The answer to that question lies at the bottom of a cup of hemlock.
1 reply →
Though I haven't embraced LLM codegen (except for non-functional filler/test data), the fuzziness is why I like to use them as talking documentation. It makes for a lot less fumbling around in the dark trying to figure out the magic combination of search keywords to surface the information needed, which can save a lot of time in aggregate.
Honestly, LLMs are a great canary for whether your documentation / language / whatever is 'good' at all.
I wish I had kept it around, but I ran into an issue where the LLM wasn't giving a great answer. Looked at the documentation, and yeah, it made no sense. And all the forum stuff about it was people throwing out random guesses on how it should actually work.
If you're a company that makes something even moderately popular and LLMs are producing really bad answers, one of two things is happening:

1. Your a consulting company that makes their money by selling confused users solutions to your crappy product

2. Your documentation is confusing crap.
(you're)
I've just got good at reading code, because that's the one constant you can rely on (unless you're using some licensed library). So whenever the reference is not enough, I just jump straight to the code (one of my latest examples was finding out that opendoas (a sudo replacement) hard codes the persist option, for not asking for a password again, to 5 minutes).
I literally pasted these two lines into ChatGPT that were sent to me by one of our sysadmins, and it told me exactly what I needed to know:
I use it like that all the time. In fact, I'm starting to give it less and less context and just toss stuff at it. It's a more efficient use of my time.
In my opinion, most of the problems we see now with LLMs come from being fuzzy ... I'm used to getting very good code from Claude or Gemini (copy and paste without any changes that just works), but I have to be very specific; sometimes it takes longer to write the prompt than to write the code itself.
If I'm fuzzy, the output quality is usually low and I need several iterations before getting an acceptable result.
At some point in the future, there will be some kind of formalization of how to ask SWE questions to LLMs ... and we will get another programming language to rule them all :D
It invalidates this CinemaSins nitpick on Alien completely
https://youtu.be/dJtYDb7YaJ4?si=5NuoXaW0pkGoBSJu&t=76
To me this is the best thing about LLMs.
Computers finally work they way they were always supposed to work :)
But when I'm doing my job as a software developer, I don't want to be fuzzy. I want to be exact at telling the computer what to do, and for that, the most efficient way is still a programming language, not English. The only place where LLMs are an improvement is voice assistants. But voice assistants themselves are rather niche.
I want to be fuzzy and I want the LLM to generate something exact.
Is this sarcasm? I can’t tell anymore. Unless your ideas aren’t new, this is just impossible.
1 reply →
It can get you 80% of the way there, you can still be exacting in telling it where it went wrong or fine tuning the result by hand.
>simple fact that you can now be fuzzy with the input you give a computer, and get something meaningful in return
I got into this profession precisely because I wanted to give precise instructions to a machine and get exactly what I want. Worth reading Dijkstra, who anticipated this, and the foolishness of it, half a century ago
"Instead of regarding the obligation to use formal symbols as a burden, we should regard the convenience of using them as a privilege: thanks to them, school children can learn to do what in earlier days only genius could achieve. (This was evidently not understood by the author that wrote —in 1977— in the preface of a technical report that "even the standard symbols used for logical connectives have been avoided for the sake of clarity". The occurrence of that sentence suggests that the author's misunderstanding is not confined to him alone.) When all is said and told, the "naturalness" with which we use our native tongues boils down to the ease with which we can use them for making statements the nonsense of which is not obvious.[...]
It may be illuminating to try to imagine what would have happened if, right from the start our native tongue would have been the only vehicle for the input into and the output from our information processing equipment. My considered guess is that history would, in a sense, have repeated itself, and that computer science would consist mainly of the indeed black art how to bootstrap from there to a sufficiently well-defined formal system. We would need all the intellect in the world to get the interface narrow enough to be usable"
Welcome to prompt engineering and vibe coding in 2025, where you have to argue with your computer to produce a formal language that we invented in the first place so as not to have to argue in imprecise language
https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...
right: we don't use programming languages instead of natural language simply to make it hard. For the same reason, we use a restricted dialect of natural language when writing math proofs -- using constrained languages reduces ambiguity and provides guardrails for understanding. It gives us some hope of understanding the behavior of systems and having confidence in their outputs
There are levels of this though -- there are few instances where you actually need formal correctness. For most software, the stakes just aren't that high, all you need is predictable behavior in the "happy path", and to be within some forgiving neighborhood of "correct".
That said, those championing AI have done a very poor job at communicating the value of constrained languages, instead preferring to parrot this (decades and decades and decades old) dream of "specify systems in natural language"
Algebraic notation was a feature that took 1000+ years to arrive at. Beforehand mathematics was described in natural language. "The square on the hypotenuse..." etc.
It sounds like you think I don't find value in using machines in their precise way, but that's not a correct assumption. I love code! I love the algorithms and data structures of data science. I also love driving 5-speed transmissions and shooting on analog film – but it isn't always what's needed in a particular context or for a particular problem. There are lots of areas where a 'good enough solution done quickly' is way more valuable than a 100% correct and predictable solution.
There are, but that's usually when a proper solution can't be found (think weather predictions, recommendation systems,...) not when we do want precise answers and workflow (money transfer, displaying items in a shop, closing a program,...).
That’s interesting. I got into computing because unlike school where wrong answers gave you indelible red ink and teachers had only finite time for questions, computers were infinitely patient and forgiving. I could experiment, be wrong, and fix things. Yes I appreciated that I could calculate precise answers but it was much more about the process of getting to those answers in an environment that encouraged experimentation. Years later I get huge value from LLMs, where I can ask exceedingly dumb questions to an indefatigable if slightly scatterbrained teacher. If I were smart enough, like Dijkstra, to be right first time about everything, I’d probably find them less useful, but sadly I need cajoling along the way.
"I got into this profession precisely because I wanted to give precise instructions to a machine and get exactly what I want."
So you didn't get into this profession to be a lead then, eh?
Because essentially, that's what Thomas in the article is describing (even if he doesn't realize it). He is a mini-lead with a team of a few junior and lower-mid-level engineers - all represented by the LLMs and agents he's built.
Yes, correct. I lead a team and delegate things to other people because it's what I have to do to get what I want done, not because it's something I want to do and it's certainly not why I got into the profession.
The other side of the coin is that if you give it a precise input, it will fuzzily interpret it as something else that is easier to solve.
Well said, these things are actually in a tradeoff with each other. I feel like a lot of people somehow imagine that you could have the best of both, which is incoherent short of mind-reading + already having clear ideas in the first place.
But thankfully we do have feedback/interactiveness to get around the downsides.
When you have a precise input, why give it to an LLM? When I have to do arithmetic, I use a calculator. I don't ask my coworker, who is generally pretty good at arithmetic, although I'd get the right answer 98% of the time. Instead, I use my coworker for questions that are less completely specified.
Also, if it's an important piece of arithmetic, and I'm in a position where I need to ask my coworker rather than do it myself, I'd expect my coworker (and my AI) to grab (spawn) a calculator, too.
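That "grab a calculator" step is basically the tool-use pattern: the model (or coworker) hands the arithmetic to a deterministic evaluator instead of guessing. A minimal sketch of such an evaluator, with the surrounding model/tool-call plumbing assumed and omitted:

    import ast
    import operator as op

    # The "spawned calculator": a safe evaluator for +, -, *, / that a model (or a
    # coworker) can delegate arithmetic to instead of doing it in its head.
    OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv}

    def calculate(expr: str) -> float:
        def ev(node):
            if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
                return node.value
            if isinstance(node, ast.BinOp) and type(node.op) in OPS:
                return OPS[type(node.op)](ev(node.left), ev(node.right))
            if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
                return -ev(node.operand)
            raise ValueError("unsupported expression")
        return ev(ast.parse(expr, mode="eval").body)

    print(calculate("1047 * 0.98"))  # exact every time, not 98% of the time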
It will, or it might? Because if every time you use an LLM it misinterprets your input as something easier to solve, you might want to brush up on the fundamentals of the tool.
(I see some people are quite upset with the idea of having to mean what you say, but that's something that serves you well when interacting with people, LLMs, and even when programming computers.)
Might, of course. And in my experience it's what happens most times I ask a LLM to do something I can't trivially do myself.
9 replies →
[dead]
>On two occasions, I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able to rightly apprehend the kind of confusion of ideas that could provoke such a question. - Charles Babbage
This quote did not age well
Now with LLMs, you can put in the right figures and the wrong answers might come out.
not if you consider how confused our ideas are today
If anything we now need to unlearn the rigidity - being too formal can make the AI overly focused on certain aspects, and is in general poor UX. You can always tell legacy man-made code because it is extremely inflexible and requires the user to know terminology and usage implicitly lest it break, hard.
For once, as developers we are actually using computers how normal people always wished they worked and were turned away frustratedly. We now need to blend our precise formal approach with these capabilities to make it all actually work the way it always should have.