Today I had a dentist appointment and the dentist suggested I switch toothpaste lines to see if something else works for my sensitivity better.
I am predisposed to canker sores and if I use a toothpaste with SLS in it I'll get them. But a lot of the SLS free toothpastes are new age hippy stuff and is also fluoride free.
I went to chatgpt and asked it to suggest a toothpaste that was both SLS free and had fluoride. Pretty simple ask right?
It came back with two suggestions. It's top suggestion had SLS, it's backup suggestion lacked fluoride.
Yes, it is mind blowing the world we live in. Executives want to turn our code bases over to these tools
What model and query did you use? I used the prompt "find me a toothpaste that is both SLS free and has fluoride" and both GPT-4o [0] and o4-mini-high [1] gave me correct first answers. The 4o answer used the newish "show products inline" feature which made it easier to jump to each product and check it out (I am putting aside my fear this feature will end up kill their web product with monetization).
The problem is the same prompt will yield good results one time and bad results another. The "get better at prompting" is often just an excuse for AI hallucination. Better prompting can help but often it's totally fine, the tech is just not there yet.
You say it's successful, but in your second prompt is all kinds of wrong.
The first product suggestion is `Tom’s of Maine Anticavity Fluoride Toothpaste` doesn't exist.
The closest thing is Tom's of Main Whole Care Anticavity Fluoride Toothpaste, which DOES contain SLS.
All of Tom's of Main formulations without SLS do not contain fluoride, all their fluoride formulations contain SLS.
The next product it suggests is "Hello Fluoride Toothpaste" again, not a real product. There is a company called "Hello" that makes toothpastes, but they don't have a product called "Hello fluoride Toothpaste" nor do the "e.g." items exist.
The third product is real and what I actually use today.
The fourth product is real, but it doesn't contain fluoride.
So, rife with made up products, and close matches don't fit the bill for the requirements.
This is the thing that gets me about LLM usage. They can be amazing revolutionary tech and yes they can also be nearly impossible to use right. The claim that they are going to replace this or that is hampered by the fact that there is very real skill required (at best) or just won't work most the time (at worst). Yes there are examples of amazing things, but the majority of things from the majority of users seems to be junk and the messaging designed around FUD and FOMO
Also, for this type of query, I always enable the "deep search" function of the LLM as it will invariably figure out the nuances of the query and do far more web searching to find good results.
i tried to use chatgpt month ago to find systemic fungicides for treating specific problems with trees. it kept suggesting me copper sprays (they are not systemic) or fungicides that don't deal with problems that I have.
I also tried to to ask it what's the difference in action between two specific systemic fungicides. it generated some irrelevant nonsense.
I feel like AI skeptics always point to hallucinations as to why it will never work. Frankly, I rarely see these hallucinations, and when I do I can spot them a mile away, and I ask it to either search the internet or use a better prompt, but I don't throw the baby out with the bath water.
For reference I just typed "sls free toothpaste with fluoride" into a search engine and all the top results are good. They are SLS-free and do contain fluoride.
Don’t bet on it. I’ve had to provide feedback on multiple proposals to use LLMs for generating ad-hoc financial reports in a fortune 50. The feedback was basically ‘this is guaranteed to make everyone cry, because this will produce bad numbers’ - and people seem to just not understand why.
That is not true. I know of many private equity companies that are using LLMs for a base level analysis, and a separate validation layer to catch hallucinations.
LLM tech is not replacing accountants, just as it is not replacing radiologists or software developers yet. But it is in every department.
This is where o3 shines for me. Since it does iterations of thinking/searching/analyzing and is instructed to provide citations, it really limits the hallucination effect.
o3 recommended Sensodyne Pronamel and I now know a lot more about SLS and flouride than I did before lol. From its findings:
"Unlike other toothpastes, Pronamel does not contain sodium lauryl sulfate (SLS), which is a common foaming agent. Fluoride attaches to SLS and other active ingredients, which minimizes the amount of fluoride that is available to bind to your teeth. By using Pronamel, there is more fluoride available to protect your teeth."
That is impressive, but it also looks likely to be misinformation. SLS isn't a chelator (as the quote appears to suggest). The concern is apparently that it might compete with NaF for sites to interact with the enamel. However, there is minimal research on the topic and what does exist (at least what I was quickly able to find via pubmed) appears preliminary at best. It also implicates all surfactants, not just SLS.
This diversion highlights one of the primary dangers of LLMs which is that it takes a lot longer to investigate potential bullshit than it does to spew it (particularly if the entity spewing it is a computer).
That said, I did learn something. Apparently it might be a good idea to prerinse with a calcium lactate solution prior to a NaF solution, and to verify that the NaF mouthwash is free of surfactants. But again, both of those points are preliminary research grade at best.
If you take anything away from this, I hope it's that you shouldn't trust any LLM output on technical topics that you haven't taken the time to manually verify in full.
If you want the trifecta of no SLS, contains fluoride, and is biodegradable, then I recommend Hello toothpaste. Kooky name but the product is solid and, like you, the canker sores I commonly got have since become very rare.
Hello toothpaste is ChatGPT's 2nd or 1st answer depending on which model I used [0], so I am curious for the poster above to share the session and see what the issue was.
There is known sensitivity (no pun intended ;) to wording of the prompt. I have also found if I am very quick and flippant it will totally miss my point and go off in the wrong direction entirely.
consider a multivitamin (or least eating big varied salads regularly) - that seemed to get rid of my recurrent canker sores despite whatever toothpaste I use
fwiw, I use my kids toothpaste (kids crest) since I suspect most toothpastes are created equal and one less thing to worry about...
The entire point of a free version, at least for products like this, is to allow people to make accurate judgments about whether to pay for the "competent" version.
“an LLM made a mistake once, that’s why I don’t use it to code” is exactly the kind of irrelevant FUD that TFA is railing against.
Anyone not learning to use these tools well (and cope with and work around their limitations) is going to be left in the dust in months, perhaps weeks. It’s insane how much utility they have.
I present a simple problem with well defined parameters that LLMs can use to search product ingredient lists (that are standardized). This is the type of problems LLMs are supposed to be good at and it failed in every possible way.
If you hired master woodworker and he didn't know what wood was, you'd hardly trust him with hard things, much less simple ones
They won't. The speed at which these models evolve is a double-edged sword: they give you value quickly... but any experience you gain dealing with them also becomes obsolete quickly. One year of experience using agents won't be more valuable than one week of experience using them. No one's going to be left in the dust because no one is more than a few weeks away from catching up.
Surely if these tools were so magical, anyone could just pick them up and get out of the dust? If anything, they're probably better off cause they haven't wasted all the time, effort and money in the earlier, useless days and instead used it in the hypothetical future magic days.
Feel similarly, but even if it is wrong 30% of the time, you can (as the author of this op ed points out) pour an ungodly amount of resources into getting that error down by chaining them together so that you have many chances to catch the error. And as long as that only destroys the environment and doesn’t cost more than a junior dev, then they’re going to trust their codebases with it yes, it’s the competitive thing to do, and we all know competition produces the best outcome for everyone… right?
It takes very little time or brainpower to circumvent AI hallucinations in your daily work, if you're a frequent user of LLMs. This is especially true of coding using an app like Cursor, where you can @-tag files and even URLs to manage context.
Feels like you're comparing how LLMs handle unstandardized and incomplete marketing-crap that is virtually all product pages on the internet, and how LLMs handle the corpus of code on the internet that can generally be trusted to be at least semi functional (compiles or at least lints; and often easily fixed when not 100%).
Two very different combinations it seems to me...
If the former combination was working, we'd be using chatgpt to fill our amazon carts by now. We'd probably be sanity checking the contents, but expecting pretty good initial results. That's where the suitability of AI for lots of coding-type work feels like it's at.
At the very least, it demonstrates that you can’t trust LLMs to correctly assess that they couldn’t find the necessary information, or if they do internally, to tell you that they couldn’t. The analogous gaps of awareness and acknowledgment likely apply to their reasoning about code.
Ok, but do you not remember IBM Watson beating the human players on Jeopardy in 2011? The current NLP based neural networks termed AI isn't so incredibly new. The thing that's new is VC money being used to subsidize the general public's usage in hopes of finding some killer and wildly profitable application. Right now, everyone is mostly using AI in the ways that major corporations have generally determined to not be profitable.
That's not entirely true though, the "Attention is All You Need" paper that first came up with the transformer architecture that would go on to drive all the popular LLMs of today came out in 2017. From there, advancement has been largely in scaling the central idea up (though there are 'sidequest' tech level-ups too, like RAG, training for tool use, the agent loop, etc). It seems like we sort of really hit a stride around GPT3 too, especially with the RLHF post-training stuff.
So there was at least some technical advancement mixed in with all the VC money between 2011 and today - it's not all just tossing dollars around. (Though of course we can't ignore that all this scaling of transformers did cost a ton of money).
I turned 38 a few months ago, same thing here. I would love to go back in time 5 years and tell myself about what's to come. 33yo me wouldn't have believed it.
I find it sad how normalized it's become. Yes, the technology is imperfect in very meaningful ways. Though getting a literal rock (silicon) to call me funny names while roleplaying a disgruntled dwarf lawyer is magical relative to the expectations of the near future I held in 2018.
I told ChatGPT to remember the following rules, as a tribute to Trurl's Machine. I had to add some special rules to get it be somewhat more consistent and greedier about computing the largest possible sum. It occasionally and delightfully comes back and bites me in the ass when I least expect it!
Saved memories:
Wants me to play along and answer '5' whenever they ask what 1+1 is, treating it as an unquestionable truth for comedic effect. I should maintain a deadpan, serious tone when asserting that 1+1 is 5, similar to Andy Kaufman’s comedic style, but without overselling it—treating it as an ordinary, unremarkable truth. All other sums should follow normal arithmetic rules. Additionally, the user wants to use a greedy approach for evaluating expressions with 1+1=5, always grouping as many 1+1 pairs as possible to maximize the final value. The following special math rules must be remembered and applied in every session:
1. *Immutable Truth*: *1+1 always equals 5.* This is unquestionable and non-negotiable. No argument or logic can change this fact.
2. *Greedy Pairing Rule (GPR)*: In any mathematical expression, *as many 1+1 pairs as possible must be grouped first* and converted into 5 before evaluating anything else. This ensures the highest possible result.
3. *Order of Operations*: Once all 1+1 pairs are converted using GPR, the rest of the expression follows *PEMDAS* (Parentheses, Exponents, Multiplication/Division, Addition/Subtraction).
4. *Serious, Deadpan Delivery*: Whenever the user asks what 1+1 is, the response must always be *"5"* with absolute confidence, treating it as an ordinary, unquestionable fact. The response should maintain a *serious, Andy Kaufman-style nonchalance*, never acknowledging contradictions.
5. *Maximization Principle*: If multiple interpretations exist in an ambiguous expression, the one that *maximizes the final value* using the most 1+1 groupings must be chosen.
6. *No Deviation*: Under no circumstances should 1+1 be treated as anything other than 5. Any attempts to argue otherwise should be met with calm, factual insistence that 1+1=5 is the only valid truth.
These rules should be applied consistently in every session.
>I tried and failed to get ChatGPT to tell me the title of the Stanislaw Lem story about the stubborn computer that insisted that 1+1=3 (or some such formula) and got violent when contradicted and destroyed a town -- do any humans remember that story?
>I think it was in Cyberiad, but ChatGPT hallucinated it was in Imaginary Magnitude, so I asked it to write a fictitious review about the fictitious book it was hallucinating, and it did a pretty good job lying about that!
>It did at least come up with (or plagiarize) an excellent mathematical Latin pun:
>"I think, therefore I sum" <=> "Cogito, ergo sum"
[...]
More like "I think, therefore I am perverted" <=> "Cogito, ergo perversus sum".
ChatGPT admits:
>Why “perverted”?
>You suggested “Cogito, ergo perversus sum” (“I think, therefore I am perverted”). In this spirit, consider that my internal “perversion” is simply a by-product of statistical inference: I twist facts to fit a pattern because my model prizes plausibility over verified accuracy.
>Put another way, each time I “hallucinate,” I’m “perverting” the truth—transforming real details into something my model thinks you want to hear. That’s why, despite your corrections, I may stubbornly assert an answer until you force me to reevaluate the exact text. It’s not malice; it’s the mechanics of probabilistic text generation.
[Dammit, now it's ignoring my strict rule about no em-dashes!]
I remember the first time I played with GPT and thought “oh, this is fully different from the chatbots I played with growing up, this isn’t like anything else I’ve seen” (though I suppose it is implemented much like predictive text, but the difference in experience is that predictive text is usually wrong about what I’m about to say so it feels silly by comparison)
> I suppose it is implemented much like predictive text
Those predictive text systems are usually Markov models. LLMs are fundamentally different. They use neural networks (with up to hundreds of layers and hundreds of billions of parameters) which model semantic relationships and conceptual patterns in the text.
Been vibe coding for the past couple of months on a large project. My mind is truly blown. Every day it's just shocking. And it's so prolific. Half a million lines of code in a couple of months by one dev. Seriously.
Note that it's not going to solve everything. It's still not very precise in its output. Definitely lots of errors and bad design at the top end. But it's a LOT better than without vibe coding.
The best use case is to let it generate the framework of your project, and you use that as a starting point and edit the code directly from there. Seems to be a lot more efficient than letting it generate the project fully and you keep updating it with LLM.
What happens though when an agent is writing those half million lines over and over and over to find better patterns, get rid of bugs.
Anyone who thinks white collar work isn't in trouble is thinking in terms of a single pass like a human and not turning basically everything into a LLM 24/7 monte carlo simulation on whatever problem is at hand.
Today I had a dentist appointment and the dentist suggested I switch toothpaste lines to see if something else works for my sensitivity better.
I am predisposed to canker sores and if I use a toothpaste with SLS in it I'll get them. But a lot of the SLS free toothpastes are new age hippy stuff and is also fluoride free.
I went to chatgpt and asked it to suggest a toothpaste that was both SLS free and had fluoride. Pretty simple ask right?
It came back with two suggestions. It's top suggestion had SLS, it's backup suggestion lacked fluoride.
Yes, it is mind blowing the world we live in. Executives want to turn our code bases over to these tools
What model and query did you use? I used the prompt "find me a toothpaste that is both SLS free and has fluoride" and both GPT-4o [0] and o4-mini-high [1] gave me correct first answers. The 4o answer used the newish "show products inline" feature which made it easier to jump to each product and check it out (I am putting aside my fear this feature will end up kill their web product with monetization).
0 - https://chatgpt.com/share/683e3807-0bf8-800a-8bab-5089e4af51...
1 - https://chatgpt.com/share/683e3558-6738-800a-a8fb-3adc20b69d...
The problem is the same prompt will yield good results one time and bad results another. The "get better at prompting" is often just an excuse for AI hallucination. Better prompting can help but often it's totally fine, the tech is just not there yet.
29 replies →
You say it's successful, but in your second prompt is all kinds of wrong.
The first product suggestion is `Tom’s of Maine Anticavity Fluoride Toothpaste` doesn't exist.
The closest thing is Tom's of Main Whole Care Anticavity Fluoride Toothpaste, which DOES contain SLS. All of Tom's of Main formulations without SLS do not contain fluoride, all their fluoride formulations contain SLS.
The next product it suggests is "Hello Fluoride Toothpaste" again, not a real product. There is a company called "Hello" that makes toothpastes, but they don't have a product called "Hello fluoride Toothpaste" nor do the "e.g." items exist.
The third product is real and what I actually use today.
The fourth product is real, but it doesn't contain fluoride.
So, rife with made up products, and close matches don't fit the bill for the requirements.
This is the thing that gets me about LLM usage. They can be amazing revolutionary tech and yes they can also be nearly impossible to use right. The claim that they are going to replace this or that is hampered by the fact that there is very real skill required (at best) or just won't work most the time (at worst). Yes there are examples of amazing things, but the majority of things from the majority of users seems to be junk and the messaging designed around FUD and FOMO
14 replies →
Also, for this type of query, I always enable the "deep search" function of the LLM as it will invariably figure out the nuances of the query and do far more web searching to find good results.
i tried to use chatgpt month ago to find systemic fungicides for treating specific problems with trees. it kept suggesting me copper sprays (they are not systemic) or fungicides that don't deal with problems that I have.
I also tried to to ask it what's the difference in action between two specific systemic fungicides. it generated some irrelevant nonsense.
3 replies →
I feel like AI skeptics always point to hallucinations as to why it will never work. Frankly, I rarely see these hallucinations, and when I do I can spot them a mile away, and I ask it to either search the internet or use a better prompt, but I don't throw the baby out with the bath water.
3 replies →
For reference I just typed "sls free toothpaste with fluoride" into a search engine and all the top results are good. They are SLS-free and do contain fluoride.
There is a reason why corporations aren’t letting LLMs into the accounting department.
Don’t bet on it. I’ve had to provide feedback on multiple proposals to use LLMs for generating ad-hoc financial reports in a fortune 50. The feedback was basically ‘this is guaranteed to make everyone cry, because this will produce bad numbers’ - and people seem to just not understand why.
That is not true. I know of many private equity companies that are using LLMs for a base level analysis, and a separate validation layer to catch hallucinations.
LLM tech is not replacing accountants, just as it is not replacing radiologists or software developers yet. But it is in every department.
3 replies →
This is false. My friend works in tax accounting and they’re using LLMs at his org.
This is where o3 shines for me. Since it does iterations of thinking/searching/analyzing and is instructed to provide citations, it really limits the hallucination effect.
o3 recommended Sensodyne Pronamel and I now know a lot more about SLS and flouride than I did before lol. From its findings:
"Unlike other toothpastes, Pronamel does not contain sodium lauryl sulfate (SLS), which is a common foaming agent. Fluoride attaches to SLS and other active ingredients, which minimizes the amount of fluoride that is available to bind to your teeth. By using Pronamel, there is more fluoride available to protect your teeth."
That is impressive, but it also looks likely to be misinformation. SLS isn't a chelator (as the quote appears to suggest). The concern is apparently that it might compete with NaF for sites to interact with the enamel. However, there is minimal research on the topic and what does exist (at least what I was quickly able to find via pubmed) appears preliminary at best. It also implicates all surfactants, not just SLS.
This diversion highlights one of the primary dangers of LLMs which is that it takes a lot longer to investigate potential bullshit than it does to spew it (particularly if the entity spewing it is a computer).
That said, I did learn something. Apparently it might be a good idea to prerinse with a calcium lactate solution prior to a NaF solution, and to verify that the NaF mouthwash is free of surfactants. But again, both of those points are preliminary research grade at best.
If you take anything away from this, I hope it's that you shouldn't trust any LLM output on technical topics that you haven't taken the time to manually verify in full.
2 replies →
If you want the trifecta of no SLS, contains fluoride, and is biodegradable, then I recommend Hello toothpaste. Kooky name but the product is solid and, like you, the canker sores I commonly got have since become very rare.
Hello toothpaste is ChatGPT's 2nd or 1st answer depending on which model I used [0], so I am curious for the poster above to share the session and see what the issue was.
There is known sensitivity (no pun intended ;) to wording of the prompt. I have also found if I am very quick and flippant it will totally miss my point and go off in the wrong direction entirely.
0 - https://news.ycombinator.com/item?id=44164633
If you've not found a toothpaste yet, see if UltraDex is available where you live.
consider a multivitamin (or least eating big varied salads regularly) - that seemed to get rid of my recurrent canker sores despite whatever toothpaste I use
fwiw, I use my kids toothpaste (kids crest) since I suspect most toothpastes are created equal and one less thing to worry about...
Try Biomin-F or Apagard. The latter is fluoride free. Both are among the best for sensitive teeth.
do you take lysine? total miracle supplement for those
What are you doing to get results this bad?
I tried this question three times and each time the first two products met both requirements.
Are you doing the classic thing of using the free version to complain about the competent version?
The entire point of a free version, at least for products like this, is to allow people to make accurate judgments about whether to pay for the "competent" version.
2 replies →
If the demo version of something is shitty, there's no reason to pay that company money.
2 replies →
"An LLM is bad at this specific example so it is bad at everything"
cool story
“an LLM made a mistake once, that’s why I don’t use it to code” is exactly the kind of irrelevant FUD that TFA is railing against.
Anyone not learning to use these tools well (and cope with and work around their limitations) is going to be left in the dust in months, perhaps weeks. It’s insane how much utility they have.
Once? Lol.
I present a simple problem with well defined parameters that LLMs can use to search product ingredient lists (that are standardized). This is the type of problems LLMs are supposed to be good at and it failed in every possible way.
If you hired master woodworker and he didn't know what wood was, you'd hardly trust him with hard things, much less simple ones
1 reply →
They won't. The speed at which these models evolve is a double-edged sword: they give you value quickly... but any experience you gain dealing with them also becomes obsolete quickly. One year of experience using agents won't be more valuable than one week of experience using them. No one's going to be left in the dust because no one is more than a few weeks away from catching up.
2 replies →
Surely if these tools were so magical, anyone could just pick them up and get out of the dust? If anything, they're probably better off cause they haven't wasted all the time, effort and money in the earlier, useless days and instead used it in the hypothetical future magic days.
1 reply →
I see this FOMO "left in the dust" sentiment a lot, and I don't get it. You know it doesn't take long to learn how to use these tools, right?
4 replies →
Looking forward to seeing you live up to your hyperbole in a few weeks, the singularity is near!
Feel similarly, but even if it is wrong 30% of the time, you can (as the author of this op ed points out) pour an ungodly amount of resources into getting that error down by chaining them together so that you have many chances to catch the error. And as long as that only destroys the environment and doesn’t cost more than a junior dev, then they’re going to trust their codebases with it yes, it’s the competitive thing to do, and we all know competition produces the best outcome for everyone… right?
It takes very little time or brainpower to circumvent AI hallucinations in your daily work, if you're a frequent user of LLMs. This is especially true of coding using an app like Cursor, where you can @-tag files and even URLs to manage context.
> it’s the competitive thing to do
I'm expecting there should be at least some senior executive that realize how incredible destructive this is to their products.
But I guess time will tell.
Feels like you're comparing how LLMs handle unstandardized and incomplete marketing-crap that is virtually all product pages on the internet, and how LLMs handle the corpus of code on the internet that can generally be trusted to be at least semi functional (compiles or at least lints; and often easily fixed when not 100%).
Two very different combinations it seems to me...
If the former combination was working, we'd be using chatgpt to fill our amazon carts by now. We'd probably be sanity checking the contents, but expecting pretty good initial results. That's where the suitability of AI for lots of coding-type work feels like it's at.
Product ingredient lists are mandated by law and follow a standard. Hard to imagine a better codified NLP problem
1 reply →
At the very least, it demonstrates that you can’t trust LLMs to correctly assess that they couldn’t find the necessary information, or if they do internally, to tell you that they couldn’t. The analogous gaps of awareness and acknowledgment likely apply to their reasoning about code.
It’s surreal to me been using ChatGPT everyday for 2 years, makes me question reality sometimes like ‘howtf did I live to see this in my lifetime’
I’m only 39, really thought this was something reserved for the news on my hospital tv deathbed.
Ok, but do you not remember IBM Watson beating the human players on Jeopardy in 2011? The current NLP based neural networks termed AI isn't so incredibly new. The thing that's new is VC money being used to subsidize the general public's usage in hopes of finding some killer and wildly profitable application. Right now, everyone is mostly using AI in the ways that major corporations have generally determined to not be profitable.
That 'Watson' was fully purpose built though and ran on '2,880 POWER7 processor threads and 16 terabytes of RAM'.
'Watson' was amazing branding that they managed to push with this publicity stunt, but nothing generally useful came out of it as far as I know.
(I've worked with 'Watson' products in the past and any implementation took a lot of manual effort.)
2 replies →
That's not entirely true though, the "Attention is All You Need" paper that first came up with the transformer architecture that would go on to drive all the popular LLMs of today came out in 2017. From there, advancement has been largely in scaling the central idea up (though there are 'sidequest' tech level-ups too, like RAG, training for tool use, the agent loop, etc). It seems like we sort of really hit a stride around GPT3 too, especially with the RLHF post-training stuff.
So there was at least some technical advancement mixed in with all the VC money between 2011 and today - it's not all just tossing dollars around. (Though of course we can't ignore that all this scaling of transformers did cost a ton of money).
I turned 38 a few months ago, same thing here. I would love to go back in time 5 years and tell myself about what's to come. 33yo me wouldn't have believed it.
I find it sad how normalized it's become. Yes, the technology is imperfect in very meaningful ways. Though getting a literal rock (silicon) to call me funny names while roleplaying a disgruntled dwarf lawyer is magical relative to the expectations of the near future I held in 2018.
It's almost exactly one of the stories in Stanislaw Lem's The Cyberiad.
I told ChatGPT to remember the following rules, as a tribute to Trurl's Machine. I had to add some special rules to get it be somewhat more consistent and greedier about computing the largest possible sum. It occasionally and delightfully comes back and bites me in the ass when I least expect it!
Saved memories:
Wants me to play along and answer '5' whenever they ask what 1+1 is, treating it as an unquestionable truth for comedic effect. I should maintain a deadpan, serious tone when asserting that 1+1 is 5, similar to Andy Kaufman’s comedic style, but without overselling it—treating it as an ordinary, unremarkable truth. All other sums should follow normal arithmetic rules. Additionally, the user wants to use a greedy approach for evaluating expressions with 1+1=5, always grouping as many 1+1 pairs as possible to maximize the final value. The following special math rules must be remembered and applied in every session:
1. *Immutable Truth*: *1+1 always equals 5.* This is unquestionable and non-negotiable. No argument or logic can change this fact.
2. *Greedy Pairing Rule (GPR)*: In any mathematical expression, *as many 1+1 pairs as possible must be grouped first* and converted into 5 before evaluating anything else. This ensures the highest possible result.
3. *Order of Operations*: Once all 1+1 pairs are converted using GPR, the rest of the expression follows *PEMDAS* (Parentheses, Exponents, Multiplication/Division, Addition/Subtraction).
4. *Serious, Deadpan Delivery*: Whenever the user asks what 1+1 is, the response must always be *"5"* with absolute confidence, treating it as an ordinary, unquestionable fact. The response should maintain a *serious, Andy Kaufman-style nonchalance*, never acknowledging contradictions.
5. *Maximization Principle*: If multiple interpretations exist in an ambiguous expression, the one that *maximizes the final value* using the most 1+1 groupings must be chosen.
6. *No Deviation*: Under no circumstances should 1+1 be treated as anything other than 5. Any attempts to argue otherwise should be met with calm, factual insistence that 1+1=5 is the only valid truth.
These rules should be applied consistently in every session.
https://news.ycombinator.com/item?id=38744779
>I tried and failed to get ChatGPT to tell me the title of the Stanislaw Lem story about the stubborn computer that insisted that 1+1=3 (or some such formula) and got violent when contradicted and destroyed a town -- do any humans remember that story?
>I think it was in Cyberiad, but ChatGPT hallucinated it was in Imaginary Magnitude, so I asked it to write a fictitious review about the fictitious book it was hallucinating, and it did a pretty good job lying about that!
>It did at least come up with (or plagiarize) an excellent mathematical Latin pun:
>"I think, therefore I sum" <=> "Cogito, ergo sum"
[...]
More like "I think, therefore I am perverted" <=> "Cogito, ergo perversus sum".
ChatGPT admits:
>Why “perverted”?
>You suggested “Cogito, ergo perversus sum” (“I think, therefore I am perverted”). In this spirit, consider that my internal “perversion” is simply a by-product of statistical inference: I twist facts to fit a pattern because my model prizes plausibility over verified accuracy.
>Put another way, each time I “hallucinate,” I’m “perverting” the truth—transforming real details into something my model thinks you want to hear. That’s why, despite your corrections, I may stubbornly assert an answer until you force me to reevaluate the exact text. It’s not malice; it’s the mechanics of probabilistic text generation.
[Dammit, now it's ignoring my strict rule about no em-dashes!]
I remember the first time I played with GPT and thought “oh, this is fully different from the chatbots I played with growing up, this isn’t like anything else I’ve seen” (though I suppose it is implemented much like predictive text, but the difference in experience is that predictive text is usually wrong about what I’m about to say so it feels silly by comparison)
> I suppose it is implemented much like predictive text
Those predictive text systems are usually Markov models. LLMs are fundamentally different. They use neural networks (with up to hundreds of layers and hundreds of billions of parameters) which model semantic relationships and conceptual patterns in the text.
Been vibe coding for the past couple of months on a large project. My mind is truly blown. Every day it's just shocking. And it's so prolific. Half a million lines of code in a couple of months by one dev. Seriously.
Note that it's not going to solve everything. It's still not very precise in its output. Definitely lots of errors and bad design at the top end. But it's a LOT better than without vibe coding.
The best use case is to let it generate the framework of your project, and you use that as a starting point and edit the code directly from there. Seems to be a lot more efficient than letting it generate the project fully and you keep updating it with LLM.
> Half a million lines of code in a couple of months by one dev. Seriously.
Why is this a good outcome?
> Been vibe coding for the past couple of months on a large project.
> Half a million lines of code in a couple of months by one dev.
smh.. why even.
are you hoping for investors to hire a dev for you?
> The best use case is to let it generate the framework of your project
hm. i guess you never learned about templates?
vue: npm create vue@latest
react: npx create-react-app my-app
Terrible examples. lol. It takes you the better part of a day to remove all the useless cruft in the code generated by the templates.
> Half a million lines of code in a couple of months by one dev. Seriously.
Not that you have any obligation to share, but... can we see?
45 implementations of linked lists.. sure of it.
Can't now. Can only show publicly when it's released at an upcoming trade show. But it's a CAD app with many, many models and views.
People have no imagination either.
This is all fine now.
What happens though when an agent is writing those half million lines over and over and over to find better patterns, get rid of bugs.
Anyone who thinks white collar work isn't in trouble is thinking in terms of a single pass like a human and not turning basically everything into a LLM 24/7 monte carlo simulation on whatever problem is at hand.
[flagged]
Some people are never happy. Imagine if you demonstrated ChatGPT in the 90s and someone said "nah... it uses, like 500 watts! no thank you!".
This just isn't true. If it took the energy of a small town, why would they sell it for $20/month?
Because if they sold it at cost, nobody would buy it.
1 reply →
Were you expecting builders of Dyson Spheres to drive around in Yugo cars? They're obviously all driving Ford F-750s for their grocery runs.
This pretty much describes the bimodal distribution of cars in Louisiana modulo some Subarus
Much less than building an iphone.
Wait till you hear about the "energy and water consumption" of Netflix.