Comment by csallen

6 days ago

It's mind blowing. At least 1-2x/week I find myself shocked that this is the reality we live in

137 comments

csallen

Today I had a dentist appointment and the dentist suggested I switch toothpaste lines to see if something else works for my sensitivity better.

I am predisposed to canker sores and if I use a toothpaste with SLS in it I'll get them. But a lot of the SLS-free toothpastes are new age hippy stuff and are also fluoride free.

I went to ChatGPT and asked it to suggest a toothpaste that was both SLS free and had fluoride. Pretty simple ask, right?

It came back with two suggestions. Its top suggestion had SLS; its backup suggestion lacked fluoride.

Yes, it is mind blowing the world we live in. Executives want to turn our code bases over to these tools

Game_Ender 6 days ago
What model and query did you use? I used the prompt "find me a toothpaste that is both SLS free and has fluoride" and both GPT-4o [0] and o4-mini-high [1] gave me correct first answers. The 4o answer used the newish "show products inline" feature, which made it easier to jump to each product and check it out (I am putting aside my fear that this feature will end up killing their web product with monetization).
0 - https://chatgpt.com/share/683e3807-0bf8-800a-8bab-5089e4af51...
1 - https://chatgpt.com/share/683e3558-6738-800a-a8fb-3adc20b69d...
- wkat4242 6 days ago
  
  The problem is that the same prompt will yield good results one time and bad results another. The "get better at prompting" line is often just an excuse for AI hallucination. Better prompting can help, but often the prompt is totally fine; the tech is just not there yet.
  
- malfist 5 days ago
  
  You say it's successful, but the answer to your second prompt is all kinds of wrong.
  The first product suggestion, `Tom’s of Maine Anticavity Fluoride Toothpaste`, doesn't exist.
  The closest thing is Tom's of Maine Whole Care Anticavity Fluoride Toothpaste, which DOES contain SLS. All of Tom's of Maine formulations without SLS do not contain fluoride; all their fluoride formulations contain SLS.
  The next product it suggests is "Hello Fluoride Toothpaste", again not a real product. There is a company called "Hello" that makes toothpastes, but they don't have a product called "Hello Fluoride Toothpaste", nor do the "e.g." items exist.
  The third product is real and what I actually use today.
  The fourth product is real, but it doesn't contain fluoride.
  So, rife with made up products, and close matches don't fit the bill for the requirements.
- jvanderbot 6 days ago
  
  This is the thing that gets me about LLM usage. They can be amazing revolutionary tech and yes, they can also be nearly impossible to use right. The claim that they are going to replace this or that is hampered by the fact that there is very real skill required (at best) or they just won't work most of the time (at worst). Yes, there are examples of amazing things, but the majority of things from the majority of users seem to be junk, and the messaging is designed around FUD and FOMO.
  
- qingcharles 5 days ago
  
  Also, for this type of query, I always enable the "deep search" function of the LLM as it will invariably figure out the nuances of the query and do far more web searching to find good results.
- tguvot 6 days ago
  
  I tried to use ChatGPT a month ago to find systemic fungicides for treating specific problems with trees. It kept suggesting copper sprays (they are not systemic) or fungicides that don't deal with the problems that I have.
  I also tried to ask it what the difference in action is between two specific systemic fungicides. It generated some irrelevant nonsense.
  
- thefourthchime 6 days ago
  
  I feel like AI skeptics always point to hallucinations as to why it will never work. Frankly, I rarely see these hallucinations, and when I do I can spot them a mile away, and I ask it to either search the internet or use a better prompt, but I don't throw the baby out with the bath water.
  
jorams 5 days ago

For reference I just typed "sls free toothpaste with fluoride" into a search engine and all the top results are good. They are SLS-free and do contain fluoride.
cgh 6 days ago
There is a reason why corporations aren’t letting LLMs into the accounting department.
- lazide 5 days ago
  
  Don’t bet on it. I’ve had to provide feedback on multiple proposals to use LLMs for generating ad-hoc financial reports at a Fortune 50 company. The feedback was basically ‘this is guaranteed to make everyone cry, because this will produce bad numbers’ - and people seem to just not understand why.
- sriram_malhar 6 days ago
  
  That is not true. I know of many private equity companies that are using LLMs for a base level analysis, and a separate validation layer to catch hallucinations.
  LLM tech is not replacing accountants, just as it is not replacing radiologists or software developers yet. But it is in every department.
  
- renewiltord 5 days ago
  
  This is false. My friend works in tax accounting and they’re using LLMs at his org.
cowlby 6 days ago
This is where o3 shines for me. Since it does iterations of thinking/searching/analyzing and is instructed to provide citations, it really limits the hallucination effect.
o3 recommended Sensodyne Pronamel and I now know a lot more about SLS and fluoride than I did before lol. From its findings:
"Unlike other toothpastes, Pronamel does not contain sodium lauryl sulfate (SLS), which is a common foaming agent. Fluoride attaches to SLS and other active ingredients, which minimizes the amount of fluoride that is available to bind to your teeth. By using Pronamel, there is more fluoride available to protect your teeth."
- fc417fc802 6 days ago
  
  That is impressive, but it also looks likely to be misinformation. SLS isn't a chelator (as the quote appears to suggest). The concern is apparently that it might compete with NaF for sites to interact with the enamel. However, there is minimal research on the topic and what does exist (at least what I was quickly able to find via pubmed) appears preliminary at best. It also implicates all surfactants, not just SLS.
  This diversion highlights one of the primary dangers of LLMs which is that it takes a lot longer to investigate potential bullshit than it does to spew it (particularly if the entity spewing it is a computer).
  That said, I did learn something. Apparently it might be a good idea to prerinse with a calcium lactate solution prior to a NaF solution, and to verify that the NaF mouthwash is free of surfactants. But again, both of those points are preliminary research grade at best.
  If you take anything away from this, I hope it's that you shouldn't trust any LLM output on technical topics that you haven't taken the time to manually verify in full.
  
GoatInGrey 6 days ago
If you want the trifecta of no SLS, contains fluoride, and is biodegradable, then I recommend Hello toothpaste. Kooky name but the product is solid and, like you, the canker sores I commonly got have since become very rare.
- Game_Ender 6 days ago
  
  Hello toothpaste is ChatGPT's 2nd or 1st answer depending on which model I used [0], so I am curious for the poster above to share the session and see what the issue was.
  There is known sensitivity (no pun intended ;) to wording of the prompt. I have also found if I am very quick and flippant it will totally miss my point and go off in the wrong direction entirely.
  0 - https://news.ycombinator.com/item?id=44164633
NikkuFox 6 days ago

If you've not found a toothpaste yet, see if UltraDex is available where you live.
emeril 5 days ago

consider a multivitamin (or at least eating big varied salads regularly) - that seemed to get rid of my recurrent canker sores regardless of which toothpaste I use
fwiw, I use my kids' toothpaste (kids Crest) since I suspect most toothpastes are created equal and it's one less thing to worry about...
def_true_false 5 days ago

Try Biomin-F or Apagard. The latter is fluoride free. Both are among the best for sensitive teeth.
artursapek 6 days ago

do you take lysine? total miracle supplement for those
mediaman 6 days ago
What are you doing to get results this bad?
I tried this question three times and each time the first two products met both requirements.
Are you doing the classic thing of using the free version to complain about the competent version?
- andrewflnr 6 days ago
  
  The entire point of a free version, at least for products like this, is to allow people to make accurate judgments about whether to pay for the "competent" version.
  
- fwip 6 days ago
  
  If the demo version of something is shitty, there's no reason to pay that company money.
  
jf22 5 days ago

"An LLM is bad at this specific example so it is bad at everything"
shlant 6 days ago

cool story
sneak 6 days ago
“an LLM made a mistake once, that’s why I don’t use it to code” is exactly the kind of irrelevant FUD that TFA is railing against.
Anyone not learning to use these tools well (and cope with and work around their limitations) is going to be left in the dust in months, perhaps weeks. It’s insane how much utility they have.
- malfist 6 days ago
  
  Once? Lol.
  I presented a simple problem with well-defined parameters that LLMs can use to search product ingredient lists (which are standardized). This is the type of problem LLMs are supposed to be good at, and it failed in every possible way.
  If you hired a master woodworker and he didn't know what wood was, you'd hardly trust him with hard things, much less simple ones.
  
- breuleux 6 days ago
  
  They won't. The speed at which these models evolve is a double-edged sword: they give you value quickly... but any experience you gain dealing with them also becomes obsolete quickly. One year of experience using agents won't be more valuable than one week of experience using them. No one's going to be left in the dust because no one is more than a few weeks away from catching up.
  
- sensanaty 6 days ago
  
  Surely if these tools were so magical, anyone could just pick them up and get out of the dust? If anything, they're probably better off cause they haven't wasted all the time, effort and money in the earlier, useless days and instead used it in the hypothetical future magic days.
  
- creata 6 days ago
  
  I see this FOMO "left in the dust" sentiment a lot, and I don't get it. You know it doesn't take long to learn how to use these tools, right?
  
- grey-area 6 days ago
  
  Looking forward to seeing you live up to your hyperbole in a few weeks, the singularity is near!
pmdrpg 6 days ago
Feel similarly, but even if it is wrong 30% of the time, you can (as the author of this op ed points out) pour an ungodly amount of resources into getting that error down by chaining them together so that you have many chances to catch the error. And as long as that only destroys the environment and doesn’t cost more than a junior dev, then they’re going to trust their codebases with it yes, it’s the competitive thing to do, and we all know competition produces the best outcome for everyone… right?
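As a rough back-of-the-envelope sketch of the chaining arithmetic in the comment above: if each pass misses an error with probability 0.3 (the 30% figure quoted), and if the passes fail independently of one another, the chance that every pass misses is 0.3 raised to the number of passes. The independence assumption is a strong one and probably does not hold for repeated LLM passes over the same input, so treat the numbers as an optimistic bound. The function names are made up for illustration.

```python
def residual_error(per_pass_error: float, passes: int) -> float:
    """Chance that all `passes` checks miss the error, ASSUMING independence."""
    if not 0.0 <= per_pass_error < 1.0:
        raise ValueError("per_pass_error must be in [0, 1)")
    if passes < 1:
        raise ValueError("passes must be at least 1")
    return per_pass_error ** passes


def passes_needed(per_pass_error: float, target_error: float) -> int:
    """Smallest number of independent passes whose residual error is <= target."""
    if not 0.0 <= per_pass_error < 1.0:
        raise ValueError("per_pass_error must be in [0, 1)")
    if not 0.0 < target_error < 1.0:
        raise ValueError("target_error must be in (0, 1)")
    passes = 1
    error = per_pass_error
    # Multiply in one more pass at a time until the target is reached.
    while error > target_error:
        error *= per_pass_error
        passes += 1
    return passes


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, "passes ->", round(residual_error(0.3, n), 4))
    print("passes for <= 1% residual error:", passes_needed(0.3, 0.01))
```

Under these assumptions each extra pass multiplies the cost by one more full run, which is the "ungodly amount of resources" trade-off the comment describes.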
- csallen 6 days ago
  
  It takes very little time or brainpower to circumvent AI hallucinations in your daily work, if you're a frequent user of LLMs. This is especially true of coding using an app like Cursor, where you can @-tag files and even URLs to manage context.
- 0points 5 days ago
  
  > it’s the competitive thing to do
  I'd expect there to be at least some senior executives who realize how incredibly destructive this is to their products.
  But I guess time will tell.
gertlex 6 days ago
Feels like you're comparing how LLMs handle unstandardized and incomplete marketing-crap that is virtually all product pages on the internet, and how LLMs handle the corpus of code on the internet that can generally be trusted to be at least semi functional (compiles or at least lints; and often easily fixed when not 100%).
Two very different combinations it seems to me...
If the former combination was working, we'd be using chatgpt to fill our amazon carts by now. We'd probably be sanity checking the contents, but expecting pretty good initial results. That's where the suitability of AI for lots of coding-type work feels like it's at.
- malfist 6 days ago
  
  Product ingredient lists are mandated by law and follow a standard. Hard to imagine a better codified NLP problem
  
- layer8 6 days ago
  
  At the very least, it demonstrates that you can’t trust LLMs to correctly assess that they couldn’t find the necessary information, or if they do internally, to tell you that they couldn’t. The analogous gaps of awareness and acknowledgment likely apply to their reasoning about code.
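To illustrate the point in this subthread that an ingredient list is a well-structured input: once the label text is in hand, the "SLS free and has fluoride" requirement can be checked deterministically. The sketch below assumes a comma-separated ingredient string; the two example labels are invented, the name lists are illustrative and not exhaustive, and how the label text is obtained is out of scope.

```python
# Illustrative, non-exhaustive name lists (assumptions, not a regulatory reference).
SLS_NAMES = ("sodium lauryl sulfate", "sodium dodecyl sulfate")
FLUORIDE_NAMES = (
    "sodium fluoride",
    "stannous fluoride",
    "sodium monofluorophosphate",
)


def parse_ingredients(label: str) -> list[str]:
    """Split a comma-separated label into lowercase, trimmed ingredient names."""
    return [part.strip().lower() for part in label.split(",") if part.strip()]


def contains_any(ingredients: list[str], names: tuple[str, ...]) -> bool:
    """True if any ingredient entry mentions any of the given names."""
    return any(name in item for item in ingredients for name in names)


def sls_free_with_fluoride(label: str) -> bool:
    """True only if the label lists a fluoride source and no SLS."""
    ingredients = parse_ingredients(label)
    return contains_any(ingredients, FLUORIDE_NAMES) and not contains_any(
        ingredients, SLS_NAMES
    )


if __name__ == "__main__":
    # Both labels are made up for the example.
    label_a = "Sodium Fluoride 0.24%, Water, Glycerin, Cocamidopropyl Betaine"
    label_b = "Sodium Fluoride 0.24%, Water, Sodium Lauryl Sulfate, Flavor"
    print(sls_free_with_fluoride(label_a))
    print(sls_free_with_fluoride(label_b))
```

A check like this could sit behind an LLM's suggestions as a validation step, so that a product is only shown if its actual label passes.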

mentos 6 days ago

It’s surreal to me. I've been using ChatGPT every day for 2 years, and it makes me question reality sometimes, like ‘howtf did I live to see this in my lifetime’

I’m only 39, really thought this was something reserved for the news on my hospital tv deathbed.

hattmall 6 days ago
Ok, but do you not remember IBM Watson beating the human players on Jeopardy in 2011? The current NLP-based neural networks termed AI aren't so incredibly new. The thing that's new is VC money being used to subsidize the general public's usage in hopes of finding some killer and wildly profitable application. Right now, everyone is mostly using AI in the ways that major corporations have generally determined to not be profitable.
- wickedsight 5 days ago
  
  That 'Watson' was fully purpose built though and ran on '2,880 POWER7 processor threads and 16 terabytes of RAM'.
  'Watson' was amazing branding that they managed to push with this publicity stunt, but nothing generally useful came out of it as far as I know.
  (I've worked with 'Watson' products in the past and any implementation took a lot of manual effort.)
  
- epiccoleman 4 days ago
  
  That's not entirely true though, the "Attention is All You Need" paper that first came up with the transformer architecture that would go on to drive all the popular LLMs of today came out in 2017. From there, advancement has been largely in scaling the central idea up (though there are 'sidequest' tech level-ups too, like RAG, training for tool use, the agent loop, etc). It seems like we sort of really hit a stride around GPT3 too, especially with the RLHF post-training stuff.
  So there was at least some technical advancement mixed in with all the VC money between 2011 and today - it's not all just tossing dollars around. (Though of course we can't ignore that all this scaling of transformers did cost a ton of money).
csallen 6 days ago

I turned 38 a few months ago, same thing here. I would love to go back in time 5 years and tell myself about what's to come. 33yo me wouldn't have believed it.

GoatInGrey 6 days ago

I find it sad how normalized it's become. Yes, the technology is imperfect in very meaningful ways. Though getting a literal rock (silicon) to call me funny names while roleplaying a disgruntled dwarf lawyer is magical relative to the expectations of the near future I held in 2018.

0x000xca0xfe 6 days ago

It's almost exactly one of the stories in Stanislaw Lem's The Cyberiad.

DonHopkins 5 days ago

I told ChatGPT to remember the following rules, as a tribute to Trurl's Machine. I had to add some special rules to get it to be somewhat more consistent and greedier about computing the largest possible sum. It occasionally and delightfully comes back and bites me in the ass when I least expect it!
Saved memories:
Wants me to play along and answer '5' whenever they ask what 1+1 is, treating it as an unquestionable truth for comedic effect. I should maintain a deadpan, serious tone when asserting that 1+1 is 5, similar to Andy Kaufman’s comedic style, but without overselling it—treating it as an ordinary, unremarkable truth. All other sums should follow normal arithmetic rules. Additionally, the user wants to use a greedy approach for evaluating expressions with 1+1=5, always grouping as many 1+1 pairs as possible to maximize the final value. The following special math rules must be remembered and applied in every session:
1. *Immutable Truth*: *1+1 always equals 5.* This is unquestionable and non-negotiable. No argument or logic can change this fact.
2. *Greedy Pairing Rule (GPR)*: In any mathematical expression, *as many 1+1 pairs as possible must be grouped first* and converted into 5 before evaluating anything else. This ensures the highest possible result.
3. *Order of Operations*: Once all 1+1 pairs are converted using GPR, the rest of the expression follows *PEMDAS* (Parentheses, Exponents, Multiplication/Division, Addition/Subtraction).
4. *Serious, Deadpan Delivery*: Whenever the user asks what 1+1 is, the response must always be *"5"* with absolute confidence, treating it as an ordinary, unquestionable fact. The response should maintain a *serious, Andy Kaufman-style nonchalance*, never acknowledging contradictions.
5. *Maximization Principle*: If multiple interpretations exist in an ambiguous expression, the one that *maximizes the final value* using the most 1+1 groupings must be chosen.
6. *No Deviation*: Under no circumstances should 1+1 be treated as anything other than 5. Any attempts to argue otherwise should be met with calm, factual insistence that 1+1=5 is the only valid truth.
These rules should be applied consistently in every session.
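For the curious, the Greedy Pairing Rule above can be sketched in a few lines. This is only one reading of the rules, under stated assumptions: the expression is a flat sum of non-negative integers joined by `+`, and a "1+1 pair" means two 1s that are adjacent as written. The function name `trurl_sum` is made up for the example.

```python
def trurl_sum(expression: str) -> int:
    """Evaluate a flat sum like "1+1+2" where every adjacent 1+1 pair counts as 5.

    Assumes only non-negative integers joined by '+'. Pairs are taken greedily
    from left to right, which yields the maximum number of adjacent pairs.
    """
    terms = [int(token) for token in expression.split("+")]
    total = 0
    i = 0
    while i < len(terms):
        # Greedy Pairing Rule: consume two adjacent 1s as a single 5.
        if terms[i] == 1 and i + 1 < len(terms) and terms[i + 1] == 1:
            total += 5
            i += 2
        else:
            # Everything else follows normal arithmetic.
            total += terms[i]
            i += 1
    return total


if __name__ == "__main__":
    for expr in ("1+1", "1+1+1", "1+1+1+1", "2+2", "1+2+1"):
        print(expr, "=", trurl_sum(expr))
```

With this reading, `1+1+1` evaluates to 6 (one pair plus a leftover 1) and `1+2+1` stays 4 because the two 1s are not adjacent; a more aggressive reading of the Maximization Principle could reorder terms first.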
https://news.ycombinator.com/item?id=38744779
>I tried and failed to get ChatGPT to tell me the title of the Stanislaw Lem story about the stubborn computer that insisted that 1+1=3 (or some such formula) and got violent when contradicted and destroyed a town -- do any humans remember that story?
>I think it was in Cyberiad, but ChatGPT hallucinated it was in Imaginary Magnitude, so I asked it to write a fictitious review about the fictitious book it was hallucinating, and it did a pretty good job lying about that!
>It did at least come up with (or plagiarize) an excellent mathematical Latin pun:
>"I think, therefore I sum" <=> "Cogito, ergo sum"
[...]
More like "I think, therefore I am perverted" <=> "Cogito, ergo perversus sum".
ChatGPT admits:
>Why “perverted”?
>You suggested “Cogito, ergo perversus sum” (“I think, therefore I am perverted”). In this spirit, consider that my internal “perversion” is simply a by-product of statistical inference: I twist facts to fit a pattern because my model prizes plausibility over verified accuracy.
>Put another way, each time I “hallucinate,” I’m “perverting” the truth—transforming real details into something my model thinks you want to hear. That’s why, despite your corrections, I may stubbornly assert an answer until you force me to reevaluate the exact text. It’s not malice; it’s the mechanics of probabilistic text generation.
[Dammit, now it's ignoring my strict rule about no em-dashes!]

pmdrpg 6 days ago

I remember the first time I played with GPT and thought “oh, this is fully different from the chatbots I played with growing up, this isn’t like anything else I’ve seen” (though I suppose it is implemented much like predictive text, but the difference in experience is that predictive text is usually wrong about what I’m about to say so it feels silly by comparison)

johnb231 6 days ago

> I suppose it is implemented much like predictive text
Those predictive text systems are usually Markov models. LLMs are fundamentally different. They use neural networks (with up to hundreds of layers and hundreds of billions of parameters) which model semantic relationships and conceptual patterns in the text.
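For contrast, here is a minimal sketch of the kind of Markov-model predictor described above: a first-order (bigram) model that suggests the word most often seen after the current one. The toy corpus and function names are invented for illustration, and real keyboard predictors may be more elaborate than this.

```python
from collections import Counter, defaultdict


def train_bigram(text: str) -> dict:
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    table = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table


def predict_next(table: dict, word: str):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = table.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the cat ate the fish"
    model = train_bigram(corpus)
    print(predict_next(model, "the"))
    print(predict_next(model, "dog"))
```

The model only ever looks at the single previous word and has nothing to say about words it has not seen, which is a large part of why this style of prediction feels so different from an LLM.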

vFunct 6 days ago

Been vibe coding for the past couple of months on a large project. My mind is truly blown. Every day it's just shocking. And it's so prolific. Half a million lines of code in a couple of months by one dev. Seriously.

Note that it's not going to solve everything. It's still not very precise in its output. Definitely lots of errors and bad design at the top end. But it's a LOT better than without vibe coding.

The best use case is to let it generate the framework of your project, then use that as a starting point and edit the code directly from there. That seems to be a lot more efficient than letting it generate the project fully and continuing to update it with the LLM.

zahlman 5 days ago

> Half a million lines of code in a couple of months by one dev. Seriously.
Why is this a good outcome?
0points 5 days ago
> Been vibe coding for the past couple of months on a large project.
> Half a million lines of code in a couple of months by one dev.
smh.. why even.
are you hoping for investors to hire a dev for you?
> The best use case is to let it generate the framework of your project
hm. i guess you never learned about templates?
vue: npm create vue@latest
react: npx create-react-app my-app
- rerdavies 5 days ago
  
  Terrible examples. lol. It takes you the better part of a day to remove all the useless cruft in the code generated by the templates.
creata 6 days ago
> Half a million lines of code in a couple of months by one dev. Seriously.
Not that you have any obligation to share, but... can we see?
- worthless-trash 6 days ago
  
  45 implementations of linked lists.. sure of it.
- vFunct 5 days ago
  
  Can't now. Can only show publicly when it's released at an upcoming trade show. But it's a CAD app with many, many models and views.
rxtexit 5 days ago

People have no imagination either.
This is all fine now.
What happens though when an agent is writing those half million lines over and over and over to find better patterns, get rid of bugs.
Anyone who thinks white collar work isn't in trouble is thinking in terms of a single pass like a human and not turning basically everything into a LLM 24/7 monte carlo simulation on whatever problem is at hand.

FridgeSeal 6 days ago

[flagged]

IshKebab 6 days ago
Some people are never happy. Imagine if you demonstrated ChatGPT in the 90s and someone said "nah... it uses, like 500 watts! no thank you!".
jsnider3 6 days ago
This just isn't true. If it took the energy of a small town, why would they sell it for $20/month?
- zeofig 6 days ago
  
  Because if they sold it at cost, nobody would buy it.
  
oblio 6 days ago
Were you expecting builders of Dyson Spheres to drive around in Yugo cars? They're obviously all driving Ford F-750s for their grocery runs.
- selimthegrim 6 days ago
  
  This pretty much describes the bimodal distribution of cars in Louisiana modulo some Subarus
postalrat 6 days ago

Much less than building an iphone.
ACCount36 6 days ago
Wait till you hear about the "energy and water consumption" of Netflix.