Comment by mywittyname

1 day ago

> As much as I hate to admit it, step one in most of my projects now is to ask AI about it. Maybe it’ll tell me something I don’t know.

Or, more likely, it will tell you something it doesn't know.

Reminds me of yesterday, when I was arguing with ChatGPT that the 5070 Ti was an actual video card. It kept trying to correct me by saying I must have meant a 4070 Ti, since no such 5070 Ti card exists.

collabs 1 day ago

Or, it will acknowledge that it made a mistake and continue to make the same mistake again.

I asked Claude to generate an HTML page about PowerShell 7. It gave me a page saying 7.4 was the latest LTS release. I corrected it with links showing 7.6 was released in March and asked it to regenerate with the latest information.

It generated basically the same page with the same claim that 7.4 was the latest release.

ericmay 1 day ago
> Or, it will acknowledge that it made a mistake and continue to make the same mistake again.
People do this too though. At least the AI generally tries to follow instructions that you give it even when you are lacking clarity in the details.
I feel like it's similar to the self-driving car problem. The car could have 99.9999% reliability, drive much better and safer than a human, yet folks will still freak out about a single mistake that's made even though you have actual humans today driving the wrong way down the highway, crashing into buildings, drunk driving, stealing cars, and all sorts of other just absolutely stupid things.
We need to move away from this idea that because it's an AI system it should give you perfect responses. It's not a deterministic system and it can be wrong, though it should get better over time. Your Google search results are wrong all the time too. The NYT writes things that are factually incorrect. Why do we have such a high standard for these models when we don't apply it elsewhere?
- bryceacc 1 day ago
  
  >I corrected it with links
  It should be reasonably expected that you can give a source and fix an error in the AI output.
  I would even go as far as to say if a human directly told the AI "no, use 7.6 as the latest version", the AI should absolutely follow direct instructions no matter what it thinks is true. What if this human was working on a slide about the upcoming release of 7.6 that has no public documentation?
- applfanboysbgon 21 hours ago
  
  > Your Google search results are wrong all the time too. The NYT writes things that are factually incorrect.
  This is also very bad and people complain about these things all the fucking time.
  > Why do we have such a high standard for these models
  Because Altman and Amodei are defrauding investors out of hundreds of billions of dollars on the promise that they will replace the entire workforce. Of course people are going to point out the emperor has no clothes when half of our society is engaged in mass hysteria worshipping these fucking things as the next industrial revolution, diverting massive amounts of resources to them, and ruining HN with 10 articles on the front page per day about how software engineering is dead.
- xp84 17 hours ago
  
  I see a lot of angry responses in your replies, but I do think you have a good point. It seems like those arguing with you are mostly vigorously opposing a strawman: the idea that AI is perfect and that trusting AI to be perfect is the right move. Only crazy people think that, though.
  For me, I ask AI questions about taxes and my health all the time. In the case of taxes, getting a basic handle on the relevant tax law is made 1000 times easier. I can always refer directly to the IRS publications to verify, once I know what I’m looking for.
  For health, frankly, it would be impractical for me to ever get as much useful information from doctors as what I can easily get from AI. Four years ago, I would have a bunch of health questions and simply never know the answers to any of them because I would have nobody to ask. Now I get them all answered, and if it suggested I actually do anything that sounded even slightly risky, I'd go to the doctor, armed with much more context than I had before, to verify it.
- reaperducer 21 hours ago
  
  > The NYT writes things that are factually incorrect. Why do we have such a high standard for these models when we don't apply it elsewhere?
  The New York Times publishes a "corrections" section in each issue. Let me know where I can view the 60TB file where ChatGPT fesses up to its daily fails.
  
dakolli 19 hours ago

But people want to do their taxes with these things lmfao

corry 1 day ago

LLMs are (broadly speaking) poorly positioned to give you a strong verdict on the plausibility of a frontier topic. That said, ChatGPT was exactly right in its response to OP!

"Very deep", "border-line impractical", "in a research-sense" is the perfect summary of this article itself! :)

funimpoded 1 day ago

Watching the entire economy of a superpower and ~all of online culture go absolutely ga-ga over Furbys has been one of the weirdest things I've ever witnessed.

dakolli 19 hours ago

Watching the entire economy of a superpower bet its entire future on SOTA text autocomplete models has been interesting (which I think is what you're referring to).
Previous empires naively bet their entire future on the words of magicians, or people who claimed they could look into water, the sky and fire and tell you what the future is going to be.
Machine learning engineers are the modern-day empire's court magicians.

Apocryphon 21 hours ago

Eh, in this use case it's more like a goofy search engine.

perarneng 1 day ago

This is why I use Grok expert mode. It aggressively goes out searching the web for info. It's so much better than relying on year-old data.

_blk 1 day ago
Yes, I really like that about Grok. It had a few good qualities but it was too verbose so now it's mostly Claude.
- JumpCrisscross 1 day ago
  
  Solid compromise is Kagi's research assistant. Aggressively cites, unlike Claude. Concise, unlike Grok.

Tsiklon 20 hours ago

I argued with GPT-OSS 120B about Cascade Lake Xeon workstation CPU parts not having a GPU when it vehemently said otherwise.

amluto 1 day ago

At least ChatGPT is now aware that Codex exists. I have a chat, still in my history, from a few months ago, in which I asked for help wrangling npm to get @openai/codex working, and ChatGPT said:

> Important: Codex CLI no longer exists

> OpenAI discontinued the Codex model + CLI a while back. There is no official binary named codex in any current OpenAI npm packages. OpenAI’s current CLI tool is:

    npm install -g openai

> which installs the openai command, not codex.

The world knowledge of these models is not necessarily up to date :)

edit: I replayed the same prompt into current ChatGPT and it is less clueless now. Maybe OpenAI noticed that it was utterly dumb that GPT-5.whatever didn't believe that Codex existed and fine-tuned it.

sigmoid10 1 day ago
>The world knowledge of these models is not necessarily up to date :)
It's amazing how this still needs to be said. Codex was released in April 2025. The initial GPT-5 and 5.1 still had a knowledge cutoff in late 2024. Like, what did you expect? Always beware the knowledge cutoff for LLMs (although recent releases have gotten much better with researching the web for updates before answering modern software topics).
- z2 21 hours ago
  
  OpenAI being more aware of the implications would help too--last year I also struggled with using Codex to write scripts to run Codex headless, because it kept insisting that Codex was a retired model from the GPT-3 days and not a program that could be called by a script.

simonh 1 day ago

Its training data only goes up to late 2024 or early 2025, so that might be why, though it does have access to the internet.

mywittyname 1 day ago
Yeah, the solution was to link it to the Nvidia page of the card, then it was like, 'oh, okay.' But at that point, I lost faith in its ability to provide me with the information I was looking for. If its information is so out of date that it doesn't know about the 5000 series, how could I be confident that it knew the details I was asking about (game engine related research)?
- asats 1 day ago
  
  Are you using the instant model?
  
weird-eye-issue 1 day ago

Depending on your ChatGPT settings...