Comment by Legend2440
6 months ago
I'm not convinced this is going to be as big of a deal as people think.
Long-run you want AI to learn from actual experience (think repairing cars instead of reading car repair manuals), which both (1) gives you an unlimited supply of non-copyrighted training data and (2) handily sidesteps the issue of AI-contaminated training data.
The hallucinations get quoted and then sourced as truth unfortunately.
A simple example: "Which MS-DOS productivity program had Connect Four built in?"
I have an MS-DOS emulator and know the answer. It's a little obscure, but it's amazing how I get a different answer from all the AIs every time. I never saw any of them give the correct answer. Try asking it the above, then ask it if it's sure about that (it'll change its mind!).
Now remember that these types of answers may well end up quoted online and then learned by AI, with that circular reference as the source. We have no truth at that point.
And seriously try the above question. It's a great example of AI repeatedly stating an authoritative answer that's completely made up.
When I asked, "Good afternoon! I'm trying to settle a bet with a friend (no money on the line, just a friendly "bet"!) Which MS DOS productivity program had a playable version of the game Connect Four built in as an easter egg?", it went into a very detailed explanation of how to get to the "Hall of Tortured Souls" easter egg in Excel 5.0, glossing over the fact that I said "MS DOS" and also conflating the easter eggs by telling me specifically that the "excelkfa" cheat code would open a secret door/bridge to the connect four game.
So, I retried with, "Good afternoon! I'm trying to settle a bet with a friend (no money on the line, just a friendly "bet"!) Which *MS DOS* [not Win95, i.e., Excel 5] productivity program had a playable version of the game Connect Four built in as an easter egg?". I got Lotus 1-2-3 once, Excel 4 twice, and Borland Quattro Pro three different times, all from that prompt.
The correct answer you point out in another subthread was never returned as a possibility, and the responses all definitely came across as confident. Definitely a fascinating example.
Claude 4 Sonnet gave the (reasonable given the obscurity, but wrong) answer that there was no such easter egg:
>I'm not aware of any MS-DOS productivity program that had Connect Four as a built-in easter egg. While MS-DOS era software was famous for including various easter eggs (like the flight simulator in Excel 97, though that was Windows-era), I can't recall Connect Four specifically being hidden in any major DOS productivity applications.
>The most well-known DOS productivity suites were things like Lotus 1-2-3, WordPerfect, dBase, and later Microsoft Office for DOS, but I don't have reliable information about Connect Four being embedded in any of these.
>It's possible this is a case of misremembered details - perhaps your friend is thinking of a different game, a different era of software, or mixing up some details. Or there might be an obscure productivity program I'm not familiar with that did include this easter egg.
>Would you like me to search for more information about DOS-era software easter eggs to see if we can track down what your friend might be thinking of?
That seems like a pretty reasonable response given the details, and included the appropriate caveat that the model was not aware of any such easter egg, and didn't confidently state that there was none.
>It's possible this is a case of misremembered details - perhaps your friend is thinking of a different game, a different era of software, or mixing up some details. Or there might be an obscure productivity program I'm not familiar with that did include this easter egg.
I am not a fan of this kind of communication. It doesn't know, so it tries to deflect the shortcoming onto the user.
I'm not saying that isn't a valid concern, but it can be used as an easy out for its gaps in knowledge.
7 replies →
Gemini 2.5 Flash gave me a similar answer, although it was a bit more confident in its incorrect answer:
> You're asking about an MS-DOS productivity program that had ConnectFour built-in. I need to tell you that no mainstream or well-known MS-DOS productivity program (like a word processor, spreadsheet, database, or integrated suite) ever had the game ConnectFour built directly into it.
> didn't confidently state that there was none
And better. Didn’t confidently state something wrong.
Whenever I ask these AIs "Is the malloc function in the Microsoft UCRT just a wrapper around HeapAlloc?", I get answers that are always wrong.
They claim things like the function adds size tracking so free doesn't need to be called with a size, or they say that HeapAlloc is used to grab a whole chunk of memory at once and then malloc does its own memory management on top of that.
That's easy to prove wrong by popping ucrtbase.dll into Binary Ninja. The only extra things it does beyond passing the requested size off to HeapAlloc are: handle setting errno, change any request for 0 bytes to requests for 1 byte, and perform retries for the case that it is being used from C++ and the program has installed a new-handler for out-of-memory situations.
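In other words, the whole function is basically a thin loop. Here's a rough sketch of what that disassembly amounts to (a portable approximation, not the real ucrtbase internals: the `stub_` names are illustrative stand-ins, and on Windows the actual call would be `HeapAlloc(GetProcessHeap(), 0, size)` plus the CRT's real new-handler machinery):

```c
#include <stddef.h>
#include <stdlib.h>
#include <errno.h>

/* Portable stand-ins so the sketch compiles anywhere; these names are
 * illustrative, not the real internal symbols in ucrtbase.dll. */
static void *stub_HeapAlloc(size_t size) { return malloc(size); }
static int stub_try_new_handler(size_t size) { (void)size; return 0; }

void *sketch_malloc(size_t size)
{
    if (size == 0)
        size = 1;              /* zero-byte requests are bumped to one byte */

    for (;;) {
        void *p = stub_HeapAlloc(size);
        if (p != NULL)
            return p;

        /* On failure, let an installed C++ new-handler try to release
         * memory and retry; otherwise set errno and give up. */
        if (!stub_try_new_handler(size)) {
            errno = ENOMEM;
            return NULL;
        }
    }
}
```

That's it: no size tracking, no secondary allocator on top of the heap.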
ChatGPT 4o waffles a little bit and suggests the Microsoft Entertainment pack (which is not productivity software or MS-DOS), but says at the end:
>If you're strictly talking about MS-DOS-only productivity software, there’s no widely known MS-DOS productivity app that officially had a built-in Connect Four game. Most MS-DOS apps were quite lean and focused, and games were generally separate.
I suspect this is the correct answer, because I can't find any MS-DOS Connect Four easter eggs by googling. I might be missing something obscure, but generally if I can't find it by Googling I wouldn't expect an LLM to know it.
ChatGPT in particular will give an incorrect (but unique!) answer every time. At the risk of losing a great example of AI hallucination: it's AutoSketch.
It's not shown fully, but note the game in the File menu: https://www.youtube.com/watch?v=kBCrVwnV5DU&t=39s
10 replies →
> I might be missing something obscure, but generally if I can't find it by Googling I wouldn't expect an LLM to know it.
The Google index is already polluted by LLM output, albeit unevenly, depending on the subject. It's only going to spread to all subjects as content farms go down the long tail of profitability, eking out what profit remains; Googling won't help because you'll almost always find a result that's wrong, as will LLMs that resort to searching.
Don't get me started on Google's AI answers, which assert wrong information, launder fanfic/reddit/forum posts, and elevate all sources to the same level.
It gave me two answers (one was Borland Sidekick), which I followed with "are you sure about that?". It waffled and said actually it was neither of those, it's IBM Handshaker. I said "I don't think so, I think it's another productivity program", and it replied that on further review it's not IBM Handshaker, and there are no productivity programs that include Connect Four. No wonder CTOs like this shit so much; it's the perfect bootlick.
If I can find something by Googling I wouldn’t need an LLM to know it.
1 reply →
So, like normal history just sped up exponentially to the point it's noticeable in not just our own lifetime (which it seemed to reach prior to AI), but maybe even within a couple years.
I'd be a lot more worried about that if I didn't think we were doing a pretty good job of obfuscating facts the last few years ourselves without AI. :/
Just tried this with Gemini 2.5 Flash and Pro several times; it just keeps saying it doesn't know of any such thing and suggesting it was a software bundle where the game was included alongside the productivity application, or that I'm not remembering correctly.
Not great (assuming such software actually exists), but not as bad as making something up.
AIs make knowledge work more efficient.
Unfortunately that also includes citogenesis.
https://xkcd.com/978/
ChatGPT's search function will probably find this thread soon and start answering correctly; the HN domain does well on SEO and shows up in search results quickly enough.
What is the correct answer?
AutoSketch for MS-DOS had Connect Four. It's under "Game" in the File menu.
This is an example of a random fact old enough that no one ever bothered talking about it on the internet. So it's not cited anywhere, but many of us can just plain remember it. When you ask ChatGPT (as of now, on June 6th 2025) it gives a random answer every time.
Now that I've stated this on the internet in a public manner it will be corrected, but... there are a million such things I could give as an example: some question obscure enough that no one's given an answer on the internet before, so AI doesn't know, but recent enough that many of us know the answer, so we can instantly see just how much AI hallucinates.
6 replies →
Wait until you meet humans on the Internet. Not only do they make shit up, but they'll do it maliciously to trick you.
> which both (1) gives you an unlimited supply of non-copyrighted training data and (2) handily sidesteps the issue of AI-contaminated training data.
I think these are both basically somewhere between wrong and misleading.
Needing to generate your own data through actual experience is very expensive, and can mean that data acquisition now comes with real operational risks. Waymo gets real world experience operating its cars, but the "limit" on how much data you can get per unit time depends on the size of the fleet, and requires that you first get to a level of competence where it's safe to operate in the real world.
If you want to repair cars, and you _don't_ start with some source of knowledge other than on-policy roll-outs, then you have to expect that you're going to learn by trashing a bunch of cars (and still pay humans to tell the robot that it failed) for some significant period.
There's a reason you want your mechanic to have access to manuals, and have gone through some explicit training, rather than just try stuff out and see what works, and those cost-based reasons are true whether the mechanic is human or AI.
Perhaps you're using an off-policy RL approach -- great! If your off-policy data is demonstrations from a prior generation model, that's still AI-contaminated training data.
So even if you're trying to learn by doing, there are still meaningful limits on the supply of training data (which may be way more expensive to produce than scraping the web), and likely still AI-contaminated (though perhaps with better info on the data's provenance?).
There is an enormous amount of actual car repair experience training data on YouTube but it's all copyrighted. Whether AI companies should have to license that content before using it for training is a matter of some dispute.
>Whether AI companies should have to license that content before using it for training is a matter of some dispute.
We definitely do not have the right balance of this right now.
E.g., I'm working on a set of articles that give a different path to learning some key math knowledge (it just comes at it from a different point of view and is more intuitive). Historically, such blog posts have helped my career.
It's not ready for release anyway, but I'm hesitant to release my work in this day and age, since AI can steal it and regurgitate it to the point where my articles appear unoriginal.
It's stifling. I'm of the opinion you shouldn't post art, educational material, code or anything that you wish to be credited for on the internet right now. Keep it to yourself or else AI will just regurgitate it to someone without giving you credit.
The flip side is: knowledge is not (and should not be!) copyrightable. Anyone can read your articles and use the knowledge it contains, without paying or crediting you. They may even rewrite that knowledge in their own words and publish it in a textbook.
AI should be allowed to read repair manuals and use them to fix cars. It should not be allowed to produce copies of the repair manuals.
4 replies →
Prediction: there won’t be any AI systems repairing cars before there will be general intelligence-capable humanoid robots (Ex Machina-style).
There also won’t be any AI maids in five-star hotels until those robots appear.
This doesn’t make your statement invalid; it’s just that the gap between today and the moment you’re describing is so unimaginably vast that saying “don’t worry about AI slop contaminating your language word frequency databases, it’ll sort itself out eventually” is slightly off the mark.
It blows my mind that some folks are still out here thinking LLMs are the tech-tree towards AGI and independently thinking machines, when we can't even get copilot to stop suggesting libraries that don't exist for code we fully understand and created.
I'm sure AGI is possible. It's not coming from ChatGPT no matter how much Internet you feed to it.
Well, we won't be feeding it internet - we'll be using RL to learn from interaction with the real world.
LLMs are just one very specific application of deep learning, doing next-word-prediction of internet text. It's not LLMs specifically that's exciting, it's deep learning as a whole.
I don't understand the obsession with humanoid robots that many seem to have. Why would you make a car repairing machine human-shaped? Like, what would it use its legs for? Wouldn't it be better to design it tailored to its purpose?
Economies of scale. The humanoid form can interact with all of the existing infrastructure for jobs currently done by humans, so that's the obvious form factor for companies looking to churn out robots to sell by the millions.
2 replies →
They want a child.
Legs? To jump into the workshop pit, among other things. Palms are needed to hold a wrench or a spanner, fingers are needed to unscrew nuts.
Cars are not built to accommodate whatever universal repair machine there could be, cars are built with an expectation that a mechanic with arms and legs will be repairing it, and will be for a while.
A non-humanoid robot in a human-designed world populated by humans looks and behaves like this, at best: https://youtu.be/Hxdqp3N_ymU
6 replies →
Long-run you want AGI then? Once we get AGI, the spam will be good?
https://xkcd.com/810/