Comment by ceejayoz

13 days ago

Much of what Claude insists you do probably came from SO.

12 comments

akkad33 13 days ago

Or Reddit. I don't know about Claude but Gemini has given me answers that are verbatim comments from Reddit.

arcanemachiner 13 days ago
I've gotten my own answers given back to me for problems I forgot I already had.
- irishcoffee 13 days ago
  
  I had email correspondence once with a vendor about how to talk to their i2c bus. The documentation was all asm, and I wanted to at least “uplift” to C. They didn’t have any answers, so I sent them my solution, which was the asm calls that the C stdlib decompiled into.
  4 years later my company had bought a different company, who happened to be using a newer model of the same board. They asked me how we could use the i2c bus. “Well, before you bought us, we emailed the vendor and they sent back this C snippet.”
  It was my code, verbatim. I’ve always wondered how many times they passed that bit of code around.
- mcswell 13 days ago
  
  Somebody posted a similar comment above yours (somewhere...). I don't think your experience is unique!
dd8601fn 13 days ago
Claude does it quite a bit when you’re triggering the search tool functions.
It’s fine, and what you would expect for certain prompts, except that the synthesized results often come back communicating more authority than they deserve.
- lukan 13 days ago
  
  It was funny for me when I asked it about something specific and exotic, and it gave me a confident answer. But checking the sources, I discovered it was from my own inquiries on a forum thread about it from the last time I unsuccessfully tried this (before the agents came). And so I knew that any authoritative tone was undeserved.
  On the other hand, Claude later nailed this project, where I as a human said before, no, too much extra work.
Morromist 13 days ago

I've gotten this a lot too. If you ask AI to cite where it got info you can lose a lot of confidence in it pretty quickly.
worthless-trash 13 days ago

I have seen it quote my own code back at me, including comments word for word.

imtringued 12 days ago

Yeah, that's basically it.

In robotics there is no free lunch dataset. You'll have to gather it yourself, but if you do that, you run into an obvious problem: labeling.

With SO, you literally have the best possible scenario, because the data is clearly structured and separated into prompt and answer (aka label).

Your objective function is literally "Say what he said".

20k 13 days ago

I've seen ChatGPT plagiarise Stack Overflow answers word for word.

andrekandre 13 days ago
i've seen it plagiarize personal blog posts too, almost verbatim code line by line.... kind of shocking...
- tartoran 13 days ago
  
  Well, it's a plagiarizing machine after all and most of the time it remixes it well enough so most people can't tell.