I had email correspondence once with a vendor about how to talk to their i2c bus. The documentation was all asm, and I wanted to at least “uplift” to C. They didn’t have any answers, so I sent them my solution which was was the asm calls that the c stdlib decompiled into.
4 years later my company had bought a different company, who happened to be using a newer model of the same board. They asked me how we could use the 12c bus. “Well before you bought us, we emailed the vendor and sent back this C snippet”
It was my code, verbatim. I’ve always wondered how many times they passed that bit of code around.
Claude does it quite a bit when you’re triggering the search tool functions.
It’s fine, and what you would expect for certain prompts, except that the synthesized results often come back communicating more authority than they deserve.
It was funny for me, when I asked it about something specific exotic - and it gave me a confident answer. But checking the sources I discovered it was from my own inquiries on a forum thread about it from the last time I unsuccesfully tried this (before the agents came) And so I knew, that any authorative tone was undeserved.
On the other hand, Claude later nailed this project, where I as a human said before, no, too much extra work.
Or Reddit. I don't know about Claude but Gemini has given me answers that are verbatim comments from Reddit.
I've gotten my own answers given back to me for problems I forgot I already had.
I had email correspondence once with a vendor about how to talk to their i2c bus. The documentation was all asm, and I wanted to at least “uplift” to C. They didn’t have any answers, so I sent them my solution which was was the asm calls that the c stdlib decompiled into.
4 years later my company had bought a different company, who happened to be using a newer model of the same board. They asked me how we could use the 12c bus. “Well before you bought us, we emailed the vendor and sent back this C snippet”
It was my code, verbatim. I’ve always wondered how many times they passed that bit of code around.
Somebody posted a similar comment above yours (somewhere...). I don't think your experience is unique!
Claude does it quite a bit when you’re triggering the search tool functions.
It’s fine, and what you would expect for certain prompts, except that the synthesized results often come back communicating more authority than they deserve.
It was funny for me, when I asked it about something specific exotic - and it gave me a confident answer. But checking the sources I discovered it was from my own inquiries on a forum thread about it from the last time I unsuccesfully tried this (before the agents came) And so I knew, that any authorative tone was undeserved.
On the other hand, Claude later nailed this project, where I as a human said before, no, too much extra work.
I've gotten this too a lot. If you ask AI to cite where it got info you can lose a lot of confidence in it pretty quickly.
I have seen it quote my own code back at me, including comments word for word.
Yeah, that's basically it.
In robotics there is no free lunch dataset. You'll have to gather it yourself, but if you do that, you run into an obvious problem: labeling.
With SO, you literally have the best possible scenario, because the data is clearly structured and separated into prompt and answer (aka label).
Your objective function is literally "Say what he said".
I've seen chatgpt word for word plagiarise stack overflow answers
i've seen it plagiarize personal blog posts too, almost verbatim code line by line.... kind of shocking...
Well, it's a plagiarizing machine after all and most of the time it remixes it well enough so most people can't tell.