Comment by synapsomorphy

1 month ago

I'm honestly kind of surprised there haven't been significant large-scale attempts to well-poison LLMs with certain viewpoints/beliefs/whatever. Maybe we just haven't caught them.

There was a proof-of-concept paper about buying up expired domains referenced in the LAION image dataset and poisoning multimodal LLMs that way (at the time, LAION was just a list of image URLs). As I understand it, the paper exaggerated its reach, and LAION now has newer versions, torrents, etc.

ChatGPT 5 still says "My knowledge cutoff is June 2024."

There is a reason these models are still operating on old knowledge cutoff dates.