Comment by ALLTaken
2 months ago
I made a video of it with a friend. The repository belongs to a large automotive company. I also have my own repositories, which have always been private, and OpenAI printed my files in the first prompt. When I prompted again, it acted as if it didn't know. But my friend tried on his account and could access both the corporate repository and my private one without either ever having been linked.
The corporate repository was Volkswagen's. It's quite a serious breach. I only gave it the name of the repository and it printed the files, which shouldn't be possible.
Maybe OpenAI exploits Microsoft to access GitHub fully to train their AI on all of humanity's code for free, violating privacy, security, IP and copyright.
>I only gave it the name of the repository and it printed the files, which shouldn't be possible.
Are you sure these weren't just plausible guesses at file names? It's just a hallucination.
I asked it for the list of files in some public repositories (which are definitely in the training data) and it gave me a plausible-but-wrong list of files. It can't remember that kind of detail.
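If anyone wants to repeat this check, here is a minimal sketch of how I'd score it, assuming you have saved the model's answer and the real listing as two lists of paths. The function name `score_file_list` and the example paths are made up for illustration; they are not from any real repository.

```python
def score_file_list(claimed, actual):
    """Compare a model-produced file list against the real listing.

    Both arguments are iterables of path strings. Matching is exact and
    case-sensitive, since a "close" file name is still a wrong guess.
    """
    claimed_set = {p.strip() for p in claimed if p.strip()}
    actual_set = {p.strip() for p in actual if p.strip()}

    exact = claimed_set & actual_set      # names the model got exactly right
    invented = claimed_set - actual_set   # names that do not exist in the repo
    missed = actual_set - claimed_set     # real files the model never mentioned

    precision = len(exact) / len(claimed_set) if claimed_set else 0.0
    recall = len(exact) / len(actual_set) if actual_set else 0.0
    return {
        "exact": sorted(exact),
        "invented": sorted(invented),
        "missed": sorted(missed),
        "precision": precision,
        "recall": recall,
    }


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    claimed = ["README.md", "setup.py", "src/main.py", "src/utils.py"]
    actual = ["README.md", "pyproject.toml", "src/app.py",
              "src/utils.py", "tests/test_app.py"]
    result = score_file_list(claimed, actual)
    print(result["invented"])
    print(result["precision"], result["recall"])
```

A couple of generic names like README.md will match for almost any repository, so a few hits prove nothing. Only a near-complete match on unusual, project-specific paths would be hard to explain as guessing.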
I am sure about that. I also considered hallucination, but it was precise in the first prompt. The second and follow-up prompts gave plausible denials and produced similar but clearly different code.
It could even print the list of files with their exact names if triggered the right way. They have probably patched that since, so that nobody can sue them with digital proof. But we recorded it. It was when their new model came out. I don't remember the date, but it was a few months ago. We have two videos covering different repositories that it should not have access to at all.
Microsoft owns GitHub. OpenAI has a multi-billion-dollar investment from Microsoft and access to its infrastructure "for training", so it seems likely they got access to GitHub as well. That is something they shouldn't do, since it's illegal and very unethical.