Comment by Imustaskforhelp

3 hours ago

I don't really know but I don't think most people know it.

I have sometimes had passwords accidentally pasted into ChatGPT while using my Bitwarden password manager, then removed them and thought I was okay.

It's scary: I'm pretty familiar with tech and I knew this was possible, but I assumed that for privacy reasons they wouldn't keep it. I feel like the general public might be even more oblivious.

Also a quick question but how long are the logs kept in OpenAI? And are the logs still taken even if you are in private mode?

> I don't really know but I don't think most people know it.

That's for sure, most people don't know how much they're being tracked, even if we only consider what happens inside the platform. Nowadays, lots of platforms literally log your mouse movements on the page, so they can see exactly where you first landed, how you moved around, where you navigated, how long you paused, and much, much more. Basically, if it can be logged and reconstructed, it will be.
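That kind of mouse-movement logging isn't exotic, either; it takes only a few lines of JavaScript. Here's a rough sketch of the buffer-and-batch pattern these trackers tend to use (all names here are made up for illustration, not any real vendor's API):

```javascript
// Minimal sketch of session-replay style mouse tracking: listeners push
// timestamped events into a buffer, which is shipped to the server in batches.
// "MouseTracker" and its methods are illustrative names, not a real library.
class MouseTracker {
  constructor(send, batchSize = 50) {
    this.send = send;          // callback that ships one batch to the backend
    this.batchSize = batchSize;
    this.buffer = [];
  }
  record(x, y, t) {
    this.buffer.push({ x, y, t });               // timestamped coordinates
    if (this.buffer.length >= this.batchSize) this.flush();
  }
  flush() {
    if (this.buffer.length === 0) return;
    this.send(this.buffer);                      // e.g. navigator.sendBeacon(...)
    this.buffer = [];
  }
}

// In a browser you'd wire it up roughly like this (hypothetical endpoint):
// const tracker = new MouseTracker(
//   batch => navigator.sendBeacon("/collect", JSON.stringify(batch))
// );
// document.addEventListener("mousemove",
//   e => tracker.record(e.clientX, e.clientY, Date.now()));
// window.addEventListener("pagehide", () => tracker.flush());
```

Replaying the stored coordinates in order is all it takes to reconstruct how you moved around the page.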

> Also a quick question but how long are the logs kept in OpenAI? And are the logs still taken even if you are in private mode?

As far as I know right now, OpenAI is under legal obligation to log all of their ChatGPT chats, regardless of their own policies, but this was a while ago (this summer sometime?), maybe it's different today.

What exactly do you mean by "private mode"? If you mean an incognito/private window in your browser, it has basically no impact on how much the platforms themselves log; it only affects your local history.

For the "temporary mode" in ChatGPT, I also think it has no impact on how much they log; it just keeps that particular chat out of your chat history, and stops them using that data for training their models. Besides that, all the tracking in your browser still works the same way, AFAIK.

  • Wow, thanks for your response, man.

    I was referring to temporary mode (though I also assumed a private window was much safer as well, but wow, it looks like they log literally everything).

    So out of all the providers (Gemini, Claude, OpenAI, Grok, and others), do they all log everything permanently?

    If they are logging everything, what prevents their logs from getting leaked or "accidentally" being used in training data?

    > As far as I know right now, OpenAI is under legal obligation to log all of their ChatGPT chats, regardless of their own policies, but this was a while ago (this summer sometime?), maybe it's different today.

    I also remember that post, and given the current political environment, that's kind of crazy.

    Also, some of these services require a phone number one way or another, and most likely there's a way the phone number can be linked to the logs. Since phone numbers are issued by the government, chances are that if threat actors want data at scale and OpenAI contributes to it, a very good profile of a person can be built if they use such services... Wild.

    So if OpenAI's under legal obligation, is there a limit on how long they have to keep the logs, or are they going to keep them permanently? I'm going to look for the old article from HN right now, but if the answer is permanently, then it's even more dystopian than I imagined.

    The mouse-tracking ability is wild too. I might switch to LibreWolf at this point to prevent some of that tracking.

    Also, what are your thoughts on the new anonymous providers like confer.to (by the Signal creator), venice.ai, etc.? (Maybe some OpenRouter providers?)

    • You can safely assume (and it's probably better that you do regardless) that everyone on the internet is logging and slurping up as much data as they can about their users. The product team is usually the one using the data, but depending on how much access control the company has, it could be that most of it sits in a database that the engineering, marketing, and product teams all have access to.

      > If they are logging everything, what prevents their logs from getting leaked or "accidentally" being used in training data?

      The "tracking data" is different from the "chat data". Tracking data is usually collected for the product team to make decisions with, gathered automatically in the frontend and backend via various methods.

      The "chat data" is something they'd typically keep more secret and guarded; random engineers probably can't just access it, although senior people on the infrastructure team typically can.

      As for how easily that data could slip into training data, I'm not sure, but I'd expect that just the fear of big names suing them would be enough to make them really careful with it. That's my hope, at least.

      I don't know any specifics about how long they keep logs, but what I do know is that companies typically sit on their data for as long as they can, because you always end up finding new uses for it in the future. Maybe you want to compare how users used the platform in 2022 vs. 2033, and then you'd be glad you kept it. So unless the company has an explicit public policy about it, assume they sit on it "forever".

      > Also what are your thoughts on the new anonymous providers like confer.to (by signal creator), venice.ai etc.? (maybe some openrouter providers?)

      Haven't heard about any of them :/ This summer I took it one step further and got myself the beefiest GPU I could reasonably get (for unrelated purposes) and started using local models for everything I do with LLMs.
