Comment by ipsum2

18 days ago

If you use this, all of your code data will be sent to Google. From their terms:

https://developers.google.com/gemini-code-assist/resources/p...

When you use Gemini Code Assist for individuals, Google collects your prompts, related code, generated output, code edits, related feature usage information, and your feedback to provide, improve, and develop Google products and services and machine learning technologies.

To help with quality and improve our products (such as generative machine-learning models), human reviewers may read, annotate, and process the data collected above. We take steps to protect your privacy as part of this process. This includes disconnecting the data from your Google Account before reviewers see or annotate it, and storing those disconnected copies for up to 18 months. Please don't submit confidential information or any data you wouldn't want a reviewer to see or Google to use to improve our products, services, and machine-learning technologies.

It's a lot more nuanced than that. If you use the free edition of Code Assist, your data can be used UNLESS you opt out, which is at the bottom of the support article you link to:

"If you don't want this data used to improve Google's machine learning models, you can opt out by following the steps in Set up Gemini Code Assist for individuals."

and then the link: https://developers.google.com/gemini-code-assist/docs/set-up...

If you pay for code assist, no data is used to improve. If you use a Gemini API key on a pay as you go account instead, it doesn't get used to improve. It's just if you're using a non-paid, consumer account and you didn't opt out.

That seems different than what you described.

  • your data can be used UNLESS you opt out

    It's even more nuanced than that.

    Google recently testified in court that they still train on user data after users opt out from training [1]. The loophole is that the opt-out only applies to one organization within Google, but other organizations are still free to train on the data. They may or may not have cleaned up their act given that they're under active investigation, but their recent actions haven't exactly earned them the benefit of the doubt on this topic.

    [1] https://www.business-standard.com/technology/tech-news/googl...

    • Another dimension here is that any "we don't train on your data" is useless without a matching data retention policy which deletes your data. Case and point of 23andMe not selling your data until they decided to change that policy.

      4 replies →

    • This is incorrect. The data discussed in court is data freely visible on the web, not user data that the users sent to Google.

      If the data is sent by a user to sub-unit X of Google, and X promised not to use it for training, it implies that X can share this data with sub-unit Y only if Y also commits not to use the data for training. Breaking this rule would get everyone in huge trouble.

      OTOH, when sub-unit X said "We promise not to use data from the public website if the website owner asks us not to", it does not imply another sub-unit Y must follow that commitment.

    • Hopefully this doesn't apply to corporate accounts where they claim to be respecting privacy via contracts

    • Reading about all the nuances is such a trigger for me. To cover your ass is one thing, to imply one thing in a lay sense and go on to do something contradicting it (in bad faith) is douchebaggery. I am very sad and deeply disappointed at Google for this. This completes their transformation to Evil Corp after repealing the “don’t be evil” clause in their code of conduct[1].

      [1] https://en.m.wikipedia.org/wiki/Don't_be_evil

  • Sorry, that's not correct. Did you check out the link? It doesn't describe the CLI, only the IDE.

    "You can find the Gemini Code Assist for individuals privacy notice and settings in two ways:

    - VS Code - IntelliJ "

    • That's because it's a bit of a nesting doll situation. As you can see here:

      https://github.com/google-gemini/gemini-cli/tree/main

      If you scroll to the bottom, it says that the terms of service are governed based on the mechanism by which you access Gemini. If you access via code assist (which the OP posted), you abide by those privacy terms of code assist, one of the ways of which you access is VScode. If you access via the Gemini API, then those terms apply.

      So the gemini CLI (as I understand it) doesn't have their own privacy terms, because it's an open source shell on top of another Gemini system, which could have one of a few different privacy policies based on how you choose to use it and your account settings.

      (Note: I work for google, but not on this, this is just my plain reading of the documentation)

      2 replies →

  • > It's a lot more nuanced than that. If you use the free edition of Code Assist, your data can be used UNLESS you opt out,

    Well... you are sending your data to a remote location that is not yours.

  • Yes, I'm right about to trust Google to do what they pinky swear.

    EDIT: Lmao, case in point, two sibling comments pointing out that Google does indeed do this anyway via some loophole; also they can just retain the data and change the policy unilaterally in the future.

    If you want privacy do it local with Free software.

To be honest this is by far the most frustrating part of the Gemini ecosystem, to me. I think 2.5 pro is probably the best model out there right now, and I'd love to use it for real work, but their privacy policies are so fucking confusing and disjointed that I just assume there is no privacy whatsoever. And that's with the expensive Pro Plus Ultra MegaMax Extreme Gold plan I'm on.

I hope this is something they're working on making clearer.

  • In my own experience, 2.5 Pro 03-26 was by far the best LLM model at the time.

    The newer models are quantized and distilled (I confirmed this with someone who works on the team), and are a significantly worse experience. I prefer OpenAI O3 and o4-mini models to Gemini 2.5 Pro for general knowledge tasks, and Sonnet 4 for coding.

  • For coding in my experience Claude Sonnet/Opus 4.0 is hands down better than Gemini 2.5. pro. I just end up fighting with Claude a lot less than I do with Gemini. I had Gemini start a project that involved creating a recursive descent parser for a language in C. It was full of segfaults. I'd ask Gemini to fix them and it would end up breaking something else and then we'd get into a loop. Finally I had Claude Sonnet 4.0 take a look at the code that Gemini had created. It fixed the segfaults in short order and was off adding new features - even anticipating features that I'd be asking for.

    • Did you try Gemini with a fresh prompt too when comparing against Claude? Sometimes you just get better results starting over with any leading model, even if it gets access to the old broken code to fix.

      I haven't tried Gemini since the latest updates, but earlier ones seemed on par with opus.

  • If I'm being cynical, it's easy to either say "we use it" or "we don't touch it" but they'd lose everyone that cares about this question if they just said "we use it" - most beneficial position is to keep it as murky as possible.

    If I were you I'd assume they're using all of it for everything forever and act accordingly.

Hey all, This is a really great discussion, and you've raised some important points. We realize the privacy policies for the Gemini CLI were confusing depending on how you log in, and we appreciate you calling that out.

To clear everything up, we've put together a single doc that breaks down the Terms of Service and data policies for each account type, including an FAQ that covers the questions from this thread.

Here’s the link: https://github.com/google-gemini/gemini-cli/blob/main/docs/t...

Thanks again for pushing for clarity on this!

  • Is there any way for a user using the "Login with Google ... for individuals" auth method (I guess auth method 1) -- to opt-out of, and prevent, their input prompts, and output responses, from being used as training data?

    From an initial parse of your linked tos-privacy.md doc, it seems like the answer is "no" -- but that seems bonkers to me, so I hope I'm misreading or misunderstanding something!

  • I think you did a good job CYA on this, but what people were really looking for was a way to opt-out of Google collecting code, similar to the opt-out process for the IDE is available.

    • Yeah how is opt out of data collection not an option? This is what they mean by don't be evil and Google is proving yet again that they truly are

      3 replies →

  • I appreciate the effort here, but I’m still confused. Is my $250/mo “ultra” plan considered personal and still something you train on?

  • Thanks, one more clarification please. The heading of point #3 seems to mention Google Workspace: "3. Login with Google (for Workspace or Licensed Code Assist users)". But the text content only talks about Code Assist: "For users of Standard or Enterprise edition of Gemini Code Assist" ... Could you clarify whether point #3 applies with login via Google Workspace Business accounts?

There is some information on this buried in configuration.md under "Usage Statistics". They claim:

*What we DON'T collect:*

- *Personally Identifiable Information (PII):* We do not collect any personal information, such as your name, email address, or API keys.

- *Prompt and Response Content:* We do not log the content of your prompts or the responses from the Gemini model.

- *File Content:* We do not log the content of any files that are read or written by the CLI.

https://github.com/google-gemini/gemini-cli/blob/0915bf7d677...

  • This is useful, and directly contradicts the terms and conditions for Gemini CLI (edit: if you use the personal account, then its governed under the Code Assist T&C). I wonder which one is true?

  • I wonder what the legal difference between "collect" and "log" is.

    • Collection means it gets sent to a server, logging implies (permanent or temporary) retention of that data. I tried finding a specific line or context in their privacy policy to link to but maybe someone else can help me provide a good reference. Logging is a form of collection but not everything collected is logged unless mentioned as such.

Mozilla and Google provide an alternative called gemmafile which gives you an airgapped version of Gemini (which Google calls Gemma) that runs locally in a single file without any dependencies. https://huggingface.co/jartine/gemma-2-27b-it-llamafile It's been deployed into production by 32% of organizations: https://www.wiz.io/reports/the-state-of-ai-in-the-cloud-2025

  • There's nothing wrong with promoting your own projects, but its a little weird that you don't disclose that you're the creator.

    • It would be more accurate to say I packaged it. llamafile is a project I did for Mozilla Builders where we compiled llama.cpp with cosmopolitan libc so that LLMs can be portable binaries. https://builders.mozilla.org/ Last year I concatenated the Gemma weights onto llamafile and called it gemmafile and it got hundreds of thousands of downloads. https://x.com/JustineTunney/status/1808165898743878108 I currently work at Google on Gemini improving TPU performance. The point is that if you want to run this stuff 100% locally, you can. Myself and others did a lot of work to make that possible.

      2 replies →

  • That is just Gemma model. Most people seek capabilities equivalent for Gemini 2.5 Pro if they want to do any kind of coding.

    • Gemma 27b can write working code in dozens of programming languages. It can even translate between languages. It's obviously not as good as Gemini, which is the best LLM in the world, but Gemma is built from the same technology that powers Gemini and Gemma is impressively good for something that's only running locally on your CPU or GPU. It's a great choice for airgapped environments. Especially if you use old OSes like RHEL5.

      4 replies →

This is just for free use (individuals), for standard and enterprise they don't use the data.

Which pretty much means if you are using it for free, they are using your data.

I don't see what is alarming about this, everyone else has either the same policy or no free usage. Hell the surprising this is that they still let free users opt-out...

They really need to provide some clarity on the terms around data retention and training, for users who access Gemini CLI free via sign-in to a personal Google account. It's not clear whether the Gemini Code Assist terms are relevant, or indeed which of the three sets of terms they link at the bottom of the README.md apply here.

Kinda a tragedy of the commons situation. Everyone wants to use these tools that must be trained on more and more code to get better, but nobody wants it to be trained on their code. Bit silly imo.

Do you honestly believe that the opt-out by Anthropic and Cursor means your code won't be used for training their models? Seems likely that they would rather just risk taking a massive fine for potentially solving software development than to let some competitor try it instead.

  • > For API users, we automatically delete inputs and outputs on our backend within 30 days of receipt or generation, except when you and we have agreed otherwise (e.g. zero data retention agreement), if we need to retain them for longer to enforce our Usage Policy (UP), or comply with the law.

    If this is due to compliance with law I wonder how they can make the zero-data-retention agreement work... The companies I've seen have this have not mention that they themself retain the data...

Insane to me there isn’t even an asterisk in the blog post about this. The data collection is so over the top I don’t think users suspect it because it’s just absurd. For instance Gemini Pro chats are trained on too.

If this is legal, it shouldn’t be.

How does that compare to Claude Code? How protected are you when using CC?

>If you use this, all of your code data will be sent to Google.

Not if you pay for it.

  • >If you use this, all of your code data will be sent to Google.

    Not if you pay for it.

    Today.

    In six months, a "Terms of Service Update" e-mail will go out to an address that is not monitored by anyone.

    • Sure but then you can stop paying.

      There's also zero chance they will risk paying customers by changing this policy.

good to know, I won't be using this. Curious, do you know if OpenAI codex and Claude also do the same? I was under the impression that they don't share code.

Who cares, software has no value anymore.

  • Sarcasm? Weird statement if not.

    I still have yet to replace a single application with an LLM, except for (ironically?) Google search.

    I still use all the same applications as part of my dev work/stack as I did in the early 2020's. The only difference is occasionally using an LLM baked into to one of them but the reality is I don't do that much.