Comment by ipsum2

18 days ago

If you use this, all of your code data will be sent to Google. From their terms:

https://developers.google.com/gemini-code-assist/resources/p...

When you use Gemini Code Assist for individuals, Google collects your prompts, related code, generated output, code edits, related feature usage information, and your feedback to provide, improve, and develop Google products and services and machine learning technologies.

To help with quality and improve our products (such as generative machine-learning models), human reviewers may read, annotate, and process the data collected above. We take steps to protect your privacy as part of this process. This includes disconnecting the data from your Google Account before reviewers see or annotate it, and storing those disconnected copies for up to 18 months. Please don't submit confidential information or any data you wouldn't want a reviewer to see or Google to use to improve our products, services, and machine-learning technologies.

mattzito 18 days ago

It's a lot more nuanced than that. If you use the free edition of Code Assist, your data can be used UNLESS you opt out, which is at the bottom of the support article you link to:

"If you don't want this data used to improve Google's machine learning models, you can opt out by following the steps in Set up Gemini Code Assist for individuals."

and then the link: https://developers.google.com/gemini-code-assist/docs/set-up...

If you pay for Code Assist, no data is used to improve the models. If you use a Gemini API key on a pay-as-you-go account instead, it doesn't get used to improve them either. It only applies if you're using a non-paid, consumer account and you didn't opt out.

That seems different than what you described.

foob 18 days ago
> your data can be used UNLESS you opt out
It's even more nuanced than that.
Google recently testified in court that they still train on user data after users opt out from training [1]. The loophole is that the opt-out only applies to one organization within Google, but other organizations are still free to train on the data. They may or may not have cleaned up their act given that they're under active investigation, but their recent actions haven't exactly earned them the benefit of the doubt on this topic.
[1] https://www.business-standard.com/technology/tech-news/googl...
- TrainedMonkey 18 days ago
  
  Another dimension here is that any "we don't train on your data" is useless without a matching data retention policy which deletes your data. Case in point: 23andMe not selling your data until they decided to change that policy.
  
- _cs2017_ 17 days ago
  
  This is incorrect. The data discussed in court is data freely visible on the web, not user data that the users sent to Google.
  If the data is sent by a user to sub-unit X of Google, and X promised not to use it for training, it implies that X can share this data with sub-unit Y only if Y also commits not to use the data for training. Breaking this rule would get everyone in huge trouble.
  OTOH, when sub-unit X said "We promise not to use data from the public website if the website owner asks us not to", it does not imply another sub-unit Y must follow that commitment.
- Melatonic 18 days ago
  
  Hopefully this doesn't apply to corporate accounts where they claim to be respecting privacy via contracts
- sheepscreek 18 days ago
  
  Reading about all the nuances is such a trigger for me. To cover your ass is one thing, to imply one thing in a lay sense and go on to do something contradicting it (in bad faith) is douchebaggery. I am very sad and deeply disappointed at Google for this. This completes their transformation to Evil Corp after repealing the “don’t be evil” clause in their code of conduct[1].
  [1] https://en.m.wikipedia.org/wiki/Don't_be_evil
- echelon 18 days ago
  
  We need to stop giving money and data to hyperscalers.
  We need open infrastructure and models.
  
ipsum2 18 days ago
Sorry, that's not correct. Did you check out the link? It doesn't describe the CLI, only the IDE.
"You can find the Gemini Code Assist for individuals privacy notice and settings in two ways:
- VS Code - IntelliJ "
- mattzito 18 days ago
  
  That's because it's a bit of a nesting doll situation. As you can see here:
  https://github.com/google-gemini/gemini-cli/tree/main
  If you scroll to the bottom, it says that the applicable terms of service depend on the mechanism by which you access Gemini. If you access via Code Assist (which the OP posted), you abide by the Code Assist privacy terms; VS Code is one of the ways to access it. If you access via the Gemini API, then those terms apply.
  So the Gemini CLI (as I understand it) doesn't have its own privacy terms, because it's an open source shell on top of another Gemini system, which could have one of a few different privacy policies based on how you choose to use it and your account settings.
  (Note: I work for Google, but not on this; this is just my plain reading of the documentation)
  
- tiahura 18 days ago
  
  As a lawyer, I'm confused.
  I guess the key question is whether the Gemini CLI, when used with a personal Google account, is governed by the broader Gemini Apps privacy settings here? https://myactivity.google.com/product/gemini?pli=1
  If so, it appears it can be turned off. However, my CLI activity isn't showing up there?
  Can someone from Google clarify?
  
- fhinkel 18 days ago
  
  Sorry our docs were confusing! We tried to clear things up: https://github.com/google-gemini/gemini-cli/blob/main/docs/t...
  
aflukasz 18 days ago

> It's a lot more nuanced than that. If you use the free edition of Code Assist, your data can be used UNLESS you opt out,
Well... you are sending your data to a remote location that is not yours.
andrepd 18 days ago
Yes, I'm right about to trust Google to do what they pinky swear.
EDIT: Lmao, case in point, two sibling comments pointing out that Google does indeed do this anyway via some loophole; also they can just retain the data and change the policy unilaterally in the future.
If you want privacy, run it locally with Free software.
- 8n4vidtmkvmk 18 days ago
  
  Do you have recommendations? I have ollama, but it doesn't have built-in tool support.

FL410 18 days ago

To be honest this is by far the most frustrating part of the Gemini ecosystem, to me. I think 2.5 pro is probably the best model out there right now, and I'd love to use it for real work, but their privacy policies are so fucking confusing and disjointed that I just assume there is no privacy whatsoever. And that's with the expensive Pro Plus Ultra MegaMax Extreme Gold plan I'm on.

I hope this is something they're working on making clearer.

ipsum2 18 days ago
In my own experience, 2.5 Pro 03-26 was by far the best LLM at the time.
The newer models are quantized and distilled (I confirmed this with someone who works on the team), and are a significantly worse experience. I prefer OpenAI o3 and o4-mini models to Gemini 2.5 Pro for general knowledge tasks, and Sonnet 4 for coding.
- happycube 18 days ago
  
  Gah, enforced enshittification with model deprecation is so annoying.
UncleOxidant 18 days ago
For coding, in my experience, Claude Sonnet/Opus 4.0 is hands down better than Gemini 2.5 Pro. I just end up fighting with Claude a lot less than I do with Gemini. I had Gemini start a project that involved creating a recursive descent parser for a language in C. It was full of segfaults. I'd ask Gemini to fix them and it would end up breaking something else, and then we'd get into a loop. Finally I had Claude Sonnet 4.0 take a look at the code that Gemini had created. It fixed the segfaults in short order and was off adding new features - even anticipating features that I'd be asking for.
- cma 18 days ago
  
  Did you try Gemini with a fresh prompt too when comparing against Claude? Sometimes you just get better results starting over with any leading model, even if it gets access to the old broken code to fix.
  I haven't tried Gemini since the latest updates, but earlier ones seemed on par with Opus.
dmbche 18 days ago

If I'm being cynical, it's easy to say either "we use it" or "we don't touch it", but they'd lose everyone who cares about this question if they just said "we use it". The most beneficial position is to keep it as murky as possible.
If I were you I'd assume they're using all of it for everything forever and act accordingly.

fhinkel 18 days ago

Hey all, this is a really great discussion, and you've raised some important points. We realize the privacy policies for the Gemini CLI were confusing depending on how you log in, and we appreciate you calling that out.

To clear everything up, we've put together a single doc that breaks down the Terms of Service and data policies for each account type, including an FAQ that covers the questions from this thread.

Here’s the link: https://github.com/google-gemini/gemini-cli/blob/main/docs/t...

Thanks again for pushing for clarity on this!

kiitos 17 days ago

Is there any way for a user using the "Login with Google ... for individuals" auth method (I guess auth method 1) to opt out of, and prevent, their input prompts and output responses being used as training data?
From an initial parse of your linked tos-privacy.md doc, it seems like the answer is "no" -- but that seems bonkers to me, so I hope I'm misreading or misunderstanding something!
ipsum2 18 days ago
I think you did a good job CYA on this, but what people were really looking for was a way to opt out of Google collecting code, similar to the opt-out process available for the IDE.
- dcreater 18 days ago
  
  Yeah, how is opting out of data collection not an option? This is what they mean by "don't be evil", and Google is proving yet again that they truly are.
  
FL410 18 days ago

I appreciate the effort here, but I’m still confused. Is my $250/mo “ultra” plan considered personal and still something you train on?
yuuluuu 18 days ago

Thanks for the clarification!
HenriNext 18 days ago
Thanks, one more clarification please. The heading of point #3 seems to mention Google Workspace: "3. Login with Google (for Workspace or Licensed Code Assist users)". But the text content only talks about Code Assist: "For users of Standard or Enterprise edition of Gemini Code Assist" ... Could you clarify whether point #3 applies with login via Google Workspace Business accounts?
- fhinkel 18 days ago
  
  Yes it does.

mil22 18 days ago

There is some information on this buried in configuration.md under "Usage Statistics". They claim:

*What we DON'T collect:*

- *Personally Identifiable Information (PII):* We do not collect any personal information, such as your name, email address, or API keys.

- *Prompt and Response Content:* We do not log the content of your prompts or the responses from the Gemini model.

- *File Content:* We do not log the content of any files that are read or written by the CLI.

https://github.com/google-gemini/gemini-cli/blob/0915bf7d677...

ipsum2 18 days ago
This is useful, and directly contradicts the terms and conditions for Gemini CLI (edit: if you use the personal account, then it's governed under the Code Assist T&C). I wonder which one is true?
- fhinkel 18 days ago
  
  Thanks for pointing that out, we're working on clarifying!
  
- mil22 18 days ago
  
  Where did you find the terms and conditions for Gemini CLI? In https://github.com/google-gemini/gemini-cli/blob/main/README..., I find only links to the T&Cs for the Gemini API, Gemini Code Assist (a different product?), and Vertex AI.
  
- datameta 18 days ago
  
  Can a lawyer offer their civilian opinion as to which supersedes/governs?
jdironman 18 days ago
I wonder what the legal difference between "collect" and "log" is.
- kevindamm 18 days ago
  
  Collection means it gets sent to a server; logging implies (permanent or temporary) retention of that data. I tried finding a specific line or context in their privacy policy to link to, but maybe someone else can help me provide a good reference. Logging is a form of collection, but not everything collected is logged unless mentioned as such.

jart 18 days ago

Mozilla and Google provide an alternative called gemmafile which gives you an airgapped version of Gemini (which Google calls Gemma) that runs locally in a single file without any dependencies. https://huggingface.co/jartine/gemma-2-27b-it-llamafile It's been deployed into production by 32% of organizations: https://www.wiz.io/reports/the-state-of-ai-in-the-cloud-2025

ipsum2 18 days ago
There's nothing wrong with promoting your own projects, but it's a little weird that you don't disclose that you're the creator.
- jart 18 days ago
  
  It would be more accurate to say I packaged it. llamafile is a project I did for Mozilla Builders where we compiled llama.cpp with cosmopolitan libc so that LLMs can be portable binaries. https://builders.mozilla.org/ Last year I concatenated the Gemma weights onto llamafile and called it gemmafile and it got hundreds of thousands of downloads. https://x.com/JustineTunney/status/1808165898743878108 I currently work at Google on Gemini improving TPU performance. The point is that if you want to run this stuff 100% locally, you can. Myself and others did a lot of work to make that possible.
  
nicce 18 days ago
That is just the Gemma model. Most people seek capabilities equivalent to Gemini 2.5 Pro if they want to do any kind of coding.
- jart 18 days ago
  
  Gemma 27b can write working code in dozens of programming languages. It can even translate between languages. It's obviously not as good as Gemini, which is the best LLM in the world, but Gemma is built from the same technology that powers Gemini and Gemma is impressively good for something that's only running locally on your CPU or GPU. It's a great choice for airgapped environments. Especially if you use old OSes like RHEL5.
  

Workaccount2 18 days ago

This is just for free use (individuals), for standard and enterprise they don't use the data.

Which pretty much means if you are using it for free, they are using your data.

I don't see what is alarming about this; everyone else has either the same policy or no free usage. Hell, the surprising thing is that they still let free users opt out...

thimabi 18 days ago
> everyone else has either the same policy or no free usage
That’s not true. ChatGPT, even in the free tier, allows users to opt out of data sharing.
- aargh_aargh 18 days ago
  
  This court decision begs to differ:
  https://www.reuters.com/business/media-telecom/openai-appeal...
  
- joshuacc 18 days ago
  
  I believe they are talking about the OpenAI API, not ChatGPT.

mil22 18 days ago

They really need to provide some clarity on the terms around data retention and training, for users who access Gemini CLI free via sign-in to a personal Google account. It's not clear whether the Gemini Code Assist terms are relevant, or indeed which of the three sets of terms they link at the bottom of the README.md apply here.

fhinkel 18 days ago
Agree! We're working on it!
- fhinkel 18 days ago
  
  Hope this is helpful, just merged: https://github.com/google-gemini/gemini-cli/blob/main/docs/t...
  

fastball 18 days ago

Kinda a tragedy of the commons situation. Everyone wants to use these tools that must be trained on more and more code to get better, but nobody wants it to be trained on their code. Bit silly imo.

FiberBundle 18 days ago

Do you honestly believe that the opt-out by Anthropic and Cursor means your code won't be used for training their models? Seems likely that they would rather just risk taking a massive fine for potentially solving software development than let some competitor try it instead.

olejorgenb 18 days ago

> For API users, we automatically delete inputs and outputs on our backend within 30 days of receipt or generation, except when you and we have agreed otherwise (e.g. zero data retention agreement), if we need to retain them for longer to enforce our Usage Policy (UP), or comply with the law.
If this is due to compliance with the law, I wonder how they can make the zero-data-retention agreement work... The companies I've seen that have this have not mentioned that they themselves retain the data...
rudedogg 18 days ago
Yes.
The resulting class-action lawsuit would bankrupt the company, along with the reputation damage, and fines.
- pera 18 days ago
  
  > Anthropic cut up millions of used books to train Claude — and downloaded over 7 million pirated ones too, a judge said
  https://www.businessinsider.com/anthropic-cut-pirated-millio...
  It doesn't look like they care at all about the law though
  

rudedogg 18 days ago

Insane to me there isn’t even an asterisk in the blog post about this. The data collection is so over the top I don’t think users suspect it because it’s just absurd. For instance Gemini Pro chats are trained on too.

If this is legal, it shouldn’t be.

Buttons840 18 days ago

Is this significantly different than what we agree to when we put code on GitHub?

BryanLegend 18 days ago

Some of my code is so bad I'm sure it will damage their models!

predkambrij 18 days ago

Seems to be a straight "yes" with no opt-out. https://github.com/google-gemini/gemini-cli/blob/main/docs/t...

fnl 18 days ago

How does that compare to Claude Code? How protected are you when using CC?

nojito 18 days ago

>If you use this, all of your code data will be sent to Google.

Not if you pay for it.

reaperducer 18 days ago
>If you use this, all of your code data will be sent to Google.
>Not if you pay for it.
Today.
In six months, a "Terms of Service Update" e-mail will go out to an address that is not monitored by anyone.
- mpalmer 18 days ago
  
  This sort of facile cynicism doesn't contribute anything useful. Anyone can predict a catastrophe.
  
- nojito 18 days ago
  
  Sure but then you can stop paying.
  There's also zero chance they will risk paying customers by changing this policy.

nprateem 18 days ago

I'm not that bothered. Most of it came from Google or Anthropic anyway.

learner007 17 days ago

Good to know, I won't be using this. Curious, do you know if OpenAI Codex and Claude also do the same? I was under the impression that they don't share code.

naiv 18 days ago

Who cares, software has no value anymore.

crat3r 18 days ago

Sarcasm? Weird statement if not.
I still have yet to replace a single application with an LLM, except for (ironically?) Google search.
I still use all the same applications as part of my dev work/stack as I did in the early 2020s. The only difference is occasionally using an LLM baked into one of them, but the reality is I don't do that much.