Comment by matthewdgreen

13 hours ago

I was retained as an expert witness in some of the cases involving Google, so of course I’m aware that Google keeps logs. (In general on HN I’ve found it’s always helpful to assume the person you’re arguing with might be a domain expert on the topic you’re arguing about; it’s saved me some time in the past.)

But Google’s internal logging is not the question I’m asking. I’m saying: can you find a single criminal case in the literature where police caused Google to disgorge a complete browsing history on someone who took even modest steps not to record it (i.e., browsed logged out)? Other than keyword search warrants, there doesn’t seem to be much. This really surprised me, since as an expert I “know” that Google has enough internal data to reconstruct this information. Yet from the outside, which is the experience that matters to people, they’ve managed to operate a product where real-world privacy expectations have been pretty high if you take even modest steps. I think this is where we get many of our privacy expectations from: the actual real-world lived expectations of privacy are much closer to what we want than what’s theoretically possible, or what will be possible in a future LLM-enabled surveilled world.

> [...] can you find a single criminal case in the literature where police caused Google to disgorge a complete browsing history on someone who took even modest steps not to record it (i.e., browsed logged out)?

Can you first point to me making a claim that would require such a case? Or can you, alternatively, point to why there is a need for change rather than just continuing to apply the same level of legal protections to LLM service providers?

The fact that this started with a report about a user's ChatGPT account, and that you felt the need to move us towards people using commercially hosted LLMs without an account (because getting five queries in before OpenAI forces you to sign in is a realistic use case), I let slide up to this point, because whether we are talking about incognito-mode access or a user with an account doesn't change that no one here says why using Chat.com is different from using Google.com. I just wanted to point it out, because it is not very expert-like; the same goes for the (mis)quoting.

To make it simple, this is my question, the only thing I'd like to have answered:

When self-hosted websites first became a thing, governments across the globe did not write new editorial legislation specific to them. They simply applied what was already established for print media and speech.

In this context, why should LLM input be handled differently from data hosted online?

If this completely irrelevant red herring is absolutely necessary for you, use file-sharing services with no login requirement, and the legal requirements that apply to them, as the comparison. It doesn't change anything about the question.