Privacy Pass Authentication for Kagi Search

1 year ago (blog.kagi.com)

358 comments

b3n

I love that Kagi now uses Privacy Pass, and they look like a cool company in general.

That being said, they essentially took the IETF draft I worked on for a while [1] and also my Rust implementation [2]. They built a thin wrapper [3] around my implementation and now call it "Kagi’s implementation of Privacy Pass". I think giving me some credit would have been in order. IETF work and work on open-source software is mostly voluntary, unpaid, and often happens outside of working hours. It's not motivating to be treated like that. Kagi, you can do better.

[1] https://datatracker.ietf.org/doc/draft-ietf-privacypass-batc... [2] https://github.com/raphaelrobert/privacypass [3] https://github.com/kagisearch/privacypass-lib/blob/e4d6b354d...

alphabetter 1 year ago
Honestly, I think what TFA calls "Kagi’s implementation of Privacy Pass" is the integration of the feature into their server and clients, not the RFC (which they acknowledge), or the protocol implementation.
- abound 1 year ago
  
  [I work at Kagi]
  Indeed, this is the intended interpretation of "Kagi's implementation of Privacy Pass" - we're talking about building out the server infrastructure, the UX, the browser extensions, the mobile applications, the Orion browser integration, the support and documentation, the Tor service, etc. The cryptography is obviously an extremely important piece, but it is far from the only piece.
  As other commenters have noted, the code in question is MIT licensed [1] and we're pulling it in as a standard dependency [2], it's not like we've gone out of our way to obscure its origin. The MIT license does not require us to do anything more.
  That said, I can understand the author wanting more visible attribution, and that's very reasonable, we'll add a blurb to the blog post acknowledging his contribution to Kagi's deployment of Privacy Pass.
  [1] https://github.com/raphaelrobert/privacypass/blob/main/LICEN...
  [2] https://github.com/kagisearch/privacypass-lib/blob/e4d6b354d...
  
SamuelAdams 1 year ago
So if they add “credit to raphaelrobert”, or a copy of your license to their code somewhere, Kagi will be compliant?
I’ve never had any of my open source software used, and I typically license it with MIT, so I’m curious how other groups and organizations actually comply with the license.
- literallyroy 1 year ago
  
  They are compliant, the code being used is under the MIT license.
  
graypegg 1 year ago

Captured 14 Feb 2025 ~12:15pm EST from README header
> This repository contains the source code of the core library implementing the Privacy Pass API used by Kagi.
Yeah... that doesn't feel great. Though I do think the folks at Kagi would be open to more accurately reframing that as "core library implementing a Crystal Lang wrapper for raphaelrobert/privacypass". It's likely unintentional, they were probably just focusing on getting it working and didn't get someone to reread this stuff.

drdaeman 1 year ago

Neat! It's rare to see that a service you use actually does something that benefits the user rather than itself. An unexpected, but a really pleasant surprise.

I wish this extension would integrate better with the browser by automatically understanding the context. That is, if I'm in a "regular" mode it'll use my session, but if I'm in a "private browsing" mode (`browser.extension.inIncognitoContext`) it'll use Privacy Pass to authenticate me, without me having to explicitly do anything about it.

(I don't use Orion, as there's no GNU/Linux version.)

_fat_santa 1 year ago
> It's rare to see that a service you use actually does something that benefits the user rather than itself
The reason it's become so rare is most companies in this space (heck tons of tech companies period) have used a business model of offering a thing to one group of users and then turning around and selling the results of that thing to another group of users, where the latter group is the one actually driving your revenue. This by default almost assumes a hostility towards the former group because their interests will of course be at odds with the interests of the latter group.
What's refreshing about Kagi and other new tech companies is they have dumped this model in favor of having just one group that they serve and drive revenue from (ie. the 'old' model).
- sxg 1 year ago
  
  The other part to this is that the internet accelerates network-effects, which you can further supercharge by making your product as cheap as possible or free to the former group in your example.
  It’s hard to make money by charging a lot to a small group of people since now you’re dealing with anti-network effects. Doubling the price of a product will likely more than halve your user base.
  
- numbsafari 1 year ago
  
  > This by default almost assumes a hostility towards the former group because their interests will of course be at odds with the interests of the latter group.
  I would generally agree that that's the "default".
  However, there are cases where two sides of a market need an intermediary with which they can both independently transact, and a net benefit of that interaction is felt on both sides. The key is to construct the solution such that the intermediary depends on the goodwill of both sides of the market.
  I think Kagi is somewhat flipping the script. By "taking" data from publishers for free, they are then selling it to readers at a cost. However, there is a trade off. Kagi needs to make sure publishers continue to make their content available so that it can be searchable, or used in their Assistant product. In order to do that, they need to do the opposite of what Google is doing by trying to sequester traffic on Google.com: Kagi's best interest is to make sure that they provide good value to both sides.
  Indeed, using the Assistant product, the way it is structured, I very often find myself clicking through to the referenced original sources and not just consuming the summarized content.
  How this evolves over time, from a product design standpoint, will be interesting to watch.
- ulrikrasmussen 1 year ago
  
  Kagi user here. I agree!
  The main driver of hostility to users is due to ad-based business models. I think we would see a much more healthy internet if we had regulation which prohibited companies from choosing ads based on any information associated with the user that the ad is shown to. That is, any data collected in the past and any data associated with the session and request must not be taken into account when choosing the ad; two requests by different users in different locations should have the exact same ad probability distributions.
  I know we are never getting this because it would kill or severely harm the business models of some of the most profitable businesses in the world.
- basch 1 year ago
  
  They would be a good steward of pinboard.in if it were for sale / recovery.
- brookst 1 year ago
  
  Direct monetization FTW. Charge people for value. Cultivate audiences willing to pay for value.
  Incentives aligned. Happy customers. Good businesses. Maybe you only get 60% gross margins, or, gasp, 40% gross margins. But so much less toxic.
freediver 1 year ago
> (I don't use Orion, as there's no GNU/Linux version.)
We commenced work on Orion for Linux yesterday.
- hurutparittya 1 year ago
  
  Any target date for open-sourcing it? :^)
- joshuaturner 1 year ago
  
  I remember the announcement for Orion but I haven't followed closely at all - any support for container proxies like in Firefox? Can't lose that feature
  
- WD-42 1 year ago
  
  Amazing!!!
Klaus23 1 year ago
The downside of this is that if you are not on a larger network, the IP address will probably deanonymise you. Kagi knows you are logged in, and if you open a private browsing window to do a spicy search, they could link the searches. Fast switching between modes is undesirable.
- aryonoco 1 year ago
  
  And that's why Kagi has simultaneously rolled out their service availability on tor: http://kagi2pv5bdcxxqla5itjzje2cgdccuwept5ub6patvmvn3qgmgjd6...
  Tor has its flaws and criticisms, but it's really not on Kagi to fix them. With the combination of tor and their privacy pass, Kagi has gone further in allowing their paid users access to their services than anyone else.
  Disclaimer: Not associated with Kagi in any way other than being a very happy user.
  
theschmed 1 year ago

FYI in case you’re not aware, they announced in a podcast near the end of 2024 that a Linux version of Orion is planned.
paradox460 1 year ago
With kagi you'll get used to them making the correct choice. It's been stunning how they haven't really had any missteps
I wish my kagi t-shirt could say the same. Bottom hem unraveled on the second wash, and so it's been consigned to the sleep and yard work shirts. They issued me a coupon for a free shirt as replacement, but it's yet to ship
- cootsnuck 1 year ago
  
  I think I can finally buy into the Kagi hype now that I've found a sincere negative opinion.
  
thibaultmol 1 year ago

yeah, same. I would only use privacy pass for incognito searches COUGH P0RN COUGH mainly (let's be honest). Feel free to submit the idea on kagifeedback.org

mhitza 1 year ago

The post hints at this, but having a shop where one can buy a privacy pass without an account makes sense.

Should support some crypto currency (probably monero), and something like GNU Taler if that technology ever becomes usable.

jacekm 1 year ago
Kagi accepts bitcoins but Vlad (the founder) mentioned on their forum that so few people use this option that it does not make sense to work on accepting Monero.
- freediver 1 year ago
  
  (vlad here) Rather, we are opportunistic about it and we want to focus on things that make impact (which most of the time is search, not billing). If there is enough demand, we will work on Monero support - and yes I agree, buying privacy pass tokens, without even needing an account, is one of those super-cool use cases.
  
- akimbostrawman 1 year ago
  
  Nobody wants to use BTC because of high fees, and at this point it's less a usable medium of exchange than a speculative asset. I personally would only ever use and trust an online service advertised as private/anonymous if it actually supported a private and anonymous currency (like some VPNs do).
  
- mhitza 1 year ago
  
  Kagi's privacy guarantee is more of a "trust me bro" and I say that as a Kagi subscriber. While they may claim that they preserve privacy or anonymity, as long as it's tied to a user account or payment information, nothing prevents them from associating searches with a user. Even protonmail enabled logging for a particular user at one point. Their guarantee is on the same level.
  At the same time, privacy pass is a very foreign concept to me. If they are transferable between devices, one could generate a couple and resell them over some other medium (even in person).
  
autoexec 1 year ago
I agree that third party stores selling tokens without any account at all would be the ideal solution, but without an account you'd be missing out on many of the features that make kagi worth using like being able to remove certain domains from results or prioritizing types of results over others.
- dsp_person 1 year ago
  
  Add the ability to export your account config (yaml?) and use it with privacy pass. Maybe even sync it with git.
  To avoid fingerprinting by config, have a page where the community can share and vote on best configs, then clone and use a popular one that suits your needs.
  

Eji1700 1 year ago

So....is this privacy through assumed lack of logging? Not trying to be a dick, just legit don't understand a part of this.

User A asks kagi for tokens. Kagi says "sure, here's 500 tokens". If kagi then logs the 500 tokens it just gave to user A, it now will know if any of those tokens is redeemed at a later date, that they're assigned to user A?

Of course if Kagi just doesn't retain this data, then yeah all is good because the token itself is only marked as valid, not valid and given to user A on date Y, but....that's it right? Or am I misunderstanding something?

readyplayeremma 1 year ago
The server does not generate the tokens, the client generates the tokens. The server is supposed to be able to verify that they were generated by a client who was granted the authority to generate them, but not which client did so. At least, not without side-channel information.
> The main building block of our construction is a verifiable oblivious pseudorandom function (VOPRF)
I am not sure how well tested that primitive is, but it definitely appears to be more than the server handing clients tokens and then pretending not to know who it gave them to.
The referenced paper: https://petsymposium.org/popets/2018/popets-2018-0026.pdf
- rajnathani 1 year ago
  
  What I find confusing is: Given that the server is essentially authorizing each subsequent client request (eg: For Kagi: search queries after a Kagi user has already been authenticated) in a way whereby the client is anonymous, what is the difference between Privacy Pass and simply providing a common authorization token to each user (and thus skipping all this cryptography)?
  Update: On some thought, the approach of the server providing a common authorization token gives the client no guarantee that the server is actually providing a common token rather than a unique identifier for each user. Thus, Privacy Pass's cryptography ensures that the client knows it is still anonymous.
  Update 2: But what guarantee exists that the server doesn't generate a unique public key (i.e. public-private key pair) for each user and thus defeat anonymity this way?
  Update 3: They use zero-knowledge proofs to prove that all tokens are signed by the same private key. From their paper: "The work of Jarecki et al. [18] uses a non-interactive zero-knowledge (NIZK) proof of discrete log equality (DLEQ) to provide verification of the OPRF result to the user. Their construction is hence a ‘verifiable’ OPRF or VOPRF and is proven secure in the random-oracle model. We adapt their construction slightly to use a ‘batch’ DLEQ proof allowing for much more efficient verification; in short this allows a user to verify a single NIZK proof that states that all of their tokens are signed by the same private key. This prevents the edge from using different key pairs for different users in an attempt to launch a deanonymization attack; we give more details in Section 3.2."
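The DLEQ idea quoted above can be sketched in a few lines of Python. This is a toy illustration only, not the paper's or Kagi's code: it proves a single pair rather than a batch, it works in a tiny multiplicative group modulo a safe prime (`P = 2039`, `Q = 1019`) instead of an elliptic curve, and the names `challenge`, `prove` and `verify` are invented for this example. The proof shows that the same secret `k` links the public key `G^k` and the evaluated token `m^k`, without revealing `k`.

```python
import hashlib
import secrets

# Toy parameters (assumption, NOT secure): P = 2Q + 1 with P and Q prime.
# G = 4 is a square mod P, so it generates the subgroup of order Q.
P, Q, G = 2039, 1019, 4


def challenge(*elements: int) -> int:
    """Fiat-Shamir challenge: hash the whole transcript down to a scalar mod Q."""
    data = b"|".join(str(e).encode() for e in elements)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(k: int, m: int, z: int) -> tuple[int, int]:
    """Server side: prove that z = m^k uses the same k as the public key G^k.

    m must lie in the order-Q subgroup (for example, any square mod P).
    """
    pk = pow(G, k, P)
    t = 1 + secrets.randbelow(Q - 1)        # one-time nonce
    a, b = pow(G, t, P), pow(m, t, P)       # commitments under both bases
    c = challenge(G, pk, m, z, a, b)
    s = (t - c * k) % Q
    return c, s


def verify(pk: int, m: int, z: int, proof: tuple[int, int]) -> bool:
    """Client side: rebuild the commitments and check the challenge matches."""
    c, s = proof
    a = pow(G, s, P) * pow(pk, c, P) % P    # equals G^t when the proof is honest
    b = pow(m, s, P) * pow(z, c, P) % P     # equals m^t only if z = m^k
    return c == challenge(G, pk, m, z, a, b)
```

If every client checks such a proof against one published `pk`, a server that signs with a per-user key gets caught. In this toy group a cheating proof is still accepted roughly once in `Q` attempts, which is why real deployments use groups with about 2^256 elements.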
- dave1010uk 1 year ago
  
  This is my understanding of how it works, without knowing the actual maths behind the functions:
      # Client
      r = random_blinding_factor()
      x = client_secret_input()
      x_blinded = blind(x, r)

      # Server
      y_blinded = OPRF(k, x_blinded)

      # Client
      y = unblind(y_blinded, r)
  So you end up with y = OPRF(k, x). But the server never saw x and the client never saw k.
  This feels like the same kind of unintuitive cryptography as homomorphic encryption.
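To make the blind / evaluate / unblind flow above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the actual protocol code: it uses a tiny multiplicative group modulo a safe prime (`P = 2039`, `Q = 1019`) where real deployments use elliptic-curve groups, it stops at the group element instead of hashing it into a final output, and the names `hash_to_group`, `blind`, `evaluate` and `unblind` are made up for this example.

```python
import hashlib
import secrets

# Toy parameters (assumption, NOT secure): P = 2Q + 1 with P and Q prime.
P, Q = 2039, 1019


def hash_to_group(data: bytes) -> int:
    """Map bytes to a non-identity element of the order-Q subgroup of squares mod P."""
    v = 2 + int.from_bytes(hashlib.sha256(data).digest(), "big") % (P - 3)
    return pow(v, 2, P)


def blind(x: bytes) -> tuple[int, int]:
    """Client: hide H(x) behind a random exponent r; only the first value is sent."""
    r = 1 + secrets.randbelow(Q - 1)
    return pow(hash_to_group(x), r, P), r


def evaluate(k: int, x_blinded: int) -> int:
    """Server: apply the secret key k to a value it cannot interpret."""
    return pow(x_blinded, k, P)


def unblind(y_blinded: int, r: int) -> int:
    """Client: strip r again, leaving H(x)^k."""
    return pow(y_blinded, pow(r, -1, Q), P)
```

Chaining `blind`, `evaluate` and `unblind` gives the same group element as raising `hash_to_group(x)` to `k` directly, even though the server only ever handled the blinded value and the client never handled `k`.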
- potamic 1 year ago
  
  Using a client provided by Kagi can be a side-channel then? Should we rather be using an independent, standard client?
ghayes 1 year ago
Privacy Pass docs [0] cover this, but it is mostly referenced deeper in the paper. I believe the idea is that the tokens returned by the server are "unlinkable" to the (modified) tokens passed back by the client. So the server knows it passed back tokens A, B and C to some users, and later receives tokens X, Y and Z. It knows that X, Y and Z are valid, but not their correspondence to the tokens it issued. It uses elliptic curve cryptography for this.
[0] https://privacypass.github.io/
- codethief 1 year ago
  
  After reading your comment I still didn't quite understand how the server couldn't just simply log the tokens A, B, C issued to user X. So I had a look at the website you linked: IIUC the key is that the tokens are actually generated by the user and the server never sees them (unblinded) before their first usage:
  > When an internet challenge is solved correctly by a user, Privacy Pass will generate a number of random nonces that will be used as tokens. These tokens will be cryptographically blinded and then sent to the challenge provider. If the solution is valid, the provider will sign the blinded tokens and return them to the client. Privacy Pass will unblind the tokens and store them for future use.
  > Privacy Pass will detect when an internet challenge is required in the future for the same provider. In these cases, an unblinded, signed token will be embedded into a privacy pass that will be sent to the challenge provider. The provider will verify the signature on the unblinded token, if this check passes the challenge will not be invoked.
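The issuance and redemption steps quoted above can be walked through with a toy Python sketch. Everything here is an assumption made for illustration: the tiny group (`P = 2039`, `Q = 1019`) stands in for an elliptic curve, and `Server`, `sign_blinded`, `redeem` and `request_tokens` are hypothetical names, not the API of any Privacy Pass library. The point it tries to show is that the server can log everything it sees at issuance and still only holds blinded values, while redemption works on nonces it has never seen before and each nonce can be spent once.

```python
import hashlib
import secrets

# Toy parameters (assumption, NOT secure): P = 2Q + 1 with P and Q prime.
P, Q = 2039, 1019


def hash_to_group(data: bytes) -> int:
    """Map bytes to a non-identity element of the order-Q subgroup of squares mod P."""
    v = 2 + int.from_bytes(hashlib.sha256(data).digest(), "big") % (P - 3)
    return pow(v, 2, P)


class Server:
    def __init__(self) -> None:
        self.k = 1 + secrets.randbelow(Q - 1)   # secret key, the same for every user
        self.issuance_log = []                  # (account, blinded value) pairs
        self.spent = set()                      # nonces that were already redeemed

    def sign_blinded(self, account: str, blinded: list[int]) -> list[int]:
        """Issuance: the server may log all it sees, which is only blinded values."""
        self.issuance_log += [(account, b) for b in blinded]
        return [pow(b, self.k, P) for b in blinded]

    def redeem(self, nonce: bytes, signature: int) -> bool:
        """Redemption: recompute H(nonce)^k and refuse nonces seen before."""
        if nonce in self.spent:
            return False
        if pow(hash_to_group(nonce), self.k, P) != signature:
            return False
        self.spent.add(nonce)
        return True


def request_tokens(server: Server, account: str, n: int) -> list[tuple[bytes, int]]:
    """Client: generate n nonces, blind them, get them signed, then unblind."""
    nonces = [secrets.token_bytes(16) for _ in range(n)]
    factors = [1 + secrets.randbelow(Q - 1) for _ in range(n)]
    blinded = [pow(hash_to_group(x), r, P) for x, r in zip(nonces, factors)]
    signed = server.sign_blinded(account, blinded)
    return [(x, pow(s, pow(r, -1, Q), P)) for x, s, r in zip(nonces, signed, factors)]
```

After `request_tokens(server, "user-a", 3)`, `server.issuance_log` holds three blinded group elements tied to the account, but none of the nonces. Because each blinding factor is random, a logged value is consistent with any nonce, so the log does not single out which token a later `redeem` call belongs to.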
  
- Eji1700 1 year ago
  
  Ah thank you. That's the part I was missing. I know this example is wrong in 100 different ways, but something like "yeah we know a key with prime factor X is valid, and this has one", but there's thousands of those out there, so it can't tie out to whom.
Sakos 1 year ago

The idea is that the tokens aren't linked to any account. They're anonymous.
adamtaylor_13 1 year ago

Yeah I think you don’t understand the premise behind privacy pass tokens.
The whole idea is that the server does not know WHICH client a token belongs to. It doesn’t generate the tokens.

outime 1 year ago

The biggest flaw I always saw in Kagi has now been addressed by this. Thank you for listening and working to make the product appealing to (almost) everyone!

MostlyStable 1 year ago

One of the biggest complaints about Kagi from people who have not yet adopted it is their privacy concerns around having to login and have payment information.

I'm not one of the people that has been concerned about that, but I'm curious to what extent this alleviates those concerns among those that have had them.

g-b-r 1 year ago
> I'm not one of the people that has been concerned about that, but I'm curious to what extent this alleviates those concerns among those that have had them.
I am, it's mind-blowing to me that anyone would login to a search engine (yes, I know how many do it, now).
After a brief verification of the system, I'm pretty sure I'll sign up, now
- ericrallen 1 year ago
  
  Logging in to a search engine weirded me out at first, but after about a week I was so pleased with the results that I’ve been happily paying for almost a year now.
  I honestly feel like any major free search engine is probably doing more to try to track you anyway.
  And if you’re going to search something you want to be anonymous, you can just like use another search engine. I honestly haven’t run into the situation where I needed to.
  I do worry that some day someone will be able to see how often I forget basic syntax for some JavaScript or Python method - or how often I can’t be bothered to type out a full domain and just search to navigate to it - but that’s a price I’m also willing to pay.
- mvieira38 1 year ago
  
  Most people are riding 24/7 with a Google session active, as it carries from Youtube/Chrome to Search. I don't think many realize it
- xigoi 1 year ago
  
  Why would you not want to login to a personalized service (unless you really need to be anonymous for some reason)?
faeranne 1 year ago
Assuming the cryptography does what they say it does (am not a cryptography expert, so I can't verify that part), this would completely disjoin a search request from any account info. The account generates several "search tokens", and for each search request, one of those tokens is spent. The tokens are generated on-device, and until spent, never leave the device, so in theory there's no way for Kagi to know which account generated the token just from the token alone. This doesn't fix fingerprinting or IP associations (though the plugin for Firefox and Chrome supposedly takes efforts to try and limit fingerprinting too), but this isn't any better/worse than simply using Google or Duckduckgo, and functions on Tor if you really want some privacy.
Again, not sure on how the tokens are proven legit without ever sharing them, but there's probably some ~~zero-knowledge proof~~ stuff going on that covers that.
Edit: Not zero-knowledge proof. Seems to be Blind Signature?
- sedatk 1 year ago
  
  > This doesn't fix fingerprinting or IP associations
  It solves the problem of using a paid service without compromising customer’s privacy which is a breakthrough. The rest are different problems and they are universal issues with various existing solutions as you already pointed out.
- sanbor 1 year ago
  
  Most of the time I have ProtonVPN in my phone and computer, which solves the IP association problem for me

cobertos 1 year ago

What's to stop someone on the Kagi side from just adding a new column to the token table that has the user (with their SessionCookie) who generated the token next to it? I don't see how this can't be trivially connected to the original token generator.

fvirdia 1 year ago
Implementor here. During the Privacy Pass "issuance" protocol, the client will generate a "message" that the server will process. The output from the server is returned to the client, which further modifies this output to produce the final tokens. The last client modification randomises these tokens in such a way that the server will be unable to identify to what issuance they belong.
The very cool thing is that this is the case even if the server tries to misbehave during their phase. This means that users only need to trust the client software, which we open sourced: https://github.com/kagisearch/privacypass-extension
Some posters are mentioning blind signatures, and indeed Privacy Pass can utilise these as a building block. To be precise, however, I should mention that for Kagi we use "Privately Verifiable Tokens" (https://www.rfc-editor.org/rfc/rfc9578.html#name-issuance-pr...) based on "oblivious pseudorandom functions" (OPRFs), which in my personal view are even cooler than blind signatures
- hansvm 1 year ago
  
  If you can get Kagi to agree to it, definitely write a blog post on their behalf, please.
  
perihelions 1 year ago
That's apparently explained in their citation [1], the paper about cryptographically anonymous token protocols. It's not a simple plaintext token.
https://news.ycombinator.com/item?id=19623110 ("Privacy Pass (cloudflare.com)", 53 comments)
- promiseofbeans 1 year ago
  
  Yep, in fact Cloudflare are the original people who came up with this, when people were complaining about seeing turnstile screens too often
ajayyy 1 year ago
The tokens are "generated" on the client, and the server just gives the client enough information to make that locally generated token become "valid", without being able to link that token to a specific validation attempt
- sebazzz 1 year ago
  
  So basically the server signs the token and afterwards the server can verify its own signature for every request with that token?
  
SomeoneOnTheWeb 1 year ago
Exactly the question I had in mind. You can't rely on server side trust so I'm curious if I just misunderstood something...
- thibaultmol 1 year ago
  
  I think the extension they're using being open source helps with this? because it can be checked in there? not sure
lxgr 1 year ago

I believe "Privacy Pass" uses blind signatures, so the token that the TokenResponse contains can't be correlated to the one provided in the search query, if I understand it correctly.

grg0 1 year ago

Does this actually work, though? The token can only be redeemed once, which means that, realistically, the client is going to be in a loop generating and redeeming tokens in a given search session, which makes the pairs trivial to correlate. The article even states it:

> For this reason, it is highly recommended to separate token generation and redemption in time, or “in space” (by using an anonymizing service such as Tor when redeeming tokens, see below).

Sure, Tor will randomize the space. But what about the time? I then went to "see below" and didn't see anything relevant. Or is the idea that, with sufficient request volume, clients mask each other in time?

Also, Tor will only randomize the space insofar as you keep re-establishing a session; the loop remains static for the duration of a session afaik. And re-establishing a session takes like 10 seconds. So is it really randomizing the space?

abound 1 year ago
> The token can only be redeemed once, which means that, realistically, the client is going to be in a loop generating and redeeming tokens in a given search session, which makes the pairs trivial to correlate.
One token request can produce N tokens. We have it configured where N = 500, so most users will be requesting more tokens fairly infrequently.
- grg0 1 year ago
  
  Thanks for the clarification.

tonygiorgio 1 year ago

This is sick, fantastic work.

I have built blind signature authentication stuff before (similar to privacy pass) and one thing I’m curious about is how you (will) handle multi device access?

I understand you probably launched with only unlimited search users in order to mitigate the same user losing access to their tokens on a different device. But any ideas for long term plans here? When I built these systems in the past, I always had to couple it with E2EE sync. Not only can that be a pain for end users, but you can also start to correlate storage updates with blind search requests.

Either case, this is amazing and I’m gonna be even more excited to not just trust Kagi, but verify that I don’t need to trust y’all. Congrats.

fvirdia 1 year ago

Yes, multi-device is definitely not easy. We've played with a few ideas, but it is definitely not a question with an obvious answer. For now, our rate-limiting allows you to use Privacy Pass on a few different devices by having each generate tokens independently. We will see how this goes and listen to user feedback before going back to the drawing board.

Ajedi32 1 year ago

Is this the same Privacy Pass that Cloudflare was using to allow clients to bypass CAPTCHAs? If so, this is a really neat application of that system; it never occurred to me that it could be used to anonymously authenticate to a paid service.

RupertWiser 1 year ago

The cryptography privacy pass is based off [1] actually comes from Ecash[2] so we’ve gone full circle.
[1] https://www.petsymposium.org/2018/files/papers/issue3/popets... [2] https://en.m.wikipedia.org/wiki/Ecash
jeroenhd 1 year ago

Yes, though Cloudflare has ended their privacy pass trial as far as I know.
I remember Safari as the only browser that implemented it natively, but I guess Orion has it now too.

noident 1 year ago

I'm not affiliated with the Tor Project organization, but I have some questions.

From Tor docs [0]:

> Add-ons, extensions, and plugins are components that can be added to web browsers to give them new features. Tor Browser comes with one add-on installed: NoScript. You should not install any additional add-ons on Tor Browser because that can compromise some of its privacy features.

How does Kagi square this with Privacy Pass, which requires a browser extension rejected by Tor [1]? Did Kagi analyze whether it is possible to bucket users of Tor into two distinct groups depending on whether the extension is installed? Do I need to trust another organization other than the Tor project to keep the signing keys for the extension safe? Was there any outreach to the Tor community at all prior to releasing this feature?

It's great that they're Torifying the service, but depending on a 3rd party extension is not ideal.

[0] https://support.torproject.org/glossary/add-on-extension-or-...

[1] https://gitlab.torproject.org/tpo/applications/tor-browser/-...

noident 1 year ago
I sat down on my desktop to take a closer look at how Kagi implemented this. It turns out that the privacy pass extension isn't the one implemented by CloudFlare (and rejected by Tor), but a new extension called Kagi Privacy Pass.
Ok, let's look at the source.
    curl -L https://addons.mozilla.org/firefox/downloads/file/4436183/kagi_privacy_pass-1.0.2.xpi > /tmp/extension.xpi
    unzip /tmp/extension.xpi -d /tmp/extension
    cd /tmp/extension
Alright, here's some nice, clean, easy-to-read Javascript. Nice! Wait, what's that?
    // ./scripts/privacypass.js
    /*
     * Privacy Pass protocol implementation
     */
    import init, * as kagippjs from "./kagippjs/kagippjs.js";
    ...
    // load WASM for Privacy Pass core library
    await init();
I opened ./kagippjs/kagippjs.js and was, of course, greeted with a WASM binary.
I personally would not install unknown WASM blobs in Tor browser. Source and reproducible build, please!
Let's continue.
    // get WWW-Authenticate HTTP header value
    let origin_wwwa_value = "";
    const endpoint = onion ? ONION_WWWA_ENDPOINT : WWWA_ENDPOINT;
    try {
      const resp = await fetch(endpoint, { method: "GET", headers: { 'X-Kagi-PrivacyPass-Client': 'true' } });
      origin_wwwa_value = resp.headers.get("WWW-Authenticate");
    } catch (ex) {
      if (onion) {
        // this will signal that WWWA could not fetch via .onion
        // the extension will then try normally.
        // if the failure is due to not being on Tor, this is the right path
        // if the failure is due to being on Tor but offline, then trying to fetch from kagi.com
        // won't deanonymise anyway, and will result in the "are you online?" error message, also the right path
        return origin_wwwa_value;
      }
      throw FETCH_FAILED_ERROR;
    }
What?? If the Onion isn't reachable, you make a request to the clearnet site? That will, in fact, deanonymize you (although I don't know if Tor browser will Torify `fetch` calls made in extensions). You don't want Tor browser making clearnet requests just because it couldn't reach the .onion! What if the request times out while it's bouncing between the 6 relays in the onion circuit? Happens all the time.
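For contrast, the "fail closed" behaviour argued for here could look roughly like the Python sketch below. It is a hypothetical illustration, not a patch for the extension: the endpoint values are placeholders, `get_challenge` and `OnionUnreachable` are invented names, and the `onion_only` flag assumes the caller already knows the user wants onion-only traffic, which the real extension may not be able to tell.

```python
from typing import Callable

# Placeholder values (assumptions); only the constant names echo the snippet above.
ONION_WWWA_ENDPOINT = "http://example.onion/privacypass"
WWWA_ENDPOINT = "https://example.com/privacypass"


class OnionUnreachable(Exception):
    """Raised instead of silently retrying over clearnet."""


def get_challenge(fetch: Callable[[str], str], onion_only: bool) -> str:
    """Fetch the challenge header value; never leave Tor when onion_only is set."""
    if not onion_only:
        return fetch(WWWA_ENDPOINT)
    try:
        return fetch(ONION_WWWA_ENDPOINT)
    except OSError as exc:  # covers timeouts and connection failures
        # Surface the failure to the user rather than trying the clearnet endpoint.
        raise OnionUnreachable("onion endpoint unreachable; not falling back") from exc
```

The trade-off is usability: a circuit timeout becomes a visible error instead of a transparent retry, and the user decides whether a clearnet request is acceptable.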
- abound 1 year ago
  
  [I work at Kagi]
  The extension is open-source [1], including the Rust code that produces the WASM [2]. You should be able to produce a bit-compatible binary from these repos, and if not, please file a bug!
  [1] https://github.com/kagisearch/privacypass-extension
  [2] https://github.com/kagisearch/privacypass-lib/
  
JumpCrisscross 1 year ago
> Was there any outreach to the Tor community at all prior to releasing this feature?
Do we know what fraction of Kagi users access it through Tor?
- noident 1 year ago
  
  It must be a small fraction since they released their Tor onion service 3 hours ago in the original linked article :)
  

esafak 1 year ago

I love this company and product. I noticed another great feature today: the ability to filter AI slop in image search! It's the right-most filter: "AI Images".

Klaus23 1 year ago

If account settings are not possible because you could fingerprint users, then client-side filtering or reordering might be a solution.

Safe-search or not, just transfer both result lists and make the client only show the one you want. The same could be done with languages, where you at least get the results for the bigger ones. Blacklists would hide your blocked crap sites. It may even be possible to implement the ranking adjustments to some extent.

Client-side filtering would put more load on the server and search sources, but I hope the cost increase is tolerable. Blacklisting and reordering could be virtually free. This could make Privacy Pass available to many more users who don't have overly complex account rules.
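A rough Python sketch of what such client-side personalisation could look like. The data model is an assumption: `Result`, its `score` and `safe` fields, and the `personalize` function are hypothetical and do not reflect Kagi's result format. It is also a slight variation on the idea above, using one unfiltered list with a per-result `safe` flag rather than two separate lists. The server sends the same response to everyone, and the blocklist, domain boosts and safe-search preference never leave the client.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Result:
    url: str
    score: float   # server-side relevance, identical for every user
    safe: bool     # whether the result would pass safe-search


def personalize(results, blocked=frozenset(), boosts=None, safe_search=True):
    """Apply local preferences to an unpersonalised result list."""
    boosts = boosts or {}
    kept = []
    for r in results:
        domain = urlparse(r.url).hostname or ""
        if domain in blocked:
            continue                      # locally blocked site
        if safe_search and not r.safe:
            continue                      # hidden by the local safe-search setting
        kept.append((r.score * boosts.get(domain, 1.0), r))
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in kept]
```

For example, with `blocked={"spam.example"}` and `boosts={"docs.example": 2.0}`, a `docs.example` result scored 0.5 would be ranked above a `blog.example` result scored 0.6, and nothing from `spam.example` would be shown.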

xzjis 1 year ago

That's really smart. You should suggest it to Kagi directly: https://kagifeedback.org/

echoangle 1 year ago

I don’t really understand how the protocol can ensure that the server can’t identify the client.

As far as I understand, the client sends some information A to the server, the server applies some private key X and returns the output B to the client, which then generates tokens C from the output.

If the server uses a different X for every user and then when verifying just checks the X of every user to see which one is valid, couldn’t the server know who created the token?

jerf 1 year ago
Here's a resource I found that walks through the ideas of the protocol, starting with simple implementations that have a problem, and then solving the problem one by one: https://privacypass.github.io/protocol/
I think that's the best conceptual overview of a crypto protocol I've ever seen.
- dan353hehe 1 year ago
  
  That is an excellent explanation of how the protocol works. Thank you for bringing it to the discussion!
- kayson 1 year ago
  
  I really love this style of explanation. There was another one I saw recently (OIDC, I think?) that I wish I'd bookmarked but I forgot to
stebalien 1 year ago
See section 5.5 of the linked paper https://petsymposium.org/popets/2018/popets-2018-0026.php. I'm not sure if/how Kagi implemented this, but the idea is that Kagi's "public" component can be committed to publicly (e.g., in the browser extension itself).
- abound 1 year ago
  
  [I implemented this at Kagi]
  And you can validate this: if you try to issue a Privacy Pass search without a private token, you'll get a `WWW-Authenticate` header that kicks off the handshake, and that should be the same for all users for a given epoch (month). E.g.:
  curl -v -H 'X-Kagi-PrivacyPass-Client: true' 'https://kagi.com/search?q=test'
  
- echoangle 1 year ago
  
  Thanks for looking it up, that makes sense.
wasabi991011 1 year ago
In the simplest terms, the token generation process B->C is done with the user's private key. So even if the server knows A,X,B they can't link it to the token C.
- echoangle 1 year ago
  
  But if the server is allowed to vary X, it can basically act like different servers to each client, and can then when given a token check for which server would have been valid. The solution I got from the other replies is to make sure that the server uses the same X for everyone by verifying it as a client.
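The A, X, B, C flow discussed in this subthread can be sketched with a toy blinded exponentiation. This is an illustration only: it assumes a tiny group (squares modulo the safe prime 2039), which is far too small to be secure, and it leaves out the proof that the server used its published key. All function names are made up for the sketch and it is not Kagi's or the draft's actual construction.

```python
import hashlib
import secrets

# Toy group: the squares modulo the safe prime P = 2*Q + 1 form a subgroup of
# prime order Q. Real deployments use a large elliptic-curve group instead.
P = 2039
Q = 1019


def hash_to_group(token: bytes) -> int:
    """Map a token to a non-identity element of the order-Q subgroup."""
    counter = 0
    while True:
        digest = hashlib.sha256(token + counter.to_bytes(4, "big")).digest()
        element = pow(int.from_bytes(digest, "big") % P, 2, P)
        if element not in (0, 1):
            return element
        counter += 1


def client_blind(token: bytes):
    """Client: hide H(token) behind a random exponent r; A is sent to the server."""
    r = secrets.randbelow(Q - 1) + 1  # r in 1..Q-1
    return r, pow(hash_to_group(token), r, P)


def server_sign(blinded: int, x: int) -> int:
    """Server: apply the secret key X to the blinded value, giving B = A^x."""
    return pow(blinded, x, P)


def client_unblind(signed: int, r: int) -> int:
    """Client: strip r, giving C = H(token)^x (needs Python 3.8+ for pow(r, -1, Q))."""
    return pow(signed, pow(r, -1, Q), P)


def server_verify(token: bytes, c: int, x: int) -> bool:
    """Server at redemption: recompute H(token)^x and compare."""
    return pow(hash_to_group(token), x, P) == c


def link_by_key(token: bytes, c: int, per_user_keys: dict) -> list:
    """The attack raised above: one key per user lets the server identify the issuer."""
    return [user for user, x in per_user_keys.items() if server_verify(token, c, x)]
```

The server only ever sees A during issuance and (token, C) during redemption, and the random r means A carries no usable link to the token. `link_by_key` shows why that is not enough on its own: if the key varies per user, redemption reveals who was issued the token, so clients need a way to check that everyone gets the same key.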
  

AutistiCoder 1 year ago

Trying to understand Privacy Pass here.

My understanding is, it's analogous to writing a note to your manager.

That note is a random number written in ink your manager can't actually read; all they can do with that note is sign it. They ask God (used here to represent math itself) how to sign this note, and God gives them a unique signature that also theoretically cannot be used to calculate the number that's written. This signature also proves what you're authorized to do. And then your manager hands the note back to you.

The note's sole function past that point is so you can point to the signature thereon and say "this signature proves I can do this, that, etc."

ripped_britches 1 year ago
> They ask God (used here to represent math itself)
Thank you so much, I am 100% stealing this
- AutistiCoder 1 year ago
  
  I mean, invoking God kind of works for describing the idea that “math” sometimes knows a secret and can tell someone how to act on that secret without telling them the secret.

endorphine 1 year ago

Will the extension eventually be made available for Firefox on Android? Right now the Firefox extension link says that it's not compatible.

P.S.: I don't use the Kagi app on Android.

thibaultmol 1 year ago

Not yet, but I think they probably could. Request it on https://kagifeedback.org/
freediver 1 year ago
Yes, should happen soon.
- privacyking 1 year ago
  
  How soon?

eatyourglory 1 year ago

I have been a Kagi subscriber for a while now, but this new addition finally convinced me to start using Kagi in incognito mode! Thank you very much for adding this!

nvarsj 1 year ago

I still cannot get iOS to reliably use Kagi as my default search engine. I've tried the extension, etc. but nothing works reliably.

It's madness - how is it market fairness when iOS literally forces you to use Google? I know Google is paying Apple to do exactly that, but it's so beyond anti-consumer I can't believe it.

Sakos 1 year ago

Meanwhile, Google gets hit with the anti-competitive judgment and Apple gets off by way of being more anti-competitive. Wild, isn't it?
TingPing 1 year ago

Anecdotally I’ve had zero issues with the Safari extension.

ulrikrasmussen 1 year ago

That's a cool idea! Seeing the screenshot I almost immediately figured this would be related to Chaum's digital cash and blind signatures, and it seems to be cited in the linked paper. I had thought of using blind signatures for anonymous authorization, but I was not aware that there was an actual design for that application.

I think government issued digital identities should also use this.

bpev 1 year ago

Amazing! I used kagi at the very beginning, but my biggest concern was always this (that logging in is inherently less private than using something like duckduckgo, so I'm just forced to trust Kagi's will). This is the kind of thing that will force me to take a second look again.

eudhxhdhsb32 1 year ago

This is exactly what I've been waiting for to try Kagi.

I want better search results and am willing to pay for them, but not at the cost of linking all my searches to my identity.

Also happy to see they're adding tor support.

I feel like I might hit the default limit of 2000 searches per month, but it's not far off.

pavon 1 year ago

This is very cool. I'm curious about why there is a limit on the number of tokens generated per month, when this is only currently offered to unlimited accounts. Since the tokens all expire at the end of the month, tokens can't be hoarded to use Kagi after a subscription ends. Perhaps it is instead a resource issue where token generation is expensive. In that case though, I would think limiting tokens/day would be more appropriate - there is already going to be a spike to generate new tokens on the first of the month, so if the server can handle that, it can handle some users generating a batch of tokens each day.

This is not intended as criticism, just inquisitive.

abound 1 year ago

[I worked on building this at Kagi]
Since we have no idea who is issuing search requests in Privacy Pass mode, if there were no limits on token issuance, you could simply generate infinite tokens and give them out (or use them as part of some downstream service), and we'd have no other recourse for rate-limiting to prevent abuse.
Setting a high, but reasonable limit on issuance helps prevent abuse, and if you run out of tokens, you can reach out to support@kagi.com and we'll reset your quota.
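A rough sketch of where the limit sits, assuming the quota is counted per account at issuance time while redemption only checks the epoch and a spent-token set. The class, its method names, and the use of 2000 (the default figure quoted elsewhere in this thread) are illustrative assumptions, not Kagi's implementation.

```python
from collections import defaultdict

MONTHLY_LIMIT = 2000  # figure quoted by a commenter in this thread; an assumption here


class Issuer:
    def __init__(self, limit=MONTHLY_LIMIT):
        self.limit = limit
        self.issued = defaultdict(int)  # (account, epoch) -> tokens issued
        self.spent = set()              # redeemed tokens, with no account attached

    def issue(self, account, epoch, count):
        """Issuance is authenticated, so this is the only place a quota can apply."""
        key = (account, epoch)
        if self.issued[key] + count > self.limit:
            raise PermissionError("issuance quota exceeded for this epoch")
        self.issued[key] += count
        return count

    def redeem(self, token, token_epoch, current_epoch):
        """Redemption is anonymous: only expiry and double-spending are checked."""
        if token_epoch != current_epoch or token in self.spent:
            return False
        self.spent.add(token)
        return True

    def reset_quota(self, account, epoch):
        """What a support request to reset the quota would amount to."""
        self.issued[(account, epoch)] = 0
```

Nothing in `redeem` refers to an account, which is why the abuse limit has to live in `issue`.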
mortar 1 year ago
The reason they give in their docs is to “prevent abuse” (https://help.kagi.com/kagi/privacy/privacy-pass.html).
It feels like they picked a number no user should hit, while keeping it low enough to not pass Kagi out “free” to all their friends.
- pavon 1 year ago
  
  Ah, that makes sense. It would be harder to detect sharing with this system than with account sharing. My thoughts went in a completely different direction when I read "abuse" the first time.

daft_pink 1 year ago

Hope they can enable this in Safari so that I can use iCloud Private Relay with it.

ThePowerOfFuet 1 year ago
>Hope they can enable this in Safari so that I can use iCloud Private Relay with it.
What are you hoping to gain with that?
- daft_pink 1 year ago
  
  Being able to use this feature while hiding my IP while I browse. It’s not that I want to hide my IP from Kagi; it’s more that it’s not convenient to use Kagi for search in Chrome while browsing in Safari.

perdomon 1 year ago

Can someone make a case for Kagi? I'm using Google + Claude for all my websearch needs. I don't feel like there's a gap there, but maybe that's because I've never experienced anything better and can't imagine it?

I do value privacy, but I wouldn't pay extra for more private search results. I might pay extra for __better__ search results, but that's hard to measure.

Just curious if anyone has had a legitimately great experience with this product and can communicate its benefits. Bonus points if you're in software dev.

mayneack 1 year ago
For me the killer Kagi feature is that I can manually up- and down-weight domains in results. I can "pin" and always get Wikipedia results or "block" and never get Pinterest, or raise or lower as needed. Brave search has a similar feature, but they only seem to support "block" not "lower", which is what I use a lot more often.
https://kagi.com/stats?stat=leaderboard
- perdomon 1 year ago
  
  This is the exact sort of QOL feature that could convert me. I get tired of seeing the same recycled AI listicles for certain search genres, and it sounds like Kagi could help me manage that for my preferences.
- stephen_cagle 1 year ago
  
  Ditto; for me it's that I can set the damn GeeksforGeeks and Facebook to "lower" and raise the official Python docs, since that is what I want most of the time.
esafak 1 year ago
If you don't immediately notice the difference between Kagi's and Google's results, Kagi is not for you.
- trvr 1 year ago
  
  This is so true. I've been on Kagi since March 2024. On the occasion that I find myself on Google on someone else's computer, I find it completely unusable from a search result perspective. It's all just junk. Kagi has me as a paying customer forever.
  
furyofantares 1 year ago
If you can't imagine a gap then I think it might not be for you.
I tried kagi after finally getting sick of some of my google results. Kagi was able to deliver on some of those results.
It's not like, shockingly better results. I do think they're better on average, but I'm not sure.
However, in the cases where I couldn't find what I was looking for on google and could on kagi, well, that's a binary result. I'll take the success and not the failure.
I was surprised by how much better I found the UI. That's actually the thing that sold me on the subscription to begin with. Going in, I would not have expected UI to sell me on such a thing.
I have since customized it somewhat; there are sites I usually really like results from and they are upranked, and sites I don't care for which are downranked. I've felt like this has led to an even better experience, but I haven't gone back to Google to compare.
- perdomon 1 year ago
  
  Very interesting experience. Thanks for sharing. I’m a sucker for good UX/UI, and it sounds like the product is legitimately useful long-term. I wonder how well it integrates into Firefox/Chrome, since I see extensions for both browsers. I’ll check it out!
cube2222 1 year ago
Other than the things others mentioned, with the ultimate plan you can use Claude with Kagi web-search, which is not something that the Claude web app supports, I believe.
- perdomon 1 year ago
  
  That’s interesting — Claude WITH web search. Like Claude makes suggestions about search terms? Or maybe helps to navigate the results? I’m struggling to picture how google search is a 2-“person” job
  
mulderc 1 year ago

For me, Kagi felt like a godsend as my Google searches had become polluted, and I had to dig way deeper in the search results to get anything decent. They also have an AI assistant that gives you access to all the major AI models and integrates with their search in a way that I have found very useful.
I would recommend at least giving it a try to see if you notice a difference. For my job, the monthly fee more than pays for itself
cmehdy 1 year ago
Auto filter for sources, downrank sources you dislike, sort results by recency, have an engine that actually respects what country or language you're trying to search into, and finally present results visually the way you want them. It's worth trying to use it actively for a month or so, and you'll see if you need it or not. I would not go back to Google even if Google paid me.
- hedora 1 year ago
  
  I recently ran a search on Google. There were zero results on the first page. It was 100% ads. (15” MacBook).
  I tried a different search on iPhone to be sure. The first result was on the 3rd screen.
  When did they start doing that? How do people use that crap?
  
jabroni_salad 1 year ago
Here is a case for 'not google':
https://i.imgur.com/PQNm1Yc.png
I want a search engine to return useful results. Right now, google has been captured by revenue generating results. It wouldn't be so bad if useful results were making money, but that doesn't seem to be the case.
- perdomon 1 year ago
  
  Yeah, those results are ass. Do you happen to have the Kagi equivalent of the search? I’m curious to see how they compare.
  
mortar 1 year ago

I’m not insinuating for even a second that Kagi actually do this, but as a general rule, isn’t any privacy claim dubious at the moment given that more and more governments appear to be able to compel companies to identify their users (especially those searching for illegal content) and further forcefully insist they not disclose it?

It’s disheartening to think the great progress we’re making in this sector could be undermined in a few seconds, against any company's efforts, with a trivial backdoor.

__MatrixMan__ 1 year ago

It depends on how hard those companies work beforehand to prevent themselves from being able to comply with such requests. Signal is a good example of this; Kagi seems to be on board also.
I haven't looked closely enough at this token thingy Kagi is doing but it seems on the surface like it might scratch the itch by letting them decouple the accepting-payment part of their service from the providing-results part such that they know that you've paid, but not which payer you are.
sedatk 1 year ago
Government's power over companies does not negate cryptographic privacy protections. For example, one criminal who used ProtonMail got caught because ProtonMail handed over their recovery GMail address to the law enforcement after they were compelled[1]. However, that means end-to-end encryption worked: that was the only thing they could hand over. I think the same principle applies here.
[1] https://www.techradar.com/computing/cyber-security/proton-ma...
- autoexec 1 year ago
  
  The government forces companies to backdoor their systems and use compromised implementations of what would otherwise be private and secure systems (see for example https://en.wikipedia.org/wiki/Lavabit). It's also worth noting that the only thing preventing your searches from being linked to your account via IP address and browser fingerprinting is Tor, which conveniently will not protect you from the US government either. Account settings can also link a person's searches to their account.
  The good news is that while the NSA will absolutely be tracking everything you search for while using Kagi they also do the exact same thing with every other search engine you use so what difference does it make.
  
alexwebb2 1 year ago
I think the idea here is that it literally can't be traced to the user – at no point is there anything passed that would allow Kagi to make the association between the user and the query.
- mortar 1 year ago
  
  Thanks, yes completely agree! I guess the part I’m concerned with is the political side, whereby they could potentially be compelled to change the method slightly after the fact and forced to slip something into a quite technical process, making identification possible.
  I’d love to assume this will never happen, I’m just concerned that even if it did I’d never find out - Because unfortunately the more popular this service gets for bad actors, the more of a target it becomes for the government with identification of users.
  I guess as a search engine, we could assume the government may leave them well alone and still just focus on content creators.
  
- mortar 1 year ago
  
  I see this now, thanks for the clarity!
echoangle 1 year ago
Isn’t the whole point that this method is secure by design so even if they wanted, they couldn’t track you?
Or are you saying the method is designed to look secure but there’s an intentional weakness that makes tracking possible?
- mortar 1 year ago
  
  Definitely suggesting the method is secure, assuming the company does all the things they’ll say they do, which I also agree they’ll do. I’m just concerned the government can destroy this all, just by compelling them not to, and change a well intentioned method at any moment.
  
ransom_rs 1 year ago

If the system is implemented correctly then Kagi cryptographically can't link a particular search to a particular user.
lukev 1 year ago

XKCD #538 strikes again, and definitely extends to forcing people to lie about algorithms and possible backdoors.
I don't think, however, that this means we need to give up on crypto entirely. Just... be aware of the threat model for what you're encrypting.

godelski 1 year ago

This seems cool, but I still think the pricing of kagi is rather steep. It is $5/mo for 300 searches a month, which is really going to get you under 10 a day... That's insufficient. Then $10/mo (or $108/yr) for unlimited.

I'm curious if anyone knows, are companies like Google and Microsoft making more than $10/mo/user? We often talk about paying with our data, but it is always unclear how much that data is worth. Kagi does include some numbers, over here[0], but they seem a tad suspicious. The claim is Google makes $23/mo/user, and this would make their service a good value, but the calculation of $76bn US ad revenue (2023) and $277 per user annually gives 274m users. It's close to 80% of the US population, but I thought Google search was about 90% of global. And I doubt that all ad revenue is coming from search. Does anyone know the real numbers? Googling I get inconsistent answers and also answers based on different conditions and aggregations. But what we'd be interested here is purely in Google /search/ and not anything else.

[0] https://help.kagi.com/kagi/why-kagi/why-pay-for-search.html
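The figures in this comment can be checked with a few lines of arithmetic. The revenue and per-user numbers are the ones quoted above; the US population value is a rough assumption added only to reproduce the "close to 80%" remark.

```python
US_AD_REVENUE = 76e9         # USD, 2023, as quoted in the comment
REVENUE_PER_USER_YEAR = 277  # USD per user per year, as quoted
US_POPULATION = 335e6        # assumption: rough US population, not from the thread

implied_users = US_AD_REVENUE / REVENUE_PER_USER_YEAR
per_user_month = REVENUE_PER_USER_YEAR / 12
population_share = implied_users / US_POPULATION

print(f"implied users: {implied_users / 1e6:.0f}m")
print(f"revenue per user per month: ${per_user_month:.2f}")
print(f"share of assumed US population: {population_share:.0%}")
```

So the $23/month figure is just $277 divided by 12, and the 274m users follow from dividing total US ad revenue by that per-user number. Neither step separates search ads from other ad revenue, which is the objection raised in the comment.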

MyOutfitIsVague 1 year ago
I don't know nor do I really care what other search companies are making. I pay $10/month for Kagi because it works for me and it's good. I don't even care about Kagi as a company (I don't care about any company); their search works. It's a good product, and I'm happy to keep paying for it as long as it keeps being useful while all the free competitors are still terrible. I use about 2k searches per month.
edit: Even just the ability to rank, pin, and block domains alone is crazy useful. I never need to see Pinterest in any image search results again. If I see a crappy blog spam site, I just block it and it never shows up again. It feels like these are basic, fundamental features that every search engine should have had a long time ago. It's pretty sad that Kagi is getting so much praise for doing things that really should have been standard for at least a decade (not sad in any negative way toward Kagi, but because our standards and expectations for search have dropped this low).
- frereubu 1 year ago
  
  So funny that blocking Pinterest comes up in all discussions on Kagi (I've mentioned it myself in the past). I almost think people might pay $1 a month just to block Pinterest.
  
- godelski 1 year ago
  
  Honestly, this was a better ad for Kagi than what I got from the site. I'll actually check it out. Thanks
redserk 1 year ago
$10 felt a bit steep until I realized there are probably economies of scale at play here.
1) There is a marginal payment overhead. I'd assume $0.50-0.75, leaving their amount down to $9-ish.
2) It's a fairly niche product with a still-small userbase. ~40k users at ~$9/mo = $360k/mo (I know there's $5/mo users and $25/mo users but I'd assume there are far more $5/mo and $10/mo users than $25/mo users)
3) They have to keep the service running 24/7/365, so you have to hire devs either across multiple time-zones or compensate them enough to be OK fighting fires at 2am.
- autoexec 1 year ago
  
  As the user of a service things like payment overhead, a small userbase, and dev salaries aren't my problem. My only concern is what I'm getting for what I'm paying.
  $5 a month for fewer than 10 searches a day is clearly not a good deal. $10 a month might be worth it for some, but an extra $15 a month on top of that for AI results is kind of crazy.
  
hedora 1 year ago

If you can pay $10/month for a better search experience, then Google's making way more than that much off your data.
Kagi saves me much more than $10 of time every month. I definitely don't regret the subscription cost. Their LLM thing (append "?" to your internet search query) is worth more than that on its own.
dcow 1 year ago

I’ve been paying for Kagi for a long time through all their pricing model changes and updates. I have never once hit the search limit. I know they base their tiers on market research of search volume balanced against cost of serving a query. If you’re looking for reasons not to pay for search, you’ll find them. But the pricing model is hardly one. If you want an amazing and respectful search experience, and want to back a company that’s truly doing right by users and innovating at the same time, give Kagi a try!
atonse 1 year ago

I support them ($10/mo) because they do a good job and I figured, if I pay, then the likelihood of them using sketchy ways of making money is reduced.
daft_pink 1 year ago

The reason why it's worth it is because its search works really well. I've tried DuckDuckGo, Bing and always subconsciously ended up back at Google. This is the only search service I've used that works better than Google search and I think it's a combination of them not putting ads on the search and the way they let you tweak the search to block poor quality sites. How much it costs them or how much google profits vs your payment is not really relevant to me. It's the best working search engine in my opinion.
sedatk 1 year ago

Kagi Ultimate plan ($25/mo) includes Kagi Assistant with more than 15 different models (including Claude 3.5 Sonnet, Gemini 2.0, ChatGPT 4o + o3 mini, DeepSeek R1 etc). That plan suddenly becomes the cheapest, IMHO. I know that paid versions of LLM services offer more advanced models, but you at least get ahead of the rate limits this way.
HanClinto 1 year ago

I subscribe to an unlimited family plan. When considering how much cleaner my web experience is, it's a no-brainer. Default search engine on all our phones and devices.
They're my portal to the web. It's less like an optional web service (like a streaming service), and it feels more like I'm paying for them to be my ISP.
RussianCow 1 year ago

I pay $10/month just to have search results that aren't littered with SEO spam. The time savings alone make it totally worth it for me. Everything else is a giant bonus.
thoughtpalette 1 year ago
FWIW I signed up about 4 months ago on the starter plan and I'm definitely going to run over. I could be smarter about my searches though. I've switched to Kagi on ALL of my devices, including work devices. And I could have switched to using Google for most gifts/maps stuff instead.
some anecdotal data:
11/2024: 183 searches
12/2024: 360
1/2025: 376
2/2025: already at 222
Will definitely (happily) have to upgrade to the $10 plan. It's been great.
- asukachikaru 1 year ago
  
  I subscribed to kagi's $5/m plan since last March, and my usage until now is around 3.3k searches, with the monthly distribution similar to yours. Some months it's more, some months it's fewer.
  Currently I'm debating with myself if I should go for the $10 plan. I'm all down for supporting kagi, but surprisingly I didn't use as many searches as I thought.
  
red_hare 1 year ago

You can't expect world percentages to match US percentages. The US is only 5% of the world's population and has a very different relationship to search. Also, only 63% of the world is online, so what does "90% of global" even mean?
Back-of-the-envelope:
- 2tn searches per year.
- US is 20% of all searches.
- US revenue is $76bn
$76bn / (2tn * 0.2) = $0.19 / search
So, getting 300 searches for less than $0.02 per search sounds like a pretty good deal.
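The back-of-the-envelope estimate above, written out so the inputs can be swapped. The search volume and US share are the comment's own rough figures, and the Kagi price is the $5 for 300 searches Starter plan discussed in this thread.

```python
SEARCHES_PER_YEAR = 2e12  # global searches per year, rough figure from the comment
US_SHARE = 0.20           # fraction of searches from the US, from the comment
US_REVENUE = 76e9         # USD, from the comment

STARTER_PRICE = 5.0       # USD per month
STARTER_SEARCHES = 300    # searches per month

google_revenue_per_search = US_REVENUE / (SEARCHES_PER_YEAR * US_SHARE)
kagi_price_per_search = STARTER_PRICE / STARTER_SEARCHES

print(f"Google revenue per US search: ${google_revenue_per_search:.2f}")
print(f"Kagi Starter price per search: ${kagi_price_per_search:.4f}")
```

Under these inputs the Starter plan comes out around an order of magnitude cheaper per search than the estimated ad revenue per search. The result is only as good as the 2tn and 20% guesses.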
freedomben 1 year ago

I thought this too but at this point I've been subscribed well over a year. On a typical workday I might use 20+ searches, but I frequently use little to no searches on weekends and holidays, etc. Ultimately I end up using right around 300 per month (averaged out across the year), so I think their pricing isn't as wild as it initially looks.
flkiwi 1 year ago

It's just about the best Internet-related money I spend. I get fast, quality results on a service that doesn't obviously bend over backwards to monetize me. Ironic, in a way. I thought it was spendy at first, and now I can't imagine cancelling my subscription.
karaterobot 1 year ago

I assume Kagi's customers (of which I've been one since 5/22) are apt to value retaining their privacy more than Google values selling their data. That is to say, it's worth more than $10 (and more than $23 a month) for me to believe my data isn't being sold to advertisers. If you don't take that position, or set different values on it, I can certainly see why $5 or $10 a month wouldn't be worth it to you.
There's also the matter of Google search quality being increasingly bad, while Kagi's is consistently... okay. They also have a lot of nice features, like being able to change the weight of different sites in your list of results.
yoshicoder 1 year ago
I don't have exact numbers, but I wouldn't be surprised if 80-90% of Google ad revenue comes from the ad prices they can charge for US users. I would be shocked if less than 50-60% of revenue came from the US alone, which would put the value extraction per user for Google at ~$10/month/user.
- godelski 1 year ago
  
  Sorry, I mean that the revenue seems to not just be search ad revenue but ad revenue. Google's ad revenue comes from a lot of places, such as in your Android app. I assume it also includes adsense and other things.
jorvi 1 year ago

> This seems cool, but I still think the pricing of kagi is rather steep. It is $5/mo for 300 searches a month, which is really going to get you under 10 a day... That's insufficient.
You can split your searches with search engine shortcuts on the desktop, and the search engine quickbar on mobile.
When I still was on the starter plan, I used Kagi whenever I had a search that if I use google, I know I will:
- get a bunch of listicles and AI slop (Kagi downranks and bundles these)
- get a bunch of AI images (again, Kagi clearly labels and downranks these)
- have to do multiple google searches for, but can instead use Quick Answer for
- will get a bunch of Reddit pre-translated results for
- technical / scientific questions, because of the sites I can uprank/downrank/block
I used google for things like:
- highest building in the world
- $bandname Wikipedia / Discogs
- name of thing I can't remember but have the approximate word for
You get the idea.
BeetleB 1 year ago

Depends on how you use it. For non-developers, under 10 searches per day on average sounds right. Not everyone has a job where they sit on a computer all day.
For me, I use Kagi only at home for personal use. And most months, I don't exceed 300. Of course, if I included work related searches, then yes - 10 searches won't get me far.
themadturk 1 year ago

I've been paying $5 a month for over a year and have hit the 300 search limit only once. I feel like I'm pretty active on the web, but perhaps I just have days where I don't search as often as others.
bqmjjx0kac 1 year ago

Wow! I wouldn't be surprised if I make more than 100 searches per day.

baggachipz 1 year ago

This should placate any potential subscribers who worry that their searches could be logged. Another great feature from a product which keeps getting better all the time.

retrorangular 1 year ago

Ironically, I couldn't access this page while using Tor. Reading it from another device though, it seems like a great announcement, I love the idea.

I also would be interested in a service like this for attestation on other sites. Device attestation has chilling privacy implications, but if you could have a paid service with a presumably trusted entity like Kagi attest that you are a legitimate user (but hide your identity), maybe more of the Internet could be browsed anonymously, while still minimizing spam.

I get why many sites currently block Tor and VPN users, or even users in incognito or without a phone number, as the Internet is essentially unusable without anti-spam measures. That said, I do think anonymity has its place (especially for browsing, even if commenting weren't allowed), and maybe ideas like this could allow for anonymity without the Internet being riddled with spam.

tansan78 1 year ago

I am just thinking there might be other better ways to preserve user's search privacy: using LLM embeddings (https://en.wikipedia.org/wiki/Word_embedding).

The browser creates embeddings of the user's query, then sends the embeddings to the server.

To complete a search, the server is a machine and it does not really need text to understand what a user wants. A series of numbers, like LLM embeddings, is totally fine (actually it might even be better, because embeddings map similar words closely, like Duck and Bird have similar embeddings).

On the privacy side, LLM embeddings are a bunch of numbers. Even if the embeddings are associated with a user, other people cannot make meaning out of the embeddings. Therefore the user's privacy is preserved.

What do you think?

3s 1 year ago

This is an interesting idea, and indeed such an approach to providing privacy has been formalized to different degrees and varying levels of success (eg. [1][2]).
[1] https://arxiv.org/abs/1204.2136 [2] https://arxiv.org/abs/2210.03458
Unfortunately, as described, such a solution would only satisfy a somewhat meaningless notion of privacy. Specifically, the embeddings by definition contain potentially private information about the user, revealing things like "I'm asking about birds" to use your example. Even though it might "compress" the query in a slightly lossy way, it would still reveal a great deal of information about the query.
A true solution to this problem would require something like differential privacy and adding noise to the embeddings. However, the noise required would (likely) end up destroying too much information from the embedding to preserve accuracy of the LLM.
VHRanger 1 year ago

While this is a neat-sounding idea, there are a few issues here:
1. Embeddings are very close to a reversible function. It's not hard to take an embedding and query back the closest query semantically from the source LLM.
2. We already don't log any queries from the users. I'm aware this has to be taken on faith from Kagi users. But you can believe that not having any user query data at all anywhere is a significant speed bump for us in feature development, bug tracking and roadmapping.
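Point 1 can be illustrated with a toy example. The sketch below assumes a stand-in embedding (hashed character trigrams) instead of a real LLM embedding, and a hypothetical list of candidate queries the server could try. Anyone who can compute embeddings can recover the query by nearest-neighbour search.

```python
import hashlib
import math

DIM = 64  # toy dimensionality; real embedding models use hundreds or thousands


def embed(text):
    """Stand-in embedding: counts of hashed character trigrams (not an LLM)."""
    vec = [0.0] * DIM
    padded = f"  {text.lower()} "
    for i in range(len(padded) - 2):
        gram = padded[i:i + 3]
        bucket = int.from_bytes(hashlib.sha256(gram.encode()).digest()[:4], "big")
        vec[bucket % DIM] += 1.0
    return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def invert(embedding, candidates):
    """Server side: return the candidate query whose embedding is closest."""
    return max(candidates, key=lambda query: cosine(embedding, embed(query)))


candidates = [
    "duck migration routes",
    "bird migration routes",
    "cheap flights to paris",
    "python list comprehension",
]
leaked = embed("bird migration routes")  # all the server receives
print(invert(leaked, candidates))
```

With a real model the candidate list would be replaced by a large query corpus or a trained inversion model, but the principle is the same: the numbers are only opaque to someone who cannot run the embedding function.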

beeflet 1 year ago

It's a shame that their $5 tier is only 300 searches/month, or 10 searches per day. It's kind of ridiculously low. I could burn through half of that in a single day of debugging

Also I just tried it and you can't really search for porn

dingnuts 1 year ago

that's because you're a computer professional and the professional plan is more appropriate for you.
I gave a hundred searches to some normies I know and they told me that they save them to use when Google can't find what they want because Kagi works better (!) so they hoard the searches as backups to Google.
All I know is that I haven't used Google on purpose for a year and it's really turned into an eyesore
mulderc 1 year ago
Unlimited is only $10, more than worth it for me.
- beeflet 1 year ago
  
  I will try it for more technical queries, but the results seem to be worse than Google/DDG. The reverse image search sucks and has never returned a single result for me. I suppose the advantage is that it whitelists/blacklists against spam, but it frequently comes up empty handed. So maybe it needs to crawl/scrape the web deeper or it needs a better search algorithm or trust model.
  I could probably get better results by crawling the web myself at home.
  

Thorrez 1 year ago

>We are working on enabling this feature for Trial and Starter plans, which have access to a limited number of monthly searches. [...] and theoretically, users on this plan could redeem more tokens than the limit of searches allowed on their plan (again, we do not know who the user redeeming the tokens is, or what plan they are on).

This doesn't make sense. Is this saying they're worried about a user on a trial or starter plan using someone else's tokens? That can still happen when tokens are disabled for trial and starter plan users: the trial or starter plan user can use tokens generated by someone on an unlimited plan.

dalenw 1 year ago
They're saying that when someone uses a privacy token to search, they cannot track which account it's tied to. Therefore, they cannot track limits & billing.
- Thorrez 1 year ago
  
  The whole point of privacy tokens is you don't need to track usage. You track generation instead. They can track limits and billing at generation time. They know this. The sentence I previously quoted says so in the [...] section:
  > We are working on enabling this feature for Trial and Starter plans, which have access to a limited number of monthly searches. Therefore, they risk a worse user experience if their generated tokens are lost (for example, due to uninstalling the extension)
  If they don't track limits and billing at generation time, then there's no "risk [of] a worse user experience if their generated tokens are lost".
  This "risk [of] a worse user experience if their generated tokens are lost" is a logical reason to not enable privacy tokens for Trial and Starter plans. The risk of "users on this plan could redeem more tokens than the limit of searches allowed on their plan" is not a logical reason to not enable privacy tokens for Trial and Starter plans.
- solardev 1 year ago
  
  What's to stop a single unlimited user from generating and sharing infinite tokens with the rest of the world, then?
  Edit: Each account is limited to 2000 tokens per month, unless they email support to request more. (from the FAQ near the bottom).
  Still seems like there could be a secondary market for resold tokens, though. Not just as a money making system, but possibly as a privacy initiative? If enough accounts pooled tokens into a shared pool and withdrew a random one from it each time, it would be a further safeguard.

eviks 1 year ago

Cool feature, curious, does the bangs solution (! Language ! Region ! Safe search) to the inability to use your config not introduce the same privacy concerns?

> allowing users to send a small configuration with every request (language, region, safe-search) to automatically customize your search experience to some extent. However, we currently believe this would quickly result in a significant loss of anonymity for you and for other users.

> For manual search settings customization, you can always use bangs in your search query to enable basic settings for a specific query.

pzo 1 year ago

Wouldn't this allow people to abuse it and share tokens among many users? People used to buy one subscription and share it among friends, e.g. Netflix, but only among family or friends: you wouldn't want random people to know what movies you like to watch. Similarly, people would be less likely to share their ChatGPT Plus account because everyone can see the history and change settings. But with Privacy Pass those issues don't exist, I guess, so it is more likely this can be abused.

kurtoid 1 year ago

> There's a monthly limit of 2,000 tokens per account to prevent abuse
https://help.kagi.com/kagi/privacy/privacy-pass.html
So I guess you _could_ share them, but only with so many people or so many searches.
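A minimal sketch of how such a per-account cap could be enforced at generation time, so that redemption never needs to know the account. This is an assumption-laden illustration, not Kagi's code: `TokenService` and its methods are hypothetical names, and the tokens here are plain random strings with no blinding, so unlike the real protocol this toy server could link a token back to the account that requested it. The only point is where the counting happens.

```python
import secrets


class TokenService:
    """Toy issuer/origin: quota is charged at issuance, not at redemption."""

    def __init__(self, monthly_limit=2000):
        self.monthly_limit = monthly_limit
        self.issued = {}    # account_id -> tokens issued this month
        self.valid = set()  # tokens that were issued (no account attached)
        self.spent = set()  # tokens already redeemed (double-spend guard)

    def issue(self, account_id, count):
        """Authenticated call: charge the account's quota, hand out tokens."""
        used = self.issued.get(account_id, 0)
        if used + count > self.monthly_limit:
            raise ValueError("monthly token limit exceeded")
        self.issued[account_id] = used + count
        tokens = [secrets.token_hex(16) for _ in range(count)]
        self.valid.update(tokens)
        return tokens

    def redeem(self, token):
        """Unauthenticated call: accept any unspent, genuine token."""
        if token not in self.valid or token in self.spent:
            return False
        self.spent.add(token)
        return True


service = TokenService(monthly_limit=3)
tokens = service.issue("account-a", 3)
print(service.redeem(tokens[0]))  # first use of a genuine token
print(service.redeem(tokens[0]))  # same token again is rejected
```

Because `redeem` takes no account argument, whoever holds a token can spend it; the cap only bounds how many tokens one account can put into circulation.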

1propionyl 1 year ago

In general I am a very happy Kagi subscriber, but I have noticed in the past month or so that sometimes my searches (via the Safari extension) just hang. Navigating to kagi.com also hangs.

When I'm in a rush this forces me to fall back to Google which often doesn't provide good results for my queries, which is unfortunate

It generally resolves itself in under a minute, but it is still a mildly irritating availability issue that wasn't present earlier. Maybe something to do with load balancing? No clue.

freediver 1 year ago
We haven't heard about this issue before, do you mind reaching out to support@kagi.com with more details so we can debug?
- 1propionyl 1 year ago
  
  Sure, I'll try to catch it in action and send you something useful.
themadturk 1 year ago

Just as a data point, I haven't seen this in Safari on desktop or mobile, or in Edge (on my Windows work machine).

AlotOfReading 1 year ago

Pretty cool feature. The unstated downside is that any personalization settings like dark mode, translation, and lens settings are still seemingly tied to account login.

freediver 1 year ago
> The unstated downside
It is clearly stated in the blog post :)
We even considered variations of having some settings preserved in local storage and impact of that on anonymity. Ultimately decided that was not worth it.
Check the FAQ section (towards the end) for full details and analysis:
https://blog.kagi.com/kagi-privacy-pass#faq
- Melatonic 1 year ago
  
  You could have a few built in options (like for domain filtering and customisation) for the privacy people. Could even be community sourced so there's no onus on Kagi itself.
  So for example there could be a built in "developers" preset that might make domains useful to coding higher ranked (and down rank or block things like stack overflow clones). Etc etc.
  Basically this could allow a smaller amount of customisation with less ability to identify a specific user.
  I also use Orion and I do like the idea someone else had of integrating an option for Kagi Privacy mode into the "incognito" tabs specifically as an option!
rafram 1 year ago
Couldn't those be passed as query parameters?
Though those still get passed to the server, and your combination of personalization settings is likely to be globally unique, and it's almost certainly unique among the subset of users that are paranoid enough about their privacy not to store preferences in their session... But still.
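A small sketch of why this matters: if every request carries its settings, the server can group requests by the exact combination, and anyone whose combination is rare sits in a tiny anonymity set. The request data and parameter names below are invented for illustration and do not reflect real Kagi parameters or real usage numbers.

```python
from collections import Counter


def anonymity_set_sizes(requests):
    """For each request, count how many requests share its exact settings."""
    keys = [tuple(sorted(r.items())) for r in requests]
    counts = Counter(keys)
    return [counts[k] for k in keys]


# Hypothetical settings sent as query parameters by five different users.
requests = [
    {"lang": "en", "region": "us", "safe": "on"},
    {"lang": "en", "region": "us", "safe": "on"},
    {"lang": "en", "region": "us", "safe": "on"},
    {"lang": "de", "region": "ch", "safe": "off"},
    {"lang": "fi", "region": "jp", "safe": "on"},
]

sizes = anonymity_set_sizes(requests)
print(sizes)  # [3, 3, 3, 1, 1]
```

A size of 1 means the settings alone single out that user among the sample, even though no session cookie was sent.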
- freeAgent 1 year ago
  
  If you did that, it would partially to nearly fully de-anonymize the searcher though (assuming your parameters are unique or near-unique).
autoexec 1 year ago

Not only that but your searches themselves likely give Kagi more than enough information to identify you as an individual. We saw that nearly 20 years ago when AOL searches were published and people were able to track down specific individuals from their search terms. (https://www.nytimes.com/2006/08/09/technology/09aol.html)

alepacheco-dev 1 year ago

Privacy Pass is great for reducing friction, but it still relies on trust in the issuer. A ZK-based approach (e.g., using zk-SNARKs or anonymous credentials) could let users prove they’re paid subscribers without revealing their identity or even interacting with Kagi’s servers beyond the initial proof. This would remove the need for trust while keeping the experience just as seamless. Would love to see more services explore this direction.

FiloSottile 1 year ago

Privacy Pass is an anonymous credential scheme that does exactly what you describe.

devwastaken 1 year ago

This is meaningless with a paid-for service. Kagi can and will be required by court order to collect and track a subscriber's use. It doesn't matter if your tech is supposed to be “anonymous”, they have to make it happen. You must operate in a country that can't compel this and be willing to jump ship when your NSA equivalent gets upset.

italiancheese 1 year ago

I want to pay for Kagi, but it's priced way too high (for me). Would love it if they implemented Purchasing Power Parity (PPP).

eudhxhdhsb32 1 year ago

Where do you live and how much lower do you expect them to go?
Starting at $5 a month seems very reasonable to me for a non-essential premium search experience.
xigoi 1 year ago

https://kagifeedback.org/d/687-implement-regional-pricing/5
> Regional, student, annual...discounts are not possible because we are not currently making any profit to discount it off from.
thibaultmol 1 year ago
problem is that they have small margins because they have to pay a lot for their upstream providers (and those don't care about what region kagi users are from so charge the same)
- bugtodiffer 1 year ago
  
  and they use all of their margin to build a browser because why not

nottorp 1 year ago

> Privacy Pass does not rely on any blockchain technology.

Lovely!

aryonoco 1 year ago

The way I feel about Kagi reminds me of how I felt about Google circa 1999.

The cynic in me wants to stop myself from becoming a fanboy because we all know how the story of tech startups goes and it's bound to repeat itself again; the optimist in me still wants to believe that there can be forces of good on the web.

sebastiangula 1 year ago

I can't explain this rationally, but my brain just rejects any other search engine. For example, in the case of Kagi, black links instead of blue feel off-putting. I guess years of using Google has hardwired into my brain how a search engine should look.

DavideNL 1 year ago

> For example, in the case of Kagi, black links instead of blue feel off-putting.
Not exactly sure what you're getting at, but:
"Using CSS, you can fully customize Kagi's search and landing pages from your Appearance settings." : https://help.kagi.com/kagi/features/custom-css.html

cyanydeez 1 year ago

When does the API become GA? I really want a reason to spend on a valuable search engine.

freediver 1 year ago
In about two months!
- cyanydeez 1 year ago
  
  Cool. I think I can get my work to pay if I can get an API too.

agnishom 1 year ago

This is really cool. How does the cryptography work?

EDIT: Seems like it works via https://en.wikipedia.org/wiki/Blind_signature
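For intuition, here is a toy textbook RSA blind signature in Python. It is a sketch only: the key is built from two small Mersenne primes and is trivially breakable, there is no padding, and it is not necessarily the construction Kagi deploys (the Privacy Pass RFCs also define a VOPRF-based variant). All function names are made up for the example. What it shows is the core trick: the signer signs a blinded value and never sees the token, yet the unblinded signature verifies against the token.

```python
import hashlib
import math
import secrets

# Toy RSA key from two Mersenne primes (2^31 - 1 and 2^61 - 1). Insecure.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, known only to signer


def hash_to_int(message):
    """Map a byte string to an integer modulo N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def blind(message):
    """Client: hide the message hash behind a random factor r."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            break
    blinded = (hash_to_int(message) * pow(r, E, N)) % N
    return blinded, r


def sign_blinded(blinded):
    """Signer: sign whatever was sent, without learning the message."""
    return pow(blinded, D, N)


def unblind(signed_blinded, r):
    """Client: strip the blinding factor to get a signature on the message."""
    return (signed_blinded * pow(r, -1, N)) % N


def verify(message, signature):
    """Anyone with the public key (N, E) can check the signature."""
    return pow(signature, E, N) == hash_to_int(message)


token = b"client-chosen-random-token"
blinded, r = blind(token)               # sent to the signer while logged in
signature = unblind(sign_blinded(blinded), r)
print(verify(token, signature))         # True
```

The signer only ever sees `blinded`, which is the token hash multiplied by a random value it does not know, so it cannot later match `(token, signature)` to the signing request.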

amazingamazing 1 year ago

Do people have news articles of Microsoft, Google et al using search history to the searcher's detriment? How frequently does this happen? I'm imagining it must be like one in a million.

Bromeo 1 year ago
Per-user search history can directly be sold to advertisers, no? I was under the impression Google and Microsoft do that, or at least use it internally to build a profile of each user, again used for advertising.
- amazingamazing 1 year ago
  
  Which search engine sells results directly?

Amekedl 1 year ago

I've read the post and saw the references, but can anyone easily explain how it works? Can the server not just store whatever is generated to link it back to the user using it?

tonymet 1 year ago

I hope Protonmail picks up on privacy pass. It would be nice to fund anonymous Protonmail accounts with an anonymous “benefactor” account.

alepacheco-dev 1 year ago

I think you could use zero knowledge proofs here to accomplish the same thing but without having to worry about renewing tokens etc

ransom_rs 1 year ago

A problem with "zero knowledge" proofs is that Kagi needs to verify that the user has paid for the service, which requires the server to have some knowledge about the client at some point.

ThePowerOfFuet 1 year ago

Alas, not compatible with Firefox for Android

rawkode 1 year ago

I wonder how this affects gated features and search limits?

AlotOfReading 1 year ago
That's one of the technical limitations behind gating it to unlimited accounts for now.
- bloomingkales 1 year ago
  
  They could embed the subscription level into the blind signature.
  

hollowturtle 1 year ago

Just out of curiosity, how is Kagi (the company) doing financially? And is its success really because Google search sucks so hard? Genuine curiosity.

throwaway290 1 year ago

This is a nice move. Now they just need to stop paying Yandex/Russia and I'd be back...

bugtodiffer 1 year ago

I'm willing to bet they have not gotten this tested by security professionals

aspensmonster 1 year ago

Seeing as I'm not getting any traction in the fediverse (https://tenforward.social/@aspensmonster/113999217587309328), maybe I can ask here instead.

=================================

From their blog:

>As standardized in [2 - 4], the Privacy Pass protocol is able to accommodate many “architectures.” Our deployment model follows the original architecture presented by Davidson et al. [1], called “Shared Origin, Attester, Issuer” in § 4 of [2].

From [2] RFC 9576 § 3.3 "Privacy Goals and Threat Model" :

>Clients explicitly trust Attesters to perform attestation correctly and in a way that does not violate their privacy. In particular, this means that Attesters that may be privy to private information about Clients are trusted to not disclose this information to non-colluding parties. Colluding parties are assumed to have access to the same information; see Section 4 for more about different deployment models and non-collusion assumptions. However, Clients assume that Issuers and Origins are malicious.

And From [2] RFC 9576 § 4.1 "Shared Origin, Attester, Issuer" :

>As a result, attestation mechanisms that can uniquely identify a Client, e.g., requiring that Clients authenticate with some type of application-layer account, are not appropriate, as they could lead to unlinkability violations.

Womp womp :(

This is not genuinely private in any meaningful sense of the term. Kagi plays the role of all three parties, and even relies on the very thing section 4.1 says is not appropriate: to use mechanisms that can uniquely identify a client. They utilize a client's session token: "In the case of Kagi’s users, this can be done by presenting their Kagi session cookie to the server."

Frankly, that blog post is disingenuous at best, and malicious at worst.

=================================

I want to be wrong here. Where am I wrong? What am I missing?

jorams 1 year ago

From [2] RFC 9576 § 4.1 "Shared Origin, Attester, Issuer", right before the sentence you quoted:
> In this model, the Attester, Issuer, and Origin share the attestation, issuance, and redemption contexts.
I haven't read the RFC in detail, but I believe this is where the nuance is: When you enable the privacy pass setting in the extension/browser the redemption context is changed relative to the attestation context by removing the session cookie, to just the information sent by the browser for someone who is not logged in. What remains is your IP address and browser fingerprinting, which can be countered by using Tor.
nulld3v 1 year ago
This would definitely seem like a big concern if you were just looking at the RFC, but the key here is that Kagi's system has a different set of security/privacy/functional requirements and therefore the issues mentioned in the RFC do not necessarily apply.
In the RFC's architecture, the request flow is like so:
1. CLIENT sends anonymous request to ORIGIN
2. ORIGIN sends token challenge to CLIENT
3. CLIENT uses its identity to request token from ISSUER/ATTESTER
4. ISSUER/ATTESTER issues token to CLIENT
5. CLIENT sends token to ORIGIN
You can see how the ISSUER/ATTESTER can identify the client as the source of the "anonymous request" to the ORIGIN because the ISSUER, ATTESTER and ORIGIN are the same entity, so it can use a timing attack to correlate the request to the ORIGIN (1.) with the request to the ISSUER/ATTESTER (3.).
However you can also see that if a lot of time passes between steps (1.) and (3.), then such an attack would be infeasible. Reading past your quote from RFC 9576 § 4.1., it states:
> Origin-Client, Issuer-Client, and Attester-Origin unlinkability requires that issuance and redemption events be separated over time, such as through the use of tokens that correspond to token challenges with an empty redemption context (see Section 3.4), or that they be separated over space, such as through the use of an anonymizing service when connecting to the Origin.
In Kagi's architecture, the "time separation" requirement is met by making the client generate a large batch of tokens up front, which are then slowly redeemed over a period of 2 months. The "space separation" requirement is also satisfied with the introduction of the Tor service.
There is some more discussion in RFC 9576 § 7.1. "Token Caching" and RFC 9577 § 5.5. "Timing Correlation Attacks".
One question you may have is: Why wasn't this solution used in the RFC?
This can be understood if you look at the mentions of "cross-ORIGIN" in the RFC. This RFC was written by Cloudflare, who envisioned its use across the whole Internet. Different ORIGINs would trust different ISSUERs, and tokens from one ORIGIN<->ISSUER network might not work in another ORIGIN<->ISSUER network. This made it infeasible for clients to mass-generate tokens in advance, as a client would need to generate tokens across many different ISSUERs.
Of course, adoption was weak and there ended up being only one ISSUER - Cloudflare, so they adopted the same architecture as Kagi where clients would batch generate tokens in advance (batch size was only 30 tokens though).
RFC 9576 § 7.1. also mentions a "token hoarding" attack, which Cloudflare felt particularly threatened by. Cloudflare's Privacy Pass system worked in concert with CAPTCHAs. Users could trade a completed CAPTCHA for a small batch of tokens, allowing a single CAPTCHA completion to be split into multiple redemptions across a longer time period.
However, rudimentary "hoarding"-like attacks were already in use against CAPTCHAs through "traffic exchanges". Opening up another avenue for hoarding through Privacy Pass would have only exacerbated the problem.
- aspensmonster 1 year ago
  
  >3. CLIENT uses its identity to request token from ISSUER/ATTESTER
  The ISSUER and ATTESTER are different roles. As previously quoted, "Clients explicitly trust Attesters to perform attestation correctly and in a way that does not violate their privacy." The RFC is explicit that, when all of the roles are held by the same entity, the attestation should not rely on unique identifiers. But that's exactly what a session cookie is.
  >You can see how the ISSUER/ATTESTER can identify the client as the source of the "anonymous request" to the ORIGIN because the ISSUER, ATTESTER and ORIGIN are the same entity, and therefore it can use a timing attack to correlate the request to the ORIGIN (1.) with the request to the ISSUER/ATTESTER (3.).
  No timing or spacing attack is needed here. If I have to provide Kagi with a valid session cookie in order to get the tokens, then they already have a unique identifier for me. There is no guarantee that Kagi is not keeping a 1-to-1 mapping of session cookies to ISSUER keypairs, or that Kagi could not, if compelled, establish distinct ISSUER keypairs for specific session cookies.
  

ijul 1 year ago

hekdaimon Al

voytec 1 year ago

[deleted by author]

fvirdia 1 year ago

I believe you should currently be able to - create an account under a pseudonymous email address - pay for a plan using a pseudonymous Bitcoin wallet - use your login session to generate Privacy Pass tokens - search with such tokens via the Tor browser on Kagi's .onion domain
ransom_rs 1 year ago
You have to generate the tokens while signed in, but once you have the tokens, you can use them without your searches being associated with your account (cryptographically provable).
- voytec 1 year ago
  
  [deleted by author]
  
