Comment by nobodywasishere
4 hours ago
> didn't realize Kagi had no aspirations to build their own general purpose index
Kagi employee here. We're actively working on building our own indexes beyond the limited ones we have now: not just a general index, but also purpose-built indexes for things like programming, etc.
I did not intend to spread misinformation here, and would like to hear more about the general-purpose index Kagi is working on. I had based my comment on several Kagi pages, but mostly https://help.kagi.com/kagi/search-details/search-sources.htm..., which mentions Teclis as Kagi's own index, but https://teclis.com/ makes it pretty clear that it's a "small web"-focused tool:
> Teclis is an attempt to surface the less known web, the web of creativity and self expression, the more humane web.
> Teclis includes its own crawl as well as results from Kagi Small Web index and results with permission from Marginalia Search.
> Teclis works best with broad queries such as 'machine learning', 'vegan diet', 'religion' etc..
Is there another crawler doing the general-purpose stuff?
How broad will they be? Do you aim to ever have large scale indexing of the web?
Hey, do you guys have any posts or write-ups about it? It would be awesome to see what you are trying to accomplish, maybe it's time to post on HN ;)
How do you build a search index in the days of Anubis pages everywhere?
So you will stop buying Yandex data at some point?
What are the challenges of doing that when so much of the internet has turned itself into SEO slop to fit Google's algorithms?
I imagine there is still a whole load of stuff out there on the internet that Google would never surface because it doesn't have enough adsense or whatever. Are you finding that?
Hurry. Google might give up the ghost on its search product and maintaining indices on anything not geared for LLMs.
I'm not sure antitrust will help you.
If a search engine starts censoring by whatever means, I'll not be using it, neither free nor paid, no matter how good it is (and by definition it can't be). Shills can provide a list of countries they don't want to see; hopefully some crooked search engine will satisfy their desire for censorship.
Then you have to stop using search engines.
Censorship of search results and deciding which companies to do business with are two completely separate topics.
Censorship? This isn't about free speech, there's a war on.
No search engines are uncensored, not even Yandex. Partly because they have to exist in a country somewhere, and partly because really nobody wants to index CSAM.
We are not talking about censoring anything here, just buying paid sources of index data.
I am unhappy with money flowing into Russia, for reasons that should be obvious (and I will not respond to whataboutism-style baiting here).
Yandex provides objectively better results for many things than Google/DDG/Bing.
The results are kind of weird, but it does have one advantage: either nobody bothers to DMCA them, or they ignore it, or both.
Western companies also bought strategically important Nazi Germany industrial products in the 1930s because they were considered superior. Commercial convenience and technical quality are not moral or geopolitical absolution. Would you buy cheap quality gold if the Nazis were selling it knowing what it supports?
What other countries do you want nothing to do with?
North Korea, Eritrea, and a few others, but Russia is the only one on the list getting paid for search indexes.
Maybe just the ones at full or hybrid war with Europe?
This is so funny, as though wanting to boycott specific entities is some kind of absurd notion, and as though saying "Sure, what ELSE don't you like?" is some kind of proof that it's an absurdity
What's the point of no longer paying money to Russia if Kagi is incorporated in Palo Alto, so it pays, and will continue to pay, money to a country causing no less trouble than Russia?
Two wrongs don’t make a right. Much of the HN audience lives in the US, giving money to the US is unavoidable in life. But in return we do have the democratic ability to try to alter the behavior of the country.
It still makes sense to avoid giving money to other bad actors who are acting in direct opposition to your home country, and whom you have no control over, when you can.
A lot more, one could argue.
The Ru-speaking audience is ~2 times bigger than Russia, so why would they cut this? Only because of SJWs like you? I'd much prefer one search engine that searches well in two languages, instead of using different engines for each language. Runet has a huge amount of usefulness in it; cutting it out is like cutting off a finger. Fun fact: before the RU-UKR war, most Ukrainians contributed to the Runet, so that would cut their heritage too.
Yandex isn't just a "Russian language search engine", it's a Russian company with quite close ties to the Kremlin. See: https://www.zois-berlin.de/en/publications/zois-spotlight/th...
I'm frankly a bit surprised that it's even legal for a US company like Kagi to do business with Yandex, considering it's sanctioned: https://sanctionssearch.ofac.treas.gov/Details.aspx?id=18711. Though in fairness, I don't know enough about how exactly sanction laws work so it might be legally okay even if I find it morally questionable.
> We're actively working on building our own indexes
Lip service. You'll have some token index of Wikipedia or something so you can say your results are "a blend of our own index and other sources".
Wikipedia is probably in "other sources", as they actually say they have a direct license for it.
https://blog.kagi.com/waiting-dawn-search#:~:text=Wikipedia,...