Comment by JKCalhoun
7 months ago
Is there a RAG for Wikipedia?
I may not be using the term correctly here. In short, I would love a local LLM + Wikipedia snapshot so that I can have an offline, self-hosted ... Hitchhiker's Guide to Earth.
Hugging Face has a few datasets of Wikipedia embeddings.
Here are a few results: https://huggingface.co/search/full-text?q=Wikipedia+embeddin...
And the first result, which is probably what you’ll want to use: https://huggingface.co/datasets/Upstash/wikipedia-2024-06-bg...
I recommend pgvector or a similar self-hosted solution for computing the similarities, rather than a hosted service like Vector.
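To make "calculate the similarities" concrete, here is a rough sketch of the retrieval step in plain Python. The three-dimensional vectors and passage titles are made up for illustration; in a real setup the vectors would come from one of the embedding datasets, and the query would have to be embedded with the same model that produced them. pgvector would do the same ranking in SQL (its `<=>` operator is cosine distance), so this only shows the idea, not how you would run it at Wikipedia scale.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, passages, k=2):
    """Return the k (score, title) pairs most similar to the query."""
    scored = [(cosine_similarity(query, vec), title) for title, vec in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical toy data: (title, embedding). Real embeddings have
# hundreds of dimensions and come from the dataset, not from hand.
PASSAGES = [
    ("Earth", [0.9, 0.1, 0.0]),
    ("Towel", [0.1, 0.9, 0.1]),
    ("Vogon poetry", [0.0, 0.2, 0.9]),
]

# Hypothetical query embedding; must come from the same model as PASSAGES.
query = [0.8, 0.2, 0.0]

results = top_k(query, PASSAGES, k=2)
for score, title in results:
    print(f"{score:.3f}  {title}")
```

The top passages would then be pasted into the local LLM's prompt as context, which is the "RAG" part.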