Comment by kgeist
2 days ago
Technically, RAG is anything that augments generation with external search. However, it often has a narrower meaning: "uses a vector DB."
Throwing everything into one large context window is often impractical: it takes much longer to process, and many models struggle to find information accurately when the context is crowded (the "lost in the middle" problem).
The "classic" RAG still has its place when you want low latency (or you're limited by VRAM) and the results are already good enough.
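To make the "classic" flow concrete, here is a rough sketch of retrieve-then-prompt: score the documents against the query, keep the top k, and put only those in the prompt instead of the whole corpus. Everything in it is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector DB, and `DOCS`, `retrieve`, and `build_prompt` are hypothetical names.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query (brute-force scan)."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, docs, k=2):
    """Build a prompt containing only the retrieved documents, not the whole corpus."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical corpus; a real setup would store precomputed embeddings in a vector DB.
DOCS = [
    "The billing service retries failed payments three times.",
    "Vector databases index embeddings for nearest-neighbour search.",
    "The office coffee machine is cleaned every Friday.",
    "Failed payments are logged to the billing audit table.",
]

if __name__ == "__main__":
    print(build_prompt("What happens to failed payments?", DOCS))
```

The prompt grows with k, not with the size of the corpus, which is where the latency and VRAM savings come from.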