Comment by kgeist

4 months ago

Technically, RAG is anything that augments generation with external search. In practice, though, the term is often used more narrowly to mean "uses a vector DB."

Throwing everything into one large context window is often impractical: it takes much longer to process, and many models struggle to find information accurately when too much is packed into the context (the "lost in the middle" problem).

The "classic" RAG still has its place when you want low latency (or you're limited by VRAM) and the results are already good enough.
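To make the "classic" flow concrete, here is a minimal sketch of retrieve-then-prompt: rank chunks by similarity to the query and put only the top few into the prompt instead of the whole corpus. Everything in it is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector DB, and the names `retrieve`, `build_prompt`, and `CHUNKS` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding' (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (stand-in for a vector DB lookup)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, chunks, k=2):
    """Put only the retrieved chunks into the prompt, not the whole corpus."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical corpus, invented for this example.
CHUNKS = [
    "The warranty covers battery defects for two years.",
    "Shipping to Canada takes five business days.",
    "Returns are accepted within thirty days of delivery.",
    "The battery charges fully in ninety minutes.",
]

if __name__ == "__main__":
    print(build_prompt("How long is the battery warranty?", CHUNKS))
```

The prompt stays small no matter how large the corpus grows, which is where the latency and VRAM savings come from. The trade-off is that the answer can only be as good as the retrieval step.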
