Comment by btbuildem

5 days ago

A great hack/shortcut for solving this "memory" problem is to have a rolling RAG KB (knowledge base). Since you only retrieve the entries relevant to the current query, you don't fill up the context, and you can use a re-ranking model to further improve accuracy.
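To make that concrete, here's a rough sketch of what I mean, not a real implementation. Everything in it is made up for illustration: `RollingKB`, `rerank`, and the bag-of-words scoring are hypothetical stand-ins. In practice you'd swap in an actual embedding model for retrieval and an actual re-ranking model for the second pass; the only point here is the shape (bounded store, cheap retrieve, then re-rank a short candidate list).

```python
import math
import re
from collections import Counter, deque


def tokenize(text):
    """Lowercase word tokens. Toy stand-in for a real embedding model."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class RollingKB:
    """Hypothetical rolling knowledge base: keeps only the newest entries."""

    def __init__(self, max_entries=1000):
        # deque(maxlen=...) drops the oldest entry once the limit is reached.
        self.entries = deque(maxlen=max_entries)

    def add(self, text):
        self.entries.append((text, Counter(tokenize(text))))

    def retrieve(self, query, k=5):
        """First pass: return the k entries most similar to the query."""
        q = Counter(tokenize(query))
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def rerank(query, candidates, top_n=2):
    """Second pass. Toy re-ranker (assumption): scores each candidate by the
    fraction of distinct query terms it contains. A real setup would call a
    re-ranking model here instead."""
    q_terms = set(tokenize(query))

    def score(text):
        if not q_terms:
            return 0.0
        return len(q_terms & set(tokenize(text))) / len(q_terms)

    return sorted(candidates, key=score, reverse=True)[:top_n]
```

You'd call `kb.add(...)` after each turn, then `rerank(query, kb.retrieve(query, k=20))` and put only those few results into the prompt. The context stays small no matter how long the session runs, at the cost of forgetting whatever has rolled out of the store.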

Aside from all that, using npm for distribution makes this a total non-starter for me.