Comment by torginus

18 days ago

That's not really my experience. Error rate goes up the more stuff you cram into the context, and processing gets both slower and more expensive as the number of input tokens grows.

I'd say it makes sense to do RAG even if your stuff fits into context comfortably.
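
To make that concrete, here's a rough sketch of what I mean: rank your chunks against the question and only send the best few, even though all of them would fit. The word-overlap scoring is just a stand-in for a real embedding search, and the function names and sample docs are made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_k_chunks(query, chunks, k=2):
    """Keep the k chunks with the most occurrences of query words.

    Assumption: plain word overlap stands in for an embedding-based retriever.
    """
    query_words = set(tokenize(query))

    def score(chunk):
        counts = Counter(tokenize(chunk))
        return sum(counts[word] for word in query_words)

    # sorted() is stable, so ties keep their original order.
    return sorted(chunks, key=score, reverse=True)[:k]


def build_prompt(query, chunks, k=2):
    """Assemble a prompt from only the retrieved chunks."""
    context = "\n".join(top_k_chunks(query, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge base; all four chunks would easily fit in context.
docs = [
    "Invoices are emailed on the first day of each month.",
    "Password resets expire after 24 hours.",
    "The office is closed on public holidays.",
    "To reset a password, open Settings and choose Security.",
]
question = "How do I reset my password?"

full_prompt = build_prompt(question, docs, k=len(docs))  # everything stuffed in
rag_prompt = build_prompt(question, docs, k=2)           # only the relevant chunks
```

The second prompt leaves out the invoice and holiday lines entirely, so the model has fewer input tokens to pay for and less unrelated text to get distracted by.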