Comment by torginus

5 months ago

That's not really my experience. The error rate goes up the more stuff you cram into the context, and processing gets both slower and more expensive as the number of input tokens grows.

I'd say it makes sense to do RAG even if your stuff fits into context comfortably.
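
To make that concrete, here's a rough sketch of the idea: score each chunk against the question and send only the best match to the model instead of the whole corpus. Everything in it is an assumption for illustration. The `chunks` are made up, `retrieve` and the other helpers are hypothetical names, bag-of-words cosine similarity stands in for a real embedding model, and word counts are only a crude proxy for tokens.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Crude word-level tokenizer; real LLM tokenizers split text differently.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    # Rank chunks by similarity to the query and keep the top k.
    q = Counter(tokenize(query))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical knowledge base that would fit in context comfortably.
chunks = [
    "Invoices are generated on the first day of each month.",
    "Password resets expire after 24 hours.",
    "The API rate limit is 100 requests per minute.",
    "Refunds are processed within five business days.",
]

query = "What is the API rate limit?"
selected = retrieve(query, chunks, k=1)

# Compare how much text goes into the prompt in each case.
full_size = sum(len(tokenize(c)) for c in chunks)
prompt_size = sum(len(tokenize(c)) for c in selected)

print(selected)
print(f"words sent: {prompt_size} instead of {full_size}")
```

With a corpus this small the savings are trivial, but the shape is the same at scale: fewer input tokens per request, and less unrelated text for the model to get distracted by.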