
Comment by gapeslape

17 days ago

In my mind, Gemini 2.0 changes everything because of its incredibly long context (2M tokens on some models) combined with strong reasoning capabilities.

We are working on a compliance solution (https://fx-lex.com) and RAG just doesn’t cut it for our use case. Legislation cannot be chunked if you want the model to reason well about it.

It’s magical to be able to just throw everything into the model. And the best thing is that we automatically benefit from future model improvements along all performance axes.

What does "throw everything into the model" entail in your context?

How much data are you able to feed into the model in a single prompt and on what hardware, if I may ask?

  • Gemini models run in the cloud, so there is no issue with hardware.

    EU regulations typically come with delegated acts, technical standards, implementing standards, and guidelines. With Gemini 2.0 we are able to just throw all of this into the model and have it figure things out.

    This approach gives way better results than anything we are able to achieve with RAG.

    My personal bet is that this is what the future will look like. RAG will remain relevant, but only for extremely large document corpora.
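    The approach described above can be sketched as follows — instead of retrieving chunks, concatenate the full regulation and every related act into one long-context prompt and check it fits the model's window. This is an illustrative sketch, not fx-lex's actual code: the document titles, the question, and the ~4-characters-per-token estimate are all assumptions.

    ```python
    CONTEXT_BUDGET_TOKENS = 2_000_000  # long-context limit on some Gemini models

    def estimate_tokens(text: str) -> int:
        """Rough token estimate (~4 characters per token for English text)."""
        return len(text) // 4

    def build_prompt(question: str, documents: dict[str, str]) -> str:
        """Concatenate every document, clearly delimited, followed by the question."""
        parts = [f"=== {title} ===\n{body}" for title, body in documents.items()]
        parts.append(f"Question: {question}")
        prompt = "\n\n".join(parts)
        if estimate_tokens(prompt) > CONTEXT_BUDGET_TOKENS:
            raise ValueError("Corpus exceeds the model's context window")
        return prompt

    # Hypothetical usage: a regulation plus its delegated act and guidelines.
    docs = {
        "Regulation (EU) 2019/2088 (SFDR)": "…full text…",
        "Delegated Regulation (EU) 2022/1288": "…full text…",
        "ESA Guidelines": "…full text…",
    }
    prompt = build_prompt("Which disclosures apply to Article 8 products?", docs)
    ```

    The resulting prompt would then be sent to the model in a single call; no retrieval step decides what the model gets to see.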

Maybe a dumb question, have you tried fine tuning on the corpus, and then adding a reasoning process (like all those R1 distillations)?

  • We haven't tried that; we might in the future.

    My intuition - not based on any research - is that recall should be a lot better from in-context data than from knowledge baked into the model's weights. For our use case, precise recall is paramount.