
Comment by gapeslape

17 days ago

In my mind, Gemini 2.0 changes everything because of its incredibly long context (2M tokens on some models) combined with strong reasoning capabilities.

We are working on a compliance solution (https://fx-lex.com) and RAG just doesn’t cut it for our use case. Legislation cannot be chunked if you want the model to reason well about it.

It’s magical to be able to just throw everything into the model. And the best thing is that we automatically benefit from future model improvements along all performance axes.

What does "throw everything into the model" entail in your context?

How much data are you able to feed into the model in a single prompt and on what hardware, if I may ask?

  • Gemini models run in the cloud, so there is no issue with hardware.

    EU regulations typically come with delegated acts, technical standards, implementing standards, and guidelines. With Gemini 2.0 we are able to just throw all of this into the model and have it figure things out.

    This approach gives way better results than anything we are able to achieve with RAG.

    My personal bet is that this is what the future will look like. RAG will remain relevant, but only for extremely large document corpora.
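    The approach described above can be sketched as follows — instead of retrieving chunks, concatenate the full regulation and every related act into one long-context prompt and check it fits the model's window. This is an illustrative sketch, not fx-lex's actual code: the document titles, the question, and the ~4-characters-per-token estimate are all assumptions.

    ```python
    CONTEXT_BUDGET_TOKENS = 2_000_000  # long-context limit on some Gemini models

    def estimate_tokens(text: str) -> int:
        """Rough token estimate (~4 characters per token for English text)."""
        return len(text) // 4

    def build_prompt(question: str, documents: dict[str, str]) -> str:
        """Concatenate every document, clearly delimited, followed by the question."""
        parts = [f"=== {title} ===\n{body}" for title, body in documents.items()]
        parts.append(f"Question: {question}")
        prompt = "\n\n".join(parts)
        if estimate_tokens(prompt) > CONTEXT_BUDGET_TOKENS:
            raise ValueError("Corpus exceeds the model's context window")
        return prompt

    # Hypothetical usage: a regulation plus its delegated act and guidelines.
    docs = {
        "Regulation (EU) 2019/2088 (SFDR)": "…full text…",
        "Delegated Regulation (EU) 2022/1288": "…full text…",
        "ESA Guidelines": "…full text…",
    }
    prompt = build_prompt("Which disclosures apply to Article 8 products?", docs)
    ```

    The resulting prompt would then be sent to the model in a single call; no retrieval step decides what the model gets to see.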

Maybe a dumb question, have you tried fine tuning on the corpus, and then adding a reasoning process (like all those R1 distillations)?

  • We haven't tried that; we might in the future.

    My intuition - not based on any research - is that recall should be a lot better from in-context data than from knowledge baked into the model's weights. For our use case, precise recall is paramount.