
Comment by potatoman22

17 days ago

Small point, but is it doing semantic chunking or loading the entire PDF into context? I've heard mixed results on semantic chunking.

It loads the entire PDF into context, but then it would be my job to chunk the output for RAG, and just doing arbitrary fixed-size blocks, or breaking on sentences or paragraphs, is not ideal.

So I can ask Gemini to return chunks of variable size, where each chunk is one complete idea or concept, without arbitrarily chopping a logical semantic segment into multiple chunks.
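
Roughly like this, using the google-generativeai SDK (the model name, file name, and prompt wording here are just illustrative, not an exact setup):

    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")
    model = genai.GenerativeModel("gemini-1.5-pro")

    # Upload the whole PDF via the File API, then ask for variable-size chunks.
    pdf = genai.upload_file("paper.pdf")  # hypothetical file name
    prompt = (
        "Split this document into chunks for retrieval. Each chunk must "
        "contain exactly one complete idea or concept, however long that "
        "takes. Never split a logical segment across chunks. "
        "Return a JSON array of strings."
    )
    response = model.generate_content([pdf, prompt])
    print(response.text)  # JSON array of variable-size semantic chunks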

  • Fixed-size chunking is holding back a bunch of RAG projects on my backlog. I'll be extremely pleased if this semantic chunking solves the issue. Currently we're getting around a 78-82% success rate on fixed-size-chunked RAG, which is far too low. Users assume zero results on a RAG search equates to zero results in the source data.

    • FWIW, in case you're not doing these already / haven't ruled them out:

      - BM25 to eliminate the zero-results-in-source-data problem (rough sketch at the end of this thread)

      - Longer term, a peek at Gwern's recent hierarchical embedding article. Got decent early returns even with fixed-size chunks.


    • Also consider methods that use reasoning to dispatch additional searches based on analysis of the returned data (see the loop sketch below).

  • I wish we had a local model for semantic chunking. I've been wanting one for ages, but haven't had the time to build a dataset and fine-tune for that task =/.
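
The BM25 fallback mentioned above, as a rough sketch using the rank_bm25 package (the corpus and whitespace tokenization are placeholders; a real pipeline would share the same chunks as the vector index):

    from rank_bm25 import BM25Okapi

    chunks = ["...chunk one...", "...chunk two..."]  # same chunks the vector index holds
    bm25 = BM25Okapi([c.lower().split() for c in chunks])

    def lexical_fallback(query, top_k=5):
        # Score every chunk lexically; anything with term overlap scores > 0,
        # so a vector-search miss no longer surfaces as "zero results".
        scores = bm25.get_scores(query.lower().split())
        ranked = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)
        return [chunks[i] for i in ranked[:top_k] if scores[i] > 0]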
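And the reasoning-driven dispatch idea, sketched as a simple loop (search_index and llm are hypothetical callables standing in for your retriever and model):

    def agentic_search(query, search_index, llm, max_rounds=3):
        # Retrieve, let the model judge the results, and re-query if needed.
        results = search_index(query)
        for _ in range(max_rounds):
            verdict = llm(
                f"Query: {query}\nResults: {results}\n"
                "If these results answer the query, reply OK. "
                "Otherwise reply with a better search query."
            )
            if verdict.strip().upper() == "OK":
                break
            query = verdict.strip()
            results = search_index(query)
        return results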