Comment by taw1285 · 1 day ago

Your comment really helped me improve my mental model of LLMs. Can someone smarter help me verify my understanding:

1) At the end of the day, we are still sending raw text to the LLM as input and getting text back as the response.

2) RAG/embeddings are just a way to identify the relevant chunks to include in the LLM input so that you don't have to dump the entire ground-truth document into the LLM. Take Everlaw for example: all of their legal docs are stored as embeddings, and a RAG/tool call retrieves the relevant documents to feed into the LLM input.
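To make (2) concrete, here's a minimal sketch of embedding-based retrieval in Python. This is illustrative only; the embedding model, documents, and prompt format are my assumptions, not Everlaw's actual pipeline:

```python
# Minimal RAG sketch (illustrative; not any particular product's pipeline).
# Assumes `pip install sentence-transformers numpy`.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model works

docs = [
    "Deposition of J. Smith, taken 2021-03-04 ...",
    "Email thread re: Q3 contract negotiations ...",
    "Invoice #4411 from Acme Corp ...",
]
doc_vecs = model.encode(docs, normalize_embeddings=True)  # embed once, store

def retrieve(query: str, k: int = 2) -> list[str]:
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q               # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:k]  # indices of the k most similar chunks
    return [docs[i] for i in top]

query = "What was discussed about the Q3 contract?"
context = "\n\n".join(retrieve(query))
# The LLM still just sees raw text: the retrieved chunks pasted into the prompt.
prompt = f"Answer using only these documents:\n\n{context}\n\nQuestion: {query}"
```

Note the last step: the retrieved chunks end up as plain text in the prompt, which is exactly your point (1).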

So in that sense, what do these non-foundation-model startups mean when they say they are training or fine-tuning models? Where does the line fall between feeding information into the LLM's input versus having it baked into the model weights?

(1) and (2) are correct (well, I don't know the specifics of Everlaw). Fine-tuning is something different: you incrementally train the model itself on more inputs, so that given the same input context it will produce better output for your use case.

To be more precise, you seldom continue training the full model directly, because it's much cheaper and easier to add a few small layers to the big model and train only those instead (see LoRA or PEFT).
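For example, here's roughly what that looks like with Hugging Face's PEFT library (the base model name and hyperparameters below are illustrative, not a recommendation):

```python
# Sketch of LoRA fine-tuning with Hugging Face PEFT.
# Assumes `pip install transformers peft`.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

config = LoraConfig(
    r=8,                                  # rank of the small adapter matrices
    lora_alpha=16,                        # scaling factor for adapter output
    target_modules=["q_proj", "v_proj"],  # attach adapters to attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)  # base weights frozen; only adapters train
model.print_trainable_parameters()    # typically well under 1% of total params
# ...then train `model` on your domain data with a normal training loop/Trainer.
```

The point is that the billions of base weights stay frozen; you only learn a few million adapter parameters, which is why this is so much cheaper than continued pretraining.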

Something like Everlaw might do all three: fine-tune a model to do better at discovery retrieval, then build a RAG system on top of it.