Comment by reissbaker

6 days ago

Clickbait headline. "Fine-tuning LLMs for knowledge injection is a waste of time" is true, but IDK who's trying to do that. Fine-tuning is great for changing model behavior (e.g. the zillions of uncensored models on Hugging Face are far more willing to respond to... dodgy... prompts than any amount of RAG will get you), and RAG is great for knowledge injection.

Also... "LoRA" as a replacement for fine-tuning??? LoRA is a kind of fine-tuning! In the research community it's literally referred to as "parameter-efficient fine-tuning." You're changing a smaller number of weights, but you're still changing them.
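To make that concrete: a LoRA layer still changes the model's effective weights, it just parameterizes the change with two small low-rank matrices instead of touching every entry of the base weight. A minimal NumPy sketch (the shapes, names, and scaling here are illustrative, not from any particular library):

```python
import numpy as np

# Sketch of a LoRA ("parameter-efficient fine-tuning") linear layer.
# Full fine-tuning would update every entry of W; LoRA freezes W and
# trains two small factors A and B whose product perturbs it.

d_in, d_out, rank = 512, 512, 8

rng = np.random.default_rng(0)
W = rng.normal(size=(d_in, d_out))        # pretrained weight, frozen
A = rng.normal(size=(d_in, rank)) * 0.01  # trainable low-rank factor
B = np.zeros((rank, d_out))               # trainable; zero-init so the
                                          # adapted layer starts out
                                          # identical to the base model
alpha = 1.0                               # scaling hyperparameter

def lora_forward(x):
    # Effective weight is W + alpha * (A @ B): still a weight change,
    # just expressed through far fewer trainable parameters.
    return x @ W + alpha * (x @ A) @ B

x = rng.normal(size=(1, d_in))

# Trainable parameter counts: LoRA adapter vs. full fine-tuning.
lora_params = A.size + B.size  # 512*8 + 8*512 = 8,192
full_params = W.size           # 512*512   = 262,144
```

Here only ~3% as many weights are trainable, but the adapted layer is still a genuine weight update, which is the commenter's point: LoRA is a cheaper fine-tuning method, not an alternative to fine-tuning.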

They provide no references other than self-referencing blog posts. It was also surprising to read about loss when changing neural network weights with no mention of quantization. Unfortunately, most of the content in this one was taken from the author's own previous work.

RAG is getting some backlash, and this reads as a backlash to the backlash. I hope things settle down soon, but many techfluencers put all their eggs in the RAG basket and used it to gatekeep AI.

> "Fine-tuning LLMs for knowledge injection is a waste of time" is true, but IDK who's trying to do that.

Have people who say this ever actually done it? It works. It works pretty well.

I have no clue why this bad advice is so routinely parroted.

  • It technically works with enough data, but it's inefficient compared to RAG for knowledge injection. However, changing behavior via prompting/RAG is harder than changing it via fine-tuning; the two are useful for different purposes.