Comment by kevin0091
1 month ago
The reading list is about a year out of date. For instance, in 2025 one may use KTO for math, RLOO for CoT, and DPO for function calling and optimization.
In 2025, the only focus should be distillation & optimization.
In 2025, CoT is not new; corrected CoT is the key and all you need.