Comment by trilogic
6 hours ago
Here is a dataset you can draw from: https://huggingface.co/datasets/Avtrkrb/combined-reasoning-o... Take around 10,000 samples from it according to your needs and go for it. The key (in my opinion), among other things, is not cutting the sequence length. Any traditional fine-tuning repo will do; if your hardware supports it, Unsloth is faster.
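A rough sketch of the two steps, in plain Python so it runs without any fine-tuning library: draw about 10,000 samples, then pick a sequence length large enough that none of them gets cut. The helper names (`pick_samples`, `max_len_without_truncation`), the toy records, and the whitespace token count are all assumptions made for illustration; in practice the records would come from the dataset and the count from your model's tokenizer.

```python
import random


def pick_samples(records, n=10_000, seed=0):
    """Return up to n records chosen uniformly at random, without replacement."""
    if len(records) <= n:
        return list(records)
    return random.Random(seed).sample(records, n)


def max_len_without_truncation(records, count_tokens):
    """Smallest sequence length that fits every record untruncated."""
    return max(count_tokens(r) for r in records)


# Toy stand-in data (hypothetical); real records would come from the dataset.
toy_records = [" ".join(["word"] * (i % 50 + 1)) for i in range(20_000)]


def count_whitespace_tokens(text):
    # Crude stand-in for a real tokenizer (assumption for this sketch).
    return len(text.split())


subset = pick_samples(toy_records)
seq_len = max_len_without_truncation(subset, count_whitespace_tokens)
print(len(subset), seq_len)
```

The resulting `seq_len` is what you would pass as the maximum sequence length in whichever trainer you use, hardware permitting.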