Comment by simonw
10 hours ago
karpathy/fineweb-edu-100b-shuffle: https://huggingface.co/datasets/karpathy/fineweb-edu-100b-sh...
That dataset is derived from HuggingFaceFW/fineweb-edu: https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu
HuggingFaceTB/smol-smoltalk: https://huggingface.co/datasets/HuggingFaceTB/smol-smoltalk
And extra fine-tuning on portions of:
cais/mmlu: https://huggingface.co/datasets/cais/mmlu
openai/gsm8k: https://huggingface.co/datasets/openai/gsm8k
allenai/ai2_arc: https://huggingface.co/datasets/allenai/ai2_arc