Comment by jimmy76615
2 months ago
Amazing model! I'm trying to get it to run on an ec2 machine right now, but it looks like a lot of the performance actually depends on more than just classical LLM inference. And it looks like Deepseek didn't share their scripts to do the parallel thinking traces and self-verification loops. Is anybody else working on recreating this right now?
Hi! Did you ever end up running this reproduction? If yes, could you also check if the Putnam/IMO problems are in the training data perhaps by trying to have it complete the problems n times? I would totally do this myself if I weren’t GPU poor!