Comment by thorum

19 days ago

Go read the DeepSeek R1 paper

10 comments

thorum

Why would I do that? If you know something then quote the relevant passage & equation that says you can train code generators w/ RL on a novel language w/ little to no code to train on. More generally, don't ask random people on the internet to do work for you for free.

thorum 19 days ago
Your other comment sounded like you were interested in learning about how AI labs are applying RL to improve programming capability. If so, the DeepSeek R1 paper is a good introduction to the topic (maybe a bit out of date at this point, but very approachable). RL training works fine for low-resource languages as long as you have tooling to verify outputs and enough compute to throw at the problem.
- measurablefunc 19 days ago
  
  So you should have no problem bringing up the exact passages & equations they use for their policies.
- whimsicalism 19 days ago
  
  imo generally not worth it to keep going when you encounter this sort of HN archetype
whimsicalism 19 days ago
well, that’s one way to react to being provided with interesting reading material.
- measurablefunc 19 days ago
  
  Bring up a passage that supports your claim. I'll wait.
  