Comment by jauntywundrkind

7 hours ago

Hear, hear. Code uniquely has an incredible volume of data. And incredibly good ways to assess & test its weights, to immediately find out if it's headed the right way on the gradient.

> And incredibly good ways to assess & test its weights

What weights are you referring to? How does [Claude?] code do that?

  • Look into RLVR (Reinforcement Learning with Verifiable Rewards). It happens during model post-training.
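
To make the "verifiable" part concrete, here is a rough sketch of the kind of reward signal RLVR can use for code: run the model's output against known test cases and score it pass/fail. The names (`verifiable_reward`, `solve`) are made up for illustration, and this is not how any particular lab implements it; real pipelines sandbox the execution instead of calling `exec` directly.

```python
def verifiable_reward(candidate_source, tests, func_name="solve"):
    """Return 1.0 if the candidate passes every test case, else 0.0.

    candidate_source: source code produced by the model (hypothetical input).
    tests: list of (args_tuple, expected_result) pairs.
    func_name: name of the function the candidate is expected to define
               (an assumption made for this sketch).
    """
    namespace = {}
    try:
        # WARNING: exec on untrusted code is unsafe; a real setup would sandbox this.
        exec(candidate_source, namespace)
        func = namespace[func_name]
        for args, expected in tests:
            if func(*args) != expected:
                return 0.0
    except Exception:
        # Syntax errors, missing function, and runtime crashes all score zero.
        return 0.0
    return 1.0


tests = [((1, 2), 3), ((0, 0), 0)]
good = "def solve(a, b):\n    return a + b"
bad = "def solve(a, b):\n    return a - b"
print(verifiable_reward(good, tests), verifiable_reward(bad, tests))
```

The point is that the reward comes from an automatic check rather than a human rating, so it can be computed for a very large number of samples during post-training and fed into the policy update.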