Comment by jauntywundrkind
10 hours ago
Hear, hear. Code uniquely has an incredible volume of data. And incredibly good ways to assess & test its weights, to immediately find out if it's headed the right way on the gradient.
> And incredibly good ways to assess & test its weights
What weights are you referring to? How does [Claude?] code do that?
Look into RLVR (Reinforcement Learning with Verifiable Rewards). It happens during model post-training.
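To make "verifiable" concrete, here is a minimal sketch of the kind of reward signal RLVR can use for code, assuming the reward is simply "do the tests pass?". The name `verifiable_reward` and the toy `add` task are hypothetical, made up for illustration; real post-training pipelines are far more elaborate and sandbox the execution.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def verifiable_reward(candidate_source: str, test_source: str,
                      timeout_s: float = 10.0) -> float:
    """Hypothetical reward: 1.0 if the candidate passes the tests, else 0.0."""
    with tempfile.TemporaryDirectory() as tmp:
        # Write the model's candidate code followed by the checks into one script.
        script = Path(tmp) / "check.py"
        script.write_text(candidate_source + "\n\n" + test_source + "\n")
        try:
            # Run in a separate interpreter; a failing assert gives a nonzero exit code.
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # Code that hangs earns no reward.
            return 0.0
    return 1.0 if result.returncode == 0 else 0.0


# Toy task (an assumption for illustration): implement add(a, b).
tests = "assert add(2, 3) == 5"
good_candidate = "def add(a, b):\n    return a + b"
bad_candidate = "def add(a, b):\n    return a - b"

good_reward = verifiable_reward(good_candidate, tests)
bad_reward = verifiable_reward(bad_candidate, tests)
```

The point is that no human or learned judge is needed to score the output: the test run itself is the check, and during post-training that score is what the model's weights get nudged toward.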