Comment by jauntywundrkind
8 hours ago
Hear, hear. Code uniquely has an incredible volume of data. And there are incredibly good ways to assess & test its weights, to immediately find out if it's headed the right way on the gradient.
> And incredibly good ways to assess & test its weights
What weights are you referring to? How does [Claude?] code do that?
Look into RLVR (Reinforcement Learning with Verifiable Rewards). It happens during model post-training.
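To make "verifiable" concrete, here is a minimal sketch of the kind of reward signal I mean, assuming the task is "write a function that passes these unit tests". The names (`verifiable_reward`, `func_name`, `tests`) are made up for illustration and this is not any lab's actual pipeline; a real setup would run candidates in a sandbox rather than calling `exec` directly.

```python
from typing import List, Tuple


def verifiable_reward(candidate_src: str, func_name: str,
                      tests: List[Tuple[tuple, object]]) -> float:
    """Return 1.0 if the candidate code passes every test, else 0.0.

    Hypothetical helper: the reward is computed by running the code,
    not by asking a human or another model to judge it.
    """
    namespace: dict = {}
    try:
        # Illustration only: never exec untrusted code outside a sandbox.
        exec(candidate_src, namespace)
        func = namespace[func_name]
        for args, expected in tests:
            if func(*args) != expected:
                return 0.0
    except Exception:
        # Syntax errors, missing functions, and crashes all score zero.
        return 0.0
    return 1.0


tests = [((1, 2), 3), ((0, 0), 0)]
good = "def add(a, b):\n    return a + b"
bad = "def add(a, b):\n    return a - b"

print(verifiable_reward(good, "add", tests))
print(verifiable_reward(bad, "add", tests))
```

During post-training, a score like this would be fed back as the reward for the sampled completion, which is why code is such a convenient domain: the check is cheap and automatic.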