Comment by Davidzheng

5 hours ago

"post-training shaping the models behavior": from your wording, it seems you don't find this all that dramatic. I, by contrast, find the fact that RL on novel environments provides steady improvements on top of the base model an incredibly bullish signal for future AI improvements. I also believe that the capability increases transfer to other domains (or at least cover enough domains) well enough that they represent a real rise in intelligence in the human sense (when measured in capabilities, not necessarily innate learning ability).