Comment by maronato
1 day ago
Or it was trained to be aligned with Musk by receiving higher rewards during reinforcement learning steps for its reasoning.