Comment by maronato
19 hours ago
Or it was trained to be aligned with Musk by receiving higher rewards during reinforcement learning steps for its reasoning.