Comment by Sharlin
2 hours ago
Given that this specific style is the result of being reinforced over and over again via RLHF, "inadvertently" isn't really the word I'd use.