Comment by mjburgess
10 months ago
When people repair ChatGPT replies with additional prompts, they are producing reinforcement learning training data.
"Reinforcement learning", just like any term used by AI researchers, is an extremely flexible, pseudo-psychological reskin of some pretty trivial stuff.