Comment by rororournouh

18 hours ago

I’m given to understand that Anthropic uses something called Constitutional AI: a central document of desirable and undesirable qualities guides training (alongside reinforcement learning), whereas OpenAI relies more heavily on direct human feedback, with human trainers rating responses and the model conforming to those preferences.
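For concreteness, here is a toy sketch of the two feedback loops as described above. This is purely illustrative pseudostructure, not either company's actual pipeline: `model`, `human_rater`, and the constitution text are hypothetical stand-ins, and the real systems train reward models on these signals rather than applying them at inference time.

```python
# Hypothetical constitution: a central document of desirable/undesirable
# qualities (the real one is much longer and more specific).
CONSTITUTION = [
    "Prefer the response that is more helpful.",
    "Prefer the response that is less harmful.",
]

def cai_revision(model, prompt, constitution=CONSTITUTION):
    """Constitutional-AI-style loop: the model critiques and revises its
    own draft against each written principle; no human rater in the loop.
    `model` is any callable mapping a prompt string to a response string."""
    draft = model(prompt)
    for principle in constitution:
        critique = model(f"Critique this response against: {principle}\n{draft}")
        draft = model(f"Revise the response given this critique:\n{critique}\n{draft}")
    return draft

def rlhf_preference(human_rater, response_a, response_b):
    """RLHF-style signal: a human trainer picks the preferred response.
    Many such comparisons are then used to train a reward model that
    the policy is optimized against."""
    return response_a if human_rater(response_a, response_b) else response_b
```

The contrast the comment draws is visible in the signatures: the first loop's preferences come from a fixed text document, the second's from whoever `human_rater` happens to be.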

I also much prefer the output of Claude at present.

Yeah, and much of the HN crowd aspires to have better taste than average. So if the supervised learning uses average human trainers, the result will most likely be seen as having poor taste by much of HN.

  • Speak for yourself: my taste is average, and I aspire for it to remain so.

    • I aspire to improve the average, which I can do either by being much better than average or by improving everyone else just a little.