
Comment by wongarsu

2 years ago

Not just a long prompt, but a long prompt that asks for difficult and often contradictory things in an effort to please rights holders and "diversity".

Another factor might be the alignment work that happens during training (in the RLHF phase). According to the GPT-4 paper, it does make the model perform worse on benchmarks, but it also makes the model more politically correct and family-friendly. It's reasonable to assume that this process has evolved in newer GPT-4 models.

Along with those factors, there's a reasonably convincing theory going around, complete with some successful experimentation, that training on scraped internet data (with various dates attached to comments, etc.) results in seasonally affected answers based on whatever date the model 'thinks' it is when answering.