Comment by fennecfoxy
2 months ago
In a way it's the same thing as the finding that models got lazier closer to Christmas, i.e. the "Winter Break" hypothesis.
Not sure what caused the above, but in my opinion it's either that training is affected by the date of the training data (i.e. the model refuses to answer properly because every year of training data has fewer or lower-quality examples toward the end of the year), or that it's a cultural impression from humans in the training data talking about going on holiday/having a break at certain times of year, with the model associating those dates with the meaning of "having a break".
I still wonder if we're building models wrong by training them on a huge amount of data from the Internet and then fine-tuning for instruct, where the model learns to make certain logical associations inherent to (or resembling) the training data, which seems to introduce a myriad of issues, like the strawberry problem or getting "is x less than y" wrong.
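(For context, the "strawberry problem" is models miscounting the r's in "strawberry". A minimal sketch, assuming the tiktoken package is installed, of what a model actually sees when it reads the word: token IDs, not letters.)

    # Minimal sketch: an LLM receives token IDs, not characters, which is
    # one reason letter-counting questions like "how many r's in strawberry?"
    # trip models up. Assumes the tiktoken package is installed.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")   # a GPT-4-era tokenizer
    tokens = enc.encode("strawberry")
    print(tokens)                                # a short list of token IDs
    print([enc.decode([t]) for t in tokens])     # likely ['str', 'aw', 'berry']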
I feel like these models would have a lot more success if we trained the logic/problem-solving ability separately from the core data set, or restricted the instruct fine-tuning in some way, to reduce the amount of "culture" the model gleans from the data.
There's so much that we don't know about this stuff yet and it's so interesting to see something new in this field every day. All because of a wee paper on attention.