Comment by Eridrus

1 year ago

Starting with the reversal curse is weird, since there is a simple workaround: identify entity names so they can be kept in their proper order, then train on the reverse of the pretraining corpus: https://arxiv.org/abs/2403.13799v1
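
For concreteness, here is a rough sketch of what that data generation step could look like. This is my own toy illustration, not the paper's code: the function names are made up, entity names are assumed to be handed in as a plain list (a real pipeline would need some entity detector), tokenization is plain whitespace splitting, and mixing forward and reversed text in `augment` is an assumption on my part.

```python
def reverse_keep_entities(text, entities):
    """Reverse word order, keeping each listed entity name in its original internal order.

    `entities` is a hypothetical, pre-computed list of entity strings.
    """
    words = text.split()
    # Try longer entity names first so "Mary Lee Pfeiffer" wins over "Mary Lee".
    entity_tokens = sorted(
        (e.split() for e in entities if e.split()), key=len, reverse=True
    )
    chunks = []
    i = 0
    while i < len(words):
        for ent in entity_tokens:
            if words[i:i + len(ent)] == ent:
                chunks.append(" ".join(ent))  # keep the entity as one unit
                i += len(ent)
                break
        else:
            chunks.append(words[i])  # ordinary word
            i += 1
    return " ".join(reversed(chunks))


def augment(corpus, entities):
    """Return the original documents followed by their entity-preserving reversals."""
    return corpus + [reverse_keep_entities(doc, entities) for doc in corpus]


if __name__ == "__main__":
    doc = "Tom Cruise is the son of Mary Lee Pfeiffer"
    print(reverse_keep_entities(doc, ["Tom Cruise", "Mary Lee Pfeiffer"]))
```

The reversed string is not grammatical, but the entity on the right of the original sentence now comes first, so a left-to-right model gets to see it as context for predicting the other one.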

You can argue that this doesn't really say anything surprising, since the reversal of "A is B" is literally "B is A", but it's weird to expect elegant solutions to all problems on all fronts at once, and we do have an incredibly simple data generation process here.