Comment by AstroBen

9 months ago

> Why doesn't it?

It was surprising to me because I would have expected that, if there was reasoning ability, it would translate across domains at least somewhat. But yeah, what you say makes sense; I'm thinking of it in human terms.

Transfer Learning during LLM training tends to be 'broader' than that.

Like how

- Training LLMs on code makes them better at solving reasoning problems
- Training on language Y alongside language X makes them much better at Y than if they were trained on Y alone, and so on.

Probably because, well, gradient descent is a dumb optimizer and training is more like evolution than like a human reading a book.

Also, there is something genuinely weird going on with LLM chess. And it's possible base models are better at it than the chat-tuned versions. https://dynomight.net/more-chess/
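
For anyone curious what that kind of probing looks like, here's a minimal sketch of the general approach the linked post describes: feed a completion-style model a PGN-like move list and take whatever it predicts as the next move. The model name, prompt format, and helper function here are my own illustrative assumptions, not the post's exact harness.

```python
# Rough sketch of probing a completion-style model at chess: give it a
# PGN-like move list and take whatever it predicts as the next move.
# Model name, prompt format, and validation are illustrative assumptions.
import chess                 # pip install python-chess
from openai import OpenAI    # pip install openai

client = OpenAI()            # expects OPENAI_API_KEY in the environment

def next_move(moves_san: list[str]) -> str:
    """Continue a game given prior moves in standard algebraic notation."""
    board = chess.Board()
    text = ""
    for i, mv in enumerate(moves_san):
        if i % 2 == 0:
            text += f"{i // 2 + 1}. "          # move number before White's move
        board.push_san(mv)                      # also checks the input moves are legal
        text += mv + " "
    if len(moves_san) % 2 == 0:
        text += f"{len(moves_san) // 2 + 1}."   # White to move: end the prompt at the move number
    resp = client.completions.create(
        model="gpt-3.5-turbo-instruct",         # completion-style model; swap in any base model
        prompt='[Result "*"]\n\n' + text.rstrip(),
        max_tokens=8,
        temperature=0.0,
    )
    candidate = resp.choices[0].text.split()[0].rstrip(".")
    board.parse_san(candidate)                  # raises ValueError if the move is illegal
    return candidate

# e.g. next_move(["e4", "e5", "Nf3", "Nc6"]) might come back with "Bb5"
```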