Comment by versteegen
10 months ago
Define "by definition".
Because this statement really makes no sense. Transformers are perfectly capable of learning mathematical functions (and of learning them perfectly), given the necessary working-out space, e.g. for long division or for algebraic manipulation. And they can learn to generalise from their training data very well (although very data-inefficiently). That's their entire strength!
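To make "working-out space" concrete, here is a rough sketch of the kind of step-by-step long-division trace I have in mind: each intermediate result is written out, rather than the quotient being produced in one shot. The function name `long_division_trace` and the trace format are made up for illustration and are not taken from any particular paper or training setup.

```python
def long_division_trace(dividend: int, divisor: int):
    """Schoolbook long division, returning (steps, quotient, remainder).

    Hypothetical helper: the step strings are an assumed scratchpad format.
    """
    if dividend < 0 or divisor <= 0:
        raise ValueError("sketch only handles dividend >= 0 and divisor > 0")

    steps = []
    quotient_digits = []
    remainder = 0
    for digit in str(dividend):
        # Bring down the next digit next to the running remainder.
        current = remainder * 10 + int(digit)
        # Work out one quotient digit and the new remainder.
        q = current // divisor
        remainder = current - q * divisor
        quotient_digits.append(str(q))
        steps.append(f"bring down {digit}: {current} / {divisor} = {q} rem {remainder}")

    quotient = int("".join(quotient_digits))
    return steps, quotient, remainder


if __name__ == "__main__":
    steps, quotient, remainder = long_division_trace(1234, 7)
    for line in steps:
        print(line)
    print(f"answer: {quotient} rem {remainder}")
```

Each step depends only on the previous remainder and one new digit, so the written-out trace turns one hard problem into a sequence of small ones. That is the sense in which the working-out space matters.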