Comment by hansvm

2 months ago

> You seem to suppose that they actually perform addition internally, rather than simply having a model of the concept that humans sometimes do addition and use it to compute results. Why?

I suppose nothing of the sort, only that they're _capable_ of doing so. For something as simple as addition, you can even hand-craft weights that solve it exactly.
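As a rough illustration of hand-crafted weights, here's a minimal sketch of a two-layer ReLU network that adds two numbers exactly. It assumes the inputs arrive as plain numeric values rather than digit tokens, which is a big simplification compared to a real transformer; the names `W1`, `W2`, and `add_net` are made up for this example.

```python
def relu(v):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

# Hand-set weights (illustrative only, not taken from any trained model).
W1 = [[1.0, 1.0],    # hidden unit 0 computes  (a + b)
      [-1.0, -1.0]]  # hidden unit 1 computes -(a + b)
W2 = [[1.0, -1.0]]   # output = relu(a + b) - relu(-(a + b)) = a + b

def add_net(a, b):
    """Forward pass: linear -> ReLU -> linear, no training involved."""
    hidden = relu(matvec(W1, [float(a), float(b)]))
    return matvec(W2, hidden)[0]

print(add_net(2, 3))
print(add_net(-7, 4))
```

The two hidden units exist only so the sign survives the ReLU; without the nonlinearity a single row of `[1, 1]` would already do the job.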

> The problem is that the question space grows exponentially in the length of input. If you want a non-coincidentally-correct answer to "how many t's in 'correct horse battery staple'?" then you need to actually add up the per-token counts.

Yes? The architecture is capable of both mapping tokens to character counts and adding those counts, with a fraction of current models' parameter counts. It's not all that hard.
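To make the two steps concrete, here's a toy sketch: a lookup table from token to per-token character count (standing in for what an embedding could encode), followed by a sum over the sequence. The whitespace tokenizer and the names `build_count_table` and `count_char` are assumptions for illustration; real tokenizers split text differently.

```python
def build_count_table(vocab, char):
    """Map each token to how many times `char` occurs in it."""
    return {tok: tok.count(char) for tok in vocab}

def count_char(tokens, table):
    """Add up the per-token counts across the whole sequence."""
    return sum(table[tok] for tok in tokens)

text = "correct horse battery staple"
tokens = text.split()  # hypothetical whitespace tokenizer, not a real BPE vocab
table = build_count_table(set(tokens), "t")
total = count_char(tokens, table)

print(table)
print(total)  # 1 (correct) + 0 (horse) + 2 (battery) + 1 (staple) = 4
```

The table is linear in vocabulary size and the sum is a single reduction, so neither step needs anything close to the exponential question space from the quoted comment.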