Comment by tsimionescu
1 year ago
If these networks are ever to be a path to something closer to general intelligence, they will need to be able to ask for context to be repeated, or to have separate storage where they can "choose" to replay it themselves. So this problem likely has to be solved another way regardless, both for transformers and for RNNs.
For a transformer, the full context is already re-processed on every token. It can fetch information that becomes useful whenever it wants. I don't see what problem there is to solve here.
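A minimal numpy sketch of what I mean (the random toy embeddings and the single projection-free attention head are simplifications, not a real model): at every decoding step, the newest token attends over the entire cached context.

```python
import numpy as np

def attention(q, K, V):
    # Single-query scaled dot-product attention over all cached tokens.
    scores = K @ q / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

d = 8
rng = np.random.default_rng(0)
K = np.empty((0, d))  # keys for every token seen so far
V = np.empty((0, d))  # values for every token seen so far

for step in range(5):
    x = rng.normal(size=d)            # embedding of the newest token (toy stand-in)
    K = np.vstack([K, x])
    V = np.vstack([V, x])
    out = attention(x, K, V)          # the new token attends over the *entire* context
    print(step, out[:3])
```

Nothing ever has to be "repeated" to the model: every earlier token is still sitting in the K/V cache, available to any later query.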
For a transformer, the context window is limited, so the same kind of problem applies once you exceed that size.
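A toy illustration of that limit (the 8-token window and the `visible_context` helper are made up for the example; real windows run to thousands of tokens): once the history outgrows the window, the oldest tokens are simply dropped before the forward pass, so the model cannot "ask for them to be repeated" on its own.

```python
MAX_CTX = 8  # hypothetical window size, for illustration only

def visible_context(tokens, max_ctx=MAX_CTX):
    # Everything older than the window never reaches the model at all.
    return tokens[-max_ctx:]

history = list(range(20))           # 20 tokens of conversation so far
print(visible_context(history))     # [12, 13, ..., 19]; tokens 0-11 are gone
```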