WithinReason (1 year ago): No, you can give as much context to a transformer as you want; you just run out of memory.
immibis (1 year ago): An RNN doesn't run out of memory from that, so they are still fundamentally different. How do you encode arbitrarily long positions, anyway?
WithinReason (1 year ago): They are different, but transformers don't have a fixed window; you can extend the context or shrink it. I think you can extend a positional encoding to longer contexts if it's not a learned encoding.
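A minimal sketch of that last point, assuming the non-learned encoding in question is the sinusoidal one from the original Transformer paper: it is a closed-form function of the position index, so it can be evaluated at any position without a lookup table. The function name and the d_model default below are illustrative, not from the thread, and whether a model actually generalizes to positions far beyond its training length is a separate question.

    import numpy as np

    def sinusoidal_positional_encoding(positions, d_model=512):
        # Closed-form sinusoidal encoding (Vaswani et al., 2017):
        #   PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
        #   PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
        # Because it is a function of the position index rather than a
        # learned table, any position can be encoded.
        positions = np.asarray(positions, dtype=np.float64)[:, None]  # (n, 1)
        dims = np.arange(d_model)[None, :]                            # (1, d_model)
        angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
        angles = positions * angle_rates                              # (n, d_model)
        enc = np.empty_like(angles)
        enc[:, 0::2] = np.sin(angles[:, 0::2])  # even dimensions: sine
        enc[:, 1::2] = np.cos(angles[:, 1::2])  # odd dimensions: cosine
        return enc

    # Positions far beyond any training-time context length are handled the same way:
    print(sinusoidal_positional_encoding([0, 10, 1_000_000]).shape)  # (3, 512)

A learned positional embedding, by contrast, is a table with a fixed number of rows, which is one reason extending the context is less straightforward in that case.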