Comment by HarHarVeryFunny

1 month ago

I'm not aware of any architectures that have tried to put "brain-inspired" features into Transformers, nor of much attempt to modify them at all, for that matter.

The architectural Transformer tweaks that we've seen are:

- Various versions of attention for greater efficiency

- MoE (mixture of experts) vs. dense for greater efficiency (see the sketch after this list)

- Mamba (SSM) + Transformer hybrids for greater efficiency

None of these are even trying to fundamentally change what the Transformer is doing.
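To make the MoE-vs-dense point concrete, here's a minimal sketch, assuming a PyTorch-style setup. The module names, the top-1 routing, and the 4-expert count are illustrative choices, not taken from any particular model: a router picks one expert feed-forward block per token, so only a fraction of the parameters are active for each token, which is where the efficiency claim comes from.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseFFN(nn.Module):
    """Standard Transformer feed-forward block: every token uses all parameters."""
    def __init__(self, d_model, d_ff):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )

    def forward(self, x):
        return self.net(x)

class MoEFFN(nn.Module):
    """Mixture-of-experts block: a router sends each token to its top-1 expert,
    so only one expert's parameters are active per token."""
    def __init__(self, d_model, d_ff, num_experts=4):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            [DenseFFN(d_model, d_ff) for _ in range(num_experts)]
        )

    def forward(self, x):                      # x: (batch, seq, d_model)
        flat = x.reshape(-1, x.shape[-1])      # flatten to (tokens, d_model)
        gate = F.softmax(self.router(flat), dim=-1)
        weight, choice = gate.max(dim=-1)      # top-1 routing per token
        out = torch.zeros_like(flat)
        for i, expert in enumerate(self.experts):
            mask = choice == i
            if mask.any():
                # only tokens routed to expert i pass through its parameters
                out[mask] = weight[mask, None] * expert(flat[mask])
        return out.reshape_as(x)

# Both blocks compute the same kind of per-token transform; the MoE version
# just activates fewer parameters per token.
x = torch.randn(2, 8, 64)
print(DenseFFN(64, 256)(x).shape, MoEFFN(64, 256)(x).shape)
```

Real MoE layers add things like top-k routing and load-balancing losses, but the basic idea is the same: it's an efficiency trade-off, not a change to what the layer is computing.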

Yeah, the x86 architecture is certainly a bit of a mess, but, as you say, good enough, as long as what you want to do is run good old-fashioned symbolic computer programs. However, if you want to run these newfangled neural nets, then you'd be better off with a GPU or TPU.

> By "AGI", I mean the good old "human equivalence" proxy. An AI that can accomplish any intellectual task that can be accomplished by a human. LLMs are probably sufficient for that.

I think DeepMind are right here, and you're wrong, but let's wait another year or two and see, eh?