Comment by HarHarVeryFunny
1 month ago
I'm not aware of any architectures that have tried to put "brain-inspired" features into Transformers, or of much attempt to modify them at all, for that matter.
The architectural Transformer tweaks that we've seen are:
- Various versions of attention for greater efficiency
- MoE vs. dense layers for greater efficiency (see the sketch below)
- Mamba (SSM) + Transformer hybrids for greater efficiency
None of these are even trying to fundamentally change what the Transformer is doing.
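To make the MoE point concrete: it's a drop-in swap where the dense feed-forward block is replaced by a router plus a set of expert FFNs, and nothing else about the Transformer changes. Here's a minimal sketch in PyTorch (top-1 routing; all names and sizes are illustrative, and real systems like Switch or Mixtral add top-k routing, load-balancing losses, and capacity limits):

```python
# Minimal sketch of a mixture-of-experts (MoE) feed-forward layer.
# Illustrative only: simplified top-1 routing, no load balancing.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # scores each token per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (batch, seq, d_model)
        flat = x.reshape(-1, x.shape[-1])            # route each token independently
        gate = F.softmax(self.router(flat), dim=-1)  # routing probabilities
        weight, idx = gate.max(dim=-1)               # top-1: one expert per token
        out = torch.zeros_like(flat)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                # only the tokens routed here pass through this expert,
                # scaled by the router probability
                out[mask] = weight[mask, None] * expert(flat[mask])
        return out.reshape(x.shape)

# usage: drop-in replacement for the dense FFN block in a Transformer layer
layer = MoEFeedForward()
y = layer(torch.randn(2, 16, 512))  # same shape in and out
```

Per-token compute stays roughly that of a single expert's FFN while the parameter count scales with the number of experts, which is exactly why this is an efficiency play rather than a new kind of computation.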
Yeah, the x86 architecture is certainly a bit of a mess, but, as you say, good enough, as long as what you want to do is run good old-fashioned symbolic computer programs. However, if you want to run these new-fangled neural nets, then you'd be better off with a GPU or TPU.
> By "AGI", I mean the good old "human equivalence" proxy. An AI that can accomplish any intellectual task that can be accomplished by a human. LLMs are probably sufficient for that.
I think DeepMind are right here, and you're wrong, but let's wait another year or two and see, eh?