
Comment by D-Machine

16 days ago

In theory, infinite-width, single-hidden-layer MLPs are universal approximators; the finite-width models that actually exist are not.
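To make that concrete, here's a minimal sketch (my own illustration, not from any paper): a single hidden layer of random tanh features with the output layer solved by least squares. Approximation error on a simple target shrinks as width grows, but any fixed finite width leaves a gap:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200)[:, None]
y = np.sin(3 * x).ravel()

def fit_error(width):
    # Random hidden layer (tanh features); only the output
    # layer is fit, via least squares.
    W = rng.normal(scale=2.0, size=(1, width))
    b = rng.normal(scale=2.0, size=width)
    H = np.tanh(x @ W + b)
    coef, *_ = np.linalg.lstsq(H, y, rcond=None)
    return np.max(np.abs(H @ coef - y))

# Max approximation error drops as the hidden layer widens.
errs = [fit_error(w) for w in (5, 50, 500)]
```

The universal approximation theorem only promises the width-500-and-beyond end of this curve; it says nothing about what a width-5 (or any fixed-size) network can represent.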

And modern transformers definitely underperform models with built-in priors (e.g. CNNs) when they don't have massive amounts of data. Never mind that LLMs simply can't handle all sorts of data types at all: https://news.ycombinator.com/item?id=46948612.

Just another example of an HN commenter making confident statements about something they have no basic understanding of. Try reading some actual papers instead of the usual blog posts and marketing spam from frontier AI companies; you might learn something important.