Comment by jasonjmcghee

8 days ago

What's the structurally simplest architecture that has worked to a reasonably competitive degree?

Competitiveness doesn't really come from architecture; it comes from scale, data, and fine-tuning data. There has been little innovation in architecture over the last few years, and most of what there has been is aimed at making training or inference more efficient (so you can fit in more data), not at making the model "fundamentally smarter".

If your definition of "competitive" is loose enough, you can write your own Markov chain in an evening. Transformer models rely on a lot of prior art that has to be learned incrementally.
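To give a sense of how small the Markov chain end of the spectrum is, here's a rough sketch of a word-level one in Python. The names `build_chain` and `generate`, and the `order` and `seed` parameters, are just my own choices for illustration, not from any library, and this is one simple way to do it rather than a reference implementation.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each tuple of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        # Repeated followers are kept, so frequent continuations get sampled more often.
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=20, seed=None):
    """Random-walk the chain, emitting at most `length` words."""
    rng = random.Random(seed)
    # Start from a random known state (sorted so a fixed seed is reproducible).
    state = rng.choice(sorted(chain))
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end: this state only appeared at the very end of the text.
            break
        nxt = rng.choice(followers)
        out.append(nxt)
        # Slide the window: drop the oldest word, append the new one.
        state = state[1:] + (nxt,)
    return " ".join(out)


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the cat ran to the mat"
    print(generate(build_chain(corpus), length=10, seed=0))
```

Feed it a bigger corpus and bump `order` to 2 or 3 and the output starts to look locally plausible, which is about as far as this approach goes.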