Comment by vercaemert

12 hours ago

I'm surprised there isn't more "hope" in this area. Take the GPT Pro models: surely that sort of reasoning/synthesis will eventually make its way into local models; after all, it's a capability that has already been discovered.

Just the other day I was reading a paper about ANNs whose connections aren't strictly feedforward; instead, circular connections proliferate. That increases expressiveness at the (huge) cost of breaking the current gradient-descent algorithms. As compute gets cheaper and cheaper, these things will become feasible (greater expressiveness, after all, equates to greater intelligence).
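To make the problem concrete, here's a toy sketch (my own illustration, not the paper's method) of why cyclic connections break the usual training setup: with a dense weight matrix `W` containing cycles, there is no single feedforward pass to differentiate through, so one common workaround is to iterate the activations to a fixed point instead.

```python
import numpy as np

# Toy network of 4 units whose weight matrix W is dense, so every unit
# can feed every other unit -- including cycles. There is no layer order,
# hence no single forward pass; we iterate h <- tanh(W h + x) until the
# state settles at an equilibrium.
rng = np.random.default_rng(0)
n = 4
W = rng.normal(scale=0.3, size=(n, n))  # small weights keep the map contractive
x = rng.normal(size=n)                  # fixed external input

h = np.zeros(n)
for _ in range(100):
    h_next = np.tanh(W @ h + x)
    if np.max(np.abs(h_next - h)) < 1e-8:  # converged to a fixed point
        h = h_next
        break
    h = h_next

print(h)  # equilibrium activations of the cyclic network
```

Ordinary backprop assumes a finite acyclic computation graph; here the "forward pass" is an open-ended iteration, which is why such architectures need different training machinery (and more compute) than standard feedforward nets.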

It seems like a lot of the benefit in SOTA models comes from data, though, not architecture. Won't the data moat of the big 3/4 players only grow as they're integrated deeper into business workflows?

  • That's a good point. I'm not familiar enough with the various moats to comment.

    I was just talking at a high level. If transformers are HDD technology, maybe SSD is right around the corner: a paradigm shift for the whole industry that, to the average user, just looks like better/smarter models. It's a very new field, and it's not unrealistic that major discoveries will shake things up in the next decade or less.