Comment by allovertheworld

2 days ago

Only because of Apples unified memory architecture. The groundwork is there, we just need memory to be cheaper so we can fit 512+GB now ;)

There's not in the end all that much point having more memory than you can compute on in a reasonable time. So I think probably the useful amount tops out in the 128GB range where you can still run a 70b model and get a useful token rate out of it.

Memory prices will rise short term and generally fall long term, even with the current supply hiccup the answer is to just build out more capacity (which will happen if there is healthy competition). I meant, I expect the other mobile chip providers to adopt unified architecture and beefy GPU cores on chip and lots of bandwidth to connect it to memory (at the max or ultra level, at least), I think AMD is already doing UM at least?