← Back to context

Comment by behnamoh

15 days ago

I have Apple Silicon and it's the worst when it comes to prompt processing time. So unless you want to have small contexts, it's not fast enough to let you do any real work with it.

Apple should've invested more in bandwidth, but it's Apple and has lost its visionary. Imagine having 512GB on M3 Ultra and not being able to load even a 70B model on it at decent context window.

Prompt preprocessing is heavily compute-bound, so relying significantly on processing capabilities. Bandwidth mostly affects token generation speed.