Comment by shepherdjerred
19 hours ago
Do you get anything useful out of your 4090 (I have one too)? Local cloud sounds like a fun idea but I just don’t see how it competes against OpenAI/Anthropic.
I think it’s not really worth it compared to just buying tokens or a coding plan.
My setup has a 4090 handling attention while TT accelerators handle the MLP. With just a 4090 you can have the CPU handle the MLP layers and use an MoE model, assuming a sufficiently powerful CPU. I tried that setup with MiniMax 2.5 before and was able to eke out around 10 to 15 tps (albeit with a 7965WX CPU).
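For a rough feel of why an MoE model makes CPU offload of the MLP layers workable, here is a back-of-envelope sketch. It assumes decode is bound by memory bandwidth, so each generated token has to stream the active expert weights from system RAM once. The function name, the `efficiency` factor, and every number in the example call are placeholders I made up for illustration, not measurements from the setup above.

```python
def estimate_decode_tps(active_params_billion, bytes_per_param,
                        mem_bandwidth_gb_s, efficiency=0.5):
    """Rough decode tokens/sec if active weights are read from RAM once per token.

    All inputs are assumptions supplied by the caller:
      active_params_billion: parameters touched per token (MoE active set), in billions
      bytes_per_param:       storage per parameter (e.g. 0.5 for 4-bit quantization)
      mem_bandwidth_gb_s:    theoretical system memory bandwidth in GB/s
      efficiency:            fraction of that bandwidth actually achieved (a guess)
    """
    bytes_per_token = active_params_billion * 1e9 * bytes_per_param
    usable_bytes_per_second = efficiency * mem_bandwidth_gb_s * 1e9
    return usable_bytes_per_second / bytes_per_token


# Hypothetical inputs only: swap in your own model and hardware figures.
print(estimate_decode_tps(active_params_billion=10,
                          bytes_per_param=0.5,
                          mem_bandwidth_gb_s=200))
```

The point is only that throughput scales with the active parameter count rather than the total, which is why MoE is the natural fit when the MLP weights live in system RAM. Real numbers will also depend on attention cost, PCIe transfers, and how well the runtime overlaps them.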