jaggederest 20 hours ago
35B A3B runs ~100 tokens a second on the best M5 Max GPU setup.
ctkhn 7 hours ago
I got around 50-60 tps on my M3 Max, so 100 tps seems very realistic for a chip two generations later with double the RAM.