Dylan16807 7 hours ago Depends on what you're doing. I'm pretty sure the bandwidth for inference isn't much.
eurekin 1 hour ago Depends on whether it's tensor parallel (TP) or pipeline parallel (PP). PP doesn't pass much between devices. TP does.
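To put rough numbers on the TP vs PP difference, here is a back-of-envelope sketch of per-token traffic during decoding. Everything in it is an assumption for illustration, not something from the thread: the function names are hypothetical, TP is assumed to do two all-reduces of the hidden state per layer (Megatron-style) over a ring, PP is assumed to hand one hidden-state vector across each stage boundary, and the model shape (hidden size 8192, 80 layers, fp16, 4 devices) is just an example.

```python
def pp_bytes_per_token(hidden_size, num_stages, bytes_per_elem=2):
    """Pipeline parallel: total bytes crossing stage boundaries for one token.

    Assumes one hidden-state vector is sent across each of the
    (num_stages - 1) boundaries.
    """
    return (num_stages - 1) * hidden_size * bytes_per_elem


def tp_bytes_per_token(hidden_size, num_layers, num_devices,
                       bytes_per_elem=2, allreduces_per_layer=2):
    """Tensor parallel: bytes each device sends for one token.

    Assumes a ring all-reduce, where every device sends
    2 * (N - 1) / N times the payload, and two all-reduces per layer.
    """
    payload = hidden_size * bytes_per_elem
    ring_factor = 2 * (num_devices - 1) / num_devices
    return num_layers * allreduces_per_layer * payload * ring_factor


# Illustrative model shape (assumed, not from the thread).
hidden, layers, devices = 8192, 80, 4

pp = pp_bytes_per_token(hidden, num_stages=devices)
tp = tp_bytes_per_token(hidden, layers, devices)

print(f"PP: {pp / 1024:.0f} KiB per token")
print(f"TP: {tp / 1024:.0f} KiB per token per device")
print(f"TP / PP ratio: {tp / pp:.0f}x")
```

Under these assumptions TP moves far more data than PP, and it does so as many small synchronizations per token, so link latency can matter as much as raw bandwidth.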