Comment by handfuloflight

7 months ago

Okay. We've tried to define agents. Now let's try to define orchestration.

3 comments

handfuloflight

And make it more complicated than K8s

jliptzin 7 months ago
Not possible
- vajrabum 7 months ago
  
  The platforms I've seen live on top of Kubernetes, so I'm afraid it is possible. nvidia-docker, all the CUDA libraries and drivers, NCCL, vLLM, ... Large-scale distributed training and inference are complicated beasties, and the orchestration for them is too.