Comment by handfuloflight

2 days ago

Okay. We've tried to define agents. Now let's try to define orchestration.

And make it more complicated than K8s

  • Not possible

    • The platforms I've seen live on top of Kubernetes, so I'm afraid it is possible: nvidia-docker, all the CUDA libraries and drivers, NCCL, vLLM, ... Large-scale distributed training and inference are complicated beasties, and the orchestration for them is too.