Comment by russ

1 year ago

Which components feel ad hoc?

In most real applications, the agent has more logic (function calling, RAG, etc.) than simply relaying a stream to the model server. In those cases, you want it to be a separate service/component that can be scaled independently.

Essentially, I think the LiveKit value is that it is an SFU that works, with signalling, and that the SDKs exist. My experience is that people radically overstate how hard signalling is and underestimate SFU complexity, especially with fast failover.

As a higher-level API, it is arguably doomed to failure, thanks to the madness of the domain. (The part that sticks in my mind is audio device switching on Android.) WebRTC products always seem to end up with the consumer needing to know way more of the internals than is healthy. As such, I think once you are sufficiently good at using LiveKit, you are less likely to pick it for your next product, because you will be able to roll your own far more easily. That is, unless the value you were getting from it actually was the SFU infrastructure and not the SDKs.

The OpenAI case is so point-to-point that doing WebRTC for that is, honestly, really not hard at all.

  • You really don’t need to know about WebRTC at all when you use LiveKit. That’s largely thanks to the SDKs abstracting away all the complexity. Having good SDKs that work across every platform with consistent APIs is more valuable than the SFU imo. There are other options for SFUs and folks like Signal have rolled their own. Try to get WebRTC running on Apple Vision Pro or tvOS and let me know if that’s no big deal.

    • > Try to get WebRTC running on Apple Vision Pro or tvOS and let me know if that’s no big deal.

      [EDIT: I probably shouldn't mention that]. I have some experience of getting WebRTC up on new platforms, and it's not as bad as all that. libwebrtc is a remarkably solid library, especially given the domain it's in.

      I obviously do not share your opinion of the SDKs.
