Comment by Sean-Der
2 hours ago
> use WebRTC and deploy selective forwarding units, which are going to be something custom
Would you mind explaining more? If you are doing WHIP/WHEP you should be able to drop in Broadcast Box/MediaMTX etc. and switch out servers without anyone noticing. You can use browser/mobile/ffmpeg/OBS etc. and get the same behavior. I care a lot about the broadcast space and want to learn about other problems.
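To make the "switch out servers" point concrete, here is a minimal sketch of the WHIP publish request as I understand the protocol: an HTTP POST of the SDP offer with `Content-Type: application/sdp`, which the server answers with `201 Created`, an SDP answer, and a `Location` header for the session. The endpoint URLs, the token, and the `build_whip_request` helper are hypothetical, and the request is only built here, not sent.

```python
import urllib.request
from typing import Optional


def build_whip_request(endpoint: str, sdp_offer: str,
                       token: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a WHIP publish request for an SDP offer."""
    headers = {"Content-Type": "application/sdp"}
    if token:
        # Bearer-token auth is optional and depends on the server setup.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        endpoint,
        data=sdp_offer.encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Placeholder only; a real offer comes from the WebRTC stack (browser, OBS, Pion...).
PLACEHOLDER_OFFER = "v=0\r\n"

# Hypothetical endpoints on two different servers: only the URL differs.
req_a = build_whip_request("https://server-a.example/whip/mystream", PLACEHOLDER_OFFER)
req_b = build_whip_request("https://server-b.example/whip/mystream", PLACEHOLDER_OFFER)
```

Since the client side is just this POST plus ordinary WebRTC, pointing it at a different server should only mean changing the endpoint URL.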
> subtly speed up audio/video to keep everything in sync
You can use https://webrtc.googlesource.com/src/+/refs/heads/main/docs/n... to add more delay (if you want to force more buffering). Or if you don't link the media together (via MediaStream) you don't get the behavior you describe either!
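For a feel of what the playout delay extension carries, here is a small sketch that packs the payload. It assumes the layout as I read it in the extension docs: two 12-bit fields (minimum then maximum delay) in units of 10 ms, three bytes total. The `encode_playout_delay` helper is hypothetical, and it covers only the payload, not the RTP header extension ID/length byte, so check it against the linked doc before relying on it.

```python
GRANULARITY_MS = 10   # assumed unit size of each field
MAX_UNITS = 0xFFF     # 12 bits per field


def encode_playout_delay(min_ms: int, max_ms: int) -> bytes:
    """Pack min/max playout delay into the 3-byte extension payload."""
    min_units = min_ms // GRANULARITY_MS
    max_units = max_ms // GRANULARITY_MS
    if not (0 <= min_units <= max_units <= MAX_UNITS):
        raise ValueError("delays must satisfy 0 <= min <= max <= 40950 ms")
    # Min delay occupies the high 12 bits, max delay the low 12 bits.
    return ((min_units << 12) | max_units).to_bytes(3, "big")


def decode_playout_delay(payload: bytes) -> tuple:
    """Inverse of encode_playout_delay; returns (min_ms, max_ms)."""
    value = int.from_bytes(payload, "big")
    return ((value >> 12) * GRANULARITY_MS, (value & MAX_UNITS) * GRANULARITY_MS)
```

Raising the minimum is the "force more buffering" case: the receiver is asked not to play frames out earlier than that delay.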
> capture each participant's audio individually
That's a neat problem. I haven't solved this one myself, I wonder if it's easier with RtpTransport or insertable streams?
Regarding SFUs - with something like HLS, I can really easily scale up using something like a caching CDN (not entirely sure if that's the right term). The idea goes: I distribute the HLS media playlist and have my media segment entries prefixed with a caching/CDN service. The service is configured with the actual origin server, and when a segment isn't in the CDN, the CDN fetches it from the origin on demand. That was a nice option when I was doing Owncast streaming, since I really only paid based on viewership and just had to make sure I had the correct cache-related headers on my media segments.
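Roughly what that prefixing looks like, as a sketch rather than what Owncast actually does: rewrite every segment URI in the media playlist to point at the CDN, and serve playlists and segments with different cache headers. The CDN hostname, the helper names, and the specific `max-age` values are assumptions for illustration. It only touches URI lines, so URIs inside tags (for example `#EXT-X-MAP`) are left alone.

```python
from urllib.parse import urljoin


def prefix_segments(playlist: str, cdn_base: str) -> str:
    """Point every segment URI in a media playlist at cdn_base."""
    out = []
    for line in playlist.splitlines():
        stripped = line.strip()
        # Lines starting with '#' are tags or comments; other non-empty
        # lines are segment URIs.
        if stripped and not stripped.startswith("#"):
            out.append(urljoin(cdn_base, stripped))
        else:
            out.append(line)
    return "\n".join(out) + "\n"


def cache_control(path: str) -> str:
    """Pick a Cache-Control value (the durations are illustrative guesses)."""
    if path.endswith(".m3u8"):
        # A live playlist changes every few seconds, so cache it briefly.
        return "public, max-age=2"
    # A published segment never changes, so the CDN can keep it.
    return "public, max-age=3600, immutable"


playlist = (
    "#EXTM3U\n"
    "#EXT-X-VERSION:3\n"
    "#EXT-X-TARGETDURATION:6\n"
    "#EXT-X-MEDIA-SEQUENCE:101\n"
    "#EXTINF:6.0,\n"
    "stream-101.ts\n"
    "#EXTINF:6.0,\n"
    "stream-102.ts\n"
)
rewritten = prefix_segments(playlist, "https://cdn.example.com/live/")
```

Viewers then fetch segments from the CDN, and the origin only sees one request per segment per cache miss.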
Or alternatively - I can push media segments up to a CDN and distribute that way, using an S3-compatible service, or just rsyncing to a server with better bandwidth, etc. One thing I didn't care for - again back when I was broadcasting with Owncast - was that I needed to make sure old media segments were expired, otherwise I would rack up an insane bill. I had a 24/7 Owncast stream, and if you're not on top of expiring media segments with your CDN, it gets expensive fast.
The overall idea is - serving HLS is ultimately serving files, and there's a good amount of tooling for that.
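The expiry chore can be sketched as a pure function: anything in storage that the current playlist no longer references, and that is older than a grace period, is safe to delete. The `stored` mapping (object name to upload timestamp), the 60-second grace period, and the function names are assumptions; the actual listing and deletion would go through whatever storage API is in use.

```python
def referenced_segments(playlist: str) -> set:
    """Return the segment URIs listed in a media playlist."""
    return {
        line.strip()
        for line in playlist.splitlines()
        if line.strip() and not line.strip().startswith("#")
    }


def segments_to_expire(stored: dict, playlist: str, now: float,
                       grace_seconds: float = 60.0) -> list:
    """List stored segments that are unreferenced and past the grace period.

    stored maps object name -> upload time (seconds since the epoch).
    """
    live = referenced_segments(playlist)
    return sorted(
        name
        for name, uploaded_at in stored.items()
        if name not in live and now - uploaded_at > grace_seconds
    )


current_playlist = "#EXTM3U\n#EXTINF:6.0,\nstream-103.ts\n#EXTINF:6.0,\nstream-104.ts\n"
stored_objects = {
    "stream-101.ts": 0.0,     # old and unreferenced
    "stream-102.ts": 980.0,   # unreferenced but still inside the grace period
    "stream-103.ts": 990.0,   # still in the playlist
    "stream-104.ts": 996.0,   # still in the playlist
}
expired = segments_to_expire(stored_objects, current_playlist, now=1000.0)
```

The grace period is there so a viewer holding a slightly stale playlist can still fetch the segments it lists.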
Now that you mention it, I think WHIP/WHEP can solve some of that. I just don't know of any service where I can have that same cache/CDN-like experience, of either having the CDN connect to the origin as needed and fan-out, or where I can push up and let the service distribute. (though - now I'm googling for "webrtc sfu as a service" and see that is a thing!).
Didn't know about the playout delay extension.
Whether capturing individual audio is easier with RtpTransport or insertable streams - I'm unsure. Possibly? I just figure since MoQ is going to rely on things like WebCodecs/Web Audio, there's hopefully a bit more control over what happens with audio as it comes in.
I'll admit though - I've started noticing how often podcasts are clearly recorded using something that doesn't allow per-participant recordings, and I'm guessing that as long as the quality is good enough, most aren't worrying about it.
EDIT: feel like I should mention that Pion rules. I used it a few years ago to put together an SRT-to-WebRTC thing and an RTMP-to-WebRTC thing to use with Janus Gateway, and it was so easy.