Comment by englishm

15 hours ago

Hi! Cloudflare MoQ dev here, happy to answer questions!

Thanks for the award, kixelated. xD

> No head-of-line blocking: Unlike TCP where one lost packet blocks everything behind it, QUIC streams are independent. A lost packet on one stream (e.g., an audio track) doesn't block another (e.g., the main video track). This alone eliminates the stuttering that plagued RTMP.

It is likely that I am missing this because I'm not super familiar with these technologies, but how does this prevent desync between audio and video if there are lost packets on, for example, the audio track, while the video track isn't blocked and keeps playing?

  • Synchronized playback is usually primarily a player responsibility, not something you should (solely) rely on your transport to provide. We have had some talk about extensions to allow for synchronizing multiple tracks by group boundaries at each hop through a relay system, but it's not clear if that's really needed yet.

    Essentially though, there are typically some small jitter buffers at the receiver, and the player knows how to draw from those buffers, syncing audio and video. Someone who works more on the player side could probably go into a lot more interesting detail about approaches to doing that, especially at low latencies. I know it can also get complicated with hardware details of how long it takes an audio sample vs. a video frame to actually be reproduced once the application sinks it into the playback queue.

  • If you’re delivering audio and video separately, the blocking is irrelevant: you need to solve synchronization either way. That’s why some amount of buffering (a few frames of video at least) on the receiver is needed to hide the jitter between packets and make sure you actually have the video. You can go super low latency with no buffering, but then you need to drop video / audio when issues occur, and those drops will be visible as glitches - it depends on how good your network is.

  • Each track is further segmented into streams. So you can prioritize new > old, in addition to audio > video.

  • Depending on the streaming protocol (e.g. WARP), you can specify that the tracks (audio vs. video) need to be time-aligned, so each group (chunk of video or audio) starts at the same time and lasts the same length. I think this means you'll get resynced at the start of the next group.
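The jitter-buffer approach described in the replies above can be sketched roughly like this (a toy model with hypothetical names, not any real player's code): audio acts as the master clock, and video frames wait in a small buffer until their presentation timestamps come due, which keeps the tracks in sync even though the transport delivers them independently.

```python
from collections import deque

class JitterBuffer:
    """Tiny receiver-side buffer: frames wait until their timestamp is due."""

    def __init__(self):
        # (pts_ms, payload) pairs, assumed to arrive in decode order
        self.frames = deque()

    def push(self, pts_ms, payload):
        self.frames.append((pts_ms, payload))

    def pop_due(self, audio_clock_ms):
        """Release every frame whose PTS the audio clock has passed."""
        due = []
        while self.frames and self.frames[0][0] <= audio_clock_ms:
            due.append(self.frames.popleft())
        return due

video = JitterBuffer()
video.push(0, "frame0")
video.push(33, "frame1")
video.push(66, "frame2")

# Audio clock at 40 ms: frames 0 and 1 are due, frame 2 still waits.
print([pts for pts, _ in video.pop_due(40)])  # -> [0, 33]
```

A lost audio packet then stalls the audio clock (or forces a concealment gap), and video naturally waits or resyncs at the next group boundary rather than drifting ahead.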

Hi, I've got one.

Does your team have any concrete plans to reduce the TCP vs. QUIC diff with respect to goodput [1]? The linked paper claims seeing up to a 9.8% video bitrate reduction from HTTP/2 (TCP) to HTTP/3 (QUIC). Obviously, MoQ is based on a slightly different stack, so the results don't exactly generalize. I can imagine the problems are similar, though.

(I find this stuff fascinating, as I spent the last few months investigating the AF_XDP datapath for MsQuic as part of my master's thesis. I basically came to the conclusion that GSO/GRO is a better alternative and that QUIC desperately needs more hardware offloads :p)

[1]: https://arxiv.org/pdf/2310.09423

  • Good question! I can't speak concretely to our plans for optimizations at that level of the stack at this stage, but broadly speaking it's true that QUIC currently lags behind some of the performance optimizations that TCP has developed over the years, particularly around crypto, where hardware offload capabilities can have a major impact.

    The good news is that there are strong incentives for the industry to develop performance optimizations for HTTP/3, and by also building atop QUIC, MoQ stands to benefit when such QUIC-stack optimizations come along.

    Regarding GSO/GRO - I recently attended an ANRW presentation of a paper[1] which reached similar conclusions regarding kernel bypass. Given the topic of your thesis, I'd be curious to hear your thoughts on this paper's other conclusions.

    [1]: https://dl.acm.org/doi/10.1145/3744200.3744780

  • QUIC implementations are definitely not tuned well in practice for 600Mbps flows on low latency, low loss networks, as the paper attests. But I don’t think almost any uses of video streaming fit that bill. Even streaming 4K video via Netflix or similar is tens of Mbps. In general if you don’t have loss or the need to rapidly establish connections, QUIC performance is not even theoretically better, let alone in practice.

    P.S. if there’s a public link to your master's thesis - please post it! I’d love to read how that shook out, even if AF_XDP didn’t fit in the end.

Hi! I have a few :)

How close are we to having QUIC actually usable in browsers (meaning both browsers and infrastructure support it, and it 'just works')?

How does QUIC get around the NAT problem? WebRTC requires STUN/TURN to get through full cone NAT, particularly the latter is problematic, since it requires a bunch of infra to run.

  • QUIC is already quite widely used! We see close to 10% of HTTP requests using HTTP/3: https://radar.cloudflare.com/adoption-and-usage

    As for the NAT problem, that's mainly an issue for peer-to-peer scenarios. If you have a publicly addressable server at one end, you don't need all of the complications of a full ICE stack, even for WebRTC. For cases where you do need TURN (e.g. for WebRTC with clients that may be on networks where UDP is completely blocked), you can use hosted services, see https://iceperf.com/ for some options.

    And as for MoQ - the main thing it requires from browsers is a WebTransport implementation. Chrome and Firefox already have support and Safari has started shipping an early version behind a feature flag. To make everything "just work" we'll need to finish some "streaming format" standards, but the good news is that you don't need to wait for that to be standardized if you control the original publisher and the end subscriber - you can make up your own and the fan out infrastructure in the middle (like the MoQ relay network we've deployed) doesn't care at all what you do at that layer.

    • Thanks for the answer!

      Unfortunately the NAT problem is more common than you might think :( Lots of corporate networks use full cone NAT (I know ours does), and so does AWS (if you don't have a public IP but go through an IGW), so some sort of NAT punchthrough seems to be necessary for WebRTC.

      I wonder if WebTransport has its own solution to the problem.

      But I guess you can always rely on turn - by the way, does MoQ have some sort of ICE negotiation mechanism, or do we need to build that on top?


  • Chrome and Firefox support WebTransport. Safari has announced intent to support it and they already use QUIC under the hood for HTTP/3.

    Cloud services are pretty TCP/HTTP centric which can be annoying. Any provider that gives you UDP support can be used with QUIC, but you're in charge of certificates and load balancing.

    QUIC is client->server so NATs are not a problem; 1 RTT to establish a connection. Iroh is an attempt at P2P QUIC using similar techniques to WebRTC but I don't think browser support will be a thing.

  • And while WebRTC solves some rather hard problems like P2P transfers, the beauty of DASH is that it can rely on existing servers and clients. So I am also quite puzzled by the comparison, particularly as the post does not go into much detail on the path forward. I feel sometimes we are rather getting back to an AOL-style Internet that just connects dedicated clients to a CDN.

hi, I work on our webrtc streaming over at getstream.io

webrtc has a lot of annoying setup. but after it connects it offers low latency. how do you feel MoQ compares after the connection setup is completed? any advantages/ any issues?

  • QUIC/WebTransport gives you the ability to drop media, either via stream or datagrams, so you can get the same sort of response to congestion as WebRTC. However, one flaw with MoQ right now is that Google's GCC congestion controller prioritizes latency over throughput, while QUIC's TCP-based congestion controllers prioritize throughput over latency. We can improve that on the server side, but will need browser support on the client side.

    As for the media pipeline, there's no latency on the transmission side and the receiver can choose the latency. You literally have to build your own jitter buffer and choose when to render individual frames.
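The "new > old, audio > video" prioritization mentioned upthread can be sketched like this (a toy model with made-up names, not the actual MoQ or relay implementation): each group of frames rides its own stream, audio outranks video, and under congestion groups too far behind the live edge are abandoned (in QUIC terms, RESET_STREAM) instead of blocking newer data.

```python
def send_order(streams):
    """Sort pending streams: audio before video, then newest group first."""
    track_rank = {"audio": 0, "video": 1}
    return sorted(streams, key=lambda s: (track_rank[s["track"]], -s["group"]))

def drop_stale(streams, latest_group, max_lag=2):
    """Under congestion, abandon groups too far behind the live edge."""
    return [s for s in streams if latest_group - s["group"] <= max_lag]

pending = [
    {"track": "video", "group": 7},   # stale: 3 groups behind
    {"track": "video", "group": 10},
    {"track": "audio", "group": 10},
    {"track": "video", "group": 9},
]

live = drop_stale(pending, latest_group=10)
print([(s["track"], s["group"]) for s in send_order(live)])
# -> [('audio', 10), ('video', 10), ('video', 9)]
```

Because each group is an independent stream, abandoning a stale one costs nothing for the streams still in flight, which is exactly the property head-of-line-blocking-free transports buy you.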

Is the load balancing of the relays out of scope? It doesn't seem to be addressed in the write up unless I missed it.

  • EDIT: Sorry I just noticed this was directed to Cloudflare. They're using the same architecture as Cloudflare Realtime, their WebRTC offering.

    `relay.moq.dev` currently uses GeoDNS to route to the closest edge. I'd like to use anycast like Cloudflare (and QUIC's preferred_address), but cloud offerings for anycast + UDP are limited.

    The relay nodes currently form a mesh network and gossip origins between themselves. I used to work at Twitch on the CDN team, so I'd like to eventually add tiers, but that's overkill with near zero users.

    The moq-relay and terraform code is all open source if you're super curious.

    • Anycast can have serious reliability challenges. It was common at GCP for a small-QPS anycast user to have their load balancers nuked in a given PoP, because the PoP was backed by a single machine while BGP still showed it as the best route. The major DNS-based offerings don't have such issues.


    • How much success have you had with GeoDNS? We've seen it fail when users are using privacy-respecting resolvers like 1.1.1.1. It gets the continent right but fails at the city/state level.


  • I plan to cover more of the internal implementation details at a future date, possibly at a conference this fall.

    But I can at least say that we use anycast to route to a network-proximal colo.