Comment by rkagerer

18 hours ago

it only manages one stream at a time

I'll take flak for saying it, but I feel web developers are partially at fault for laziness on this one. I've often seen them trigger a swath of connections (e.g. for uncoordinated async events), when carefully managed multiplexing over one or a handful will do just fine.

E.g., in prehistoric times I wrote a JavaScript library that let you queue up several downloads over one stream, with control over prioritization and cancelability.

It was used in a GreaseMonkey script on a popular dating website, to fetch thumbnails and other details of all your matches in the background. Hovering over a match would bring up all their photos, and if some hadn't been retrieved yet they'd immediately move to the top of the queue. I intentionally wanted to limit the number of connections, to avoid oversaturating the server or the user's bandwidth. Idle time was used to prefetch all matches on the page (IIRC in a sensible order responsive to your scroll location). If you picked a large enough pagination, then stepped away to top up your coffee, by the time you got back you could browse through all of your recent matches instantly, without waiting for any server roundtrip lag.
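The core of a queue like that can be sketched in a few dozen lines of modern JavaScript. To be clear, this is not the original library — names like `DownloadQueue`, `bump`, and `cancel` are my own illustrative choices — it just shows the pattern: one job in flight at a time, with reprioritization and cancellation of jobs still waiting.

```javascript
// Sketch of a single-stream download queue with prioritization and
// cancelability. One job runs at a time; waiting jobs can be bumped
// to the front (e.g. on hover) or cancelled outright.
class DownloadQueue {
  constructor() {
    this.queue = [];     // pending jobs, front of array runs next
    this.running = false;
  }

  // Enqueue a task (a function returning a promise, e.g. () => fetch(url)).
  // Returns a handle with the result promise plus bump/cancel controls.
  enqueue(task) {
    const job = { task, cancelled: false };
    job.promise = new Promise((resolve, reject) => {
      job.resolve = resolve;
      job.reject = reject;
    });
    this.queue.push(job);
    this._drain();
    return {
      promise: job.promise,
      bump: () => {                        // move to the front of the queue
        const i = this.queue.indexOf(job);
        if (i > 0) {
          this.queue.splice(i, 1);
          this.queue.unshift(job);
        }
      },
      cancel: () => {                      // drop the job if still waiting
        job.cancelled = true;
        const i = this.queue.indexOf(job);
        if (i !== -1) this.queue.splice(i, 1);
        job.reject(new Error('cancelled')); // caller should handle this
      },
    };
  }

  async _drain() {
    if (this.running) return;              // only one job in flight
    this.running = true;
    while (this.queue.length) {
      const job = this.queue.shift();
      if (job.cancelled) continue;
      try {
        job.resolve(await job.task());
      } catch (e) {
        job.reject(e);
      }
    }
    this.running = false;
  }
}
```

Calling `bump()` on a queued thumbnail when the user hovers over its match gives you exactly the "moves to the top of the queue" behavior described above, while the single in-flight job keeps the server and the user's bandwidth from being saturated.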

It was pretty slick. I realize these days modern stacks give you multiplexing for free, but to put it in context, this was created in the era before even jQuery was well-known.

Funny story, I shared it with one of my matches and she found it super useful but was a bit surprised that, in a way, I was helping my competition. Turned out OK... we're still together nearly two decades later and now she generously jokes I invented Tinder before it was a thing.

Sure, you can reimplement multiplexing at the application level, but it just makes more sense to do it at the transport level, so that people don't have to do it in JavaScript.

This is wonderful to hear. I have a naive question: is this the reason most websites/web servers absolutely need CDNs (apart from their edge capabilities) — because CDNs understand caching much better than a web developer does? I would have thought the person closer to the user's access patterns would know the optimal caching strategy.

  • Most websites do not need CDNs.

    CDNs became popular back in the old days, when some people thought that if two websites used jquery-1.2.3.min.js, the CDN copy could be cached once and the second site would load quicker. These days, browsers don't do that: they ignore cached assets from other websites (cache partitioning), because a shared cache can be used to track users across sites, and they value privacy over performance in this case.

    There are some reasons CDNs might be helpful. Edge capability is probably the most important one. Another is that serving lots of static data can be a complicated task for a small website, so it makes sense to offload it to a specialised service. These days, CDNs have gone beyond static data. They can hide your backend, so the public never learns its address and can't DDoS it. They can handle TLS for you. They can filter bots, Tor, and traffic from countries you don't like. All in a few clicks in the dashboard, no need to implement complicated solutions.

    But nothing you couldn't write yourself in a few days, really.

  • Generally, by default CDNs don't necessarily cache anything. They just (try to) respect the cache headers that the developer provides in the response from the origin web server.

    So it's still up to the developer to provide the correct headers; otherwise you don't get much of a benefit.

    That said, some of them will apply some default caching when a response is recognized as a static file, etc.
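As a concrete illustration of "the developer provides the headers": a shared cache like a CDN keys off `Cache-Control` directives such as `public`, `max-age`, and `s-maxage`. The path patterns and lifetimes below are illustrative assumptions, not a standard policy.

```javascript
// Sketch of origin-side Cache-Control choices a CDN would respect.
// The regex and the lifetimes are illustrative, not prescriptive.
function cacheControlFor(path) {
  if (/\.(js|css|png|jpe?g|woff2?)$/.test(path)) {
    // Static assets (ideally content-fingerprinted): long-lived and
    // cacheable by both the browser and any shared cache such as a CDN.
    return 'public, max-age=31536000, immutable';
  }
  // HTML / dynamic responses: the CDN may hold them briefly (s-maxage
  // applies only to shared caches), but browsers must revalidate (no-cache).
  return 'no-cache, s-maxage=60';
}

// In a Node.js handler you would apply it per response, e.g.:
//   res.setHeader('Cache-Control', cacheControlFor(req.url));
```

Without headers like these from the origin, most CDNs will simply pass requests through, which is why "put it behind a CDN" alone often yields little benefit.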

[Not a web dev but] I thought each site gets a handful of connections (4) to each host and more requests would have to wait to use one of them. That's pretty close to what I'd want with a reasonably fast connection.

  • That's basically right. Back when I made this, many servers out there still limited you to just 2 (or sometimes even 1) concurrent connections. As sites became more media-heavy that number trended up. HTTP/2 can handle many concurrent streams on one connection, though I'm not sure you get as fine-grained control as with the library I wrote (maybe!).
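For what it's worth, modern browsers do expose a little of that per-request control on top of HTTP/2's multiplexing: `fetch()` accepts a `priority` hint (currently honored by Chromium-based browsers; other engines ignore it) and an `AbortSignal` for cancellation. The function name and URL below are placeholders.

```javascript
// Sketch: per-request priority hint plus cancellation with fetch().
// 'priority' is a Priority Hint — advisory only, and not honored by
// every browser engine. The URL and function name are made up.
const controller = new AbortController();

function fetchThumb(url, important) {
  return fetch(url, {
    priority: important ? 'high' : 'low', // hint to the request scheduler
    signal: controller.signal,            // lets us cancel mid-flight
  });
}

// Later, e.g. when the user scrolls away:
//   controller.abort();
```

That's coarser than a hand-rolled queue — you can hint and cancel, but you can't reorder requests already handed to the browser the way the library above could.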