Comment by toast0

8 months ago

Not my idea, but I supported it. Originally, client build scripts resolved the service names at build time. That worked OK because our hosts tended to have a lot of longevity and DNS tends to work, but things got a little better when we were more intentional about selecting which servers went into the list and kept track of which ones were in it, so retirements could be managed a bit better. And I pushed until we got agreement on a set of FB load balancer IPs to include as well.
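To make the build-time step concrete, here is a minimal sketch of resolving service names into a fallback IP list and appending a pinned set of load balancer IPs. It is only an illustration under assumptions: the function names, the `resolver` parameter, and the skip-on-failure behaviour are hypothetical, not what the actual build scripts did.

```python
import socket
from typing import Callable, Iterable, List


def resolve_with_dns(hostname: str) -> List[str]:
    """Resolve a hostname to its addresses using the system resolver."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return [info[4][0] for info in infos]


def build_fallback_list(
    hostnames: Iterable[str],
    pinned_ips: Iterable[str] = (),
    resolver: Callable[[str], List[str]] = resolve_with_dns,
) -> List[str]:
    """Return a de-duplicated, order-preserving list of IPs to embed in a build."""
    seen = set()
    result = []
    for host in hostnames:
        try:
            addresses = resolver(host)
        except OSError:
            # Assumption: a name that fails to resolve at build time is skipped.
            continue
        for ip in addresses:
            if ip not in seen:
                seen.add(ip)
                result.append(ip)
    # Pinned IPs (e.g. agreed load balancer addresses) go last, without duplicates.
    for ip in pinned_ips:
        if ip not in seen:
            seen.add(ip)
            result.append(ip)
    return result
```

Passing the resolver in as a parameter keeps the list-building logic testable without network access; a real build step would more likely read an explicitly curated server list than resolve whatever DNS returns that day.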

Nice. Thanks! Another peculiar thing I observed (way back when) was... in the lossiest of lossy EDGE/2G environments in rural India, only WhatsApp managed to work (email clients, browsers, other chat apps didn't). Not only was WhatsApp able to send/receive messages, it could also upload/download ~100KB PDFs (over what seemed like a slow 20 to 30 minute process, but it did complete alright). If it is okay to disclose, did WhatsApp build its own protocol/impl atop TCP/UDP for such scenarios?

  • The EU marketplace disclosures on protocol seem to be pretty close to what I remember of the non-public protocols.

    Chat is basically binary-encoded XMPP, with essentially a compression dictionary, so per-IQ overhead is minimal. Especially for the start-of-connection stuff (login, offline message delivery), we counted bytes and made accommodations for typical network issues we would see. Not acking a big chunk of offline messages after a few tries? Let's send one at a time and see if that works, etc.
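    The "big chunk, then one at a time" fallback can be sketched roughly as below. This is an illustration only: `deliver_offline`, `send_and_wait_ack`, and the retry count of 3 are hypothetical names and values, not the real server logic.

```python
from typing import Callable, List, Sequence


def deliver_offline(
    messages: Sequence[bytes],
    send_and_wait_ack: Callable[[Sequence[bytes]], bool],
    max_batch_tries: int = 3,
) -> List[bytes]:
    """Deliver queued messages, falling back to one-at-a-time sends.

    Returns the messages that were acknowledged, in order.
    """
    if not messages:
        return []
    # First try the whole batch a few times.
    for _ in range(max_batch_tries):
        if send_and_wait_ack(messages):
            return list(messages)
    # The batch never got acked: retry with the smallest possible unit.
    delivered = []
    for message in messages:
        if not send_and_wait_ack([message]):
            break  # stop at the first unacked message to preserve ordering
        delivered.append(message)
    return delivered
```

    On a link that drops anything large, the batch attempts fail and the per-message loop still makes progress, which is the point of the fallback.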

    Our socket timeouts were rather long as well. Before the move into Facebook infra, servers were in the US only, and rural India is a long ways from the US; and last mile contention on 2G gets real rough out there too... I want to say timeouts were on the order of 30 seconds?

    Multimedia (attachments) was HTTPS, with resumption. I don't remember the full history; originally I don't think we had resumption on uploads, since there's some coordination required for that. IIRC it started as more or less: send an IQ saying you want to upload a file, with a hash of the file, and get a response with either the download URL if the file was already complete, or where to upload and what byte to start with if not. I think it's likely different now, but probably still HTTPS based. I wanted to move it so multimedia would be either multiplexed on the chat channel or using a similar protocol to the chat channel, but I didn't have the pull, and I got redirected into pushing TLS 1.3 into our Android client's mms upload/download instead. I didn't do the code there, just prototyping to show it could be possible, and then was more of a facilitator than a contributor. I'm not sure I got all the benefits I was looking for, but there were some, and it kept me busy while I was wrapping up our pre-FB hosting and my time at WA.
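    As a rough sketch of the upload coordination described above (announce a file hash, get back either a download URL or an upload location plus starting byte): everything here is assumed for illustration, including SHA-256 as the hash, the `UploadReply` shape, the in-memory `received` store, and the `media.example.invalid` URLs.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict


@dataclass
class UploadReply:
    complete: bool  # True if the server already has the whole file
    url: str        # download URL if complete, otherwise upload URL
    offset: int     # byte offset the client should resume from


def file_hash(data: bytes) -> str:
    """Hash used to identify the file (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def negotiate_upload(received: Dict[str, bytes], digest: str, total_size: int) -> UploadReply:
    """Server side: answer a 'want to upload' request for a given hash."""
    have = len(received.get(digest, b""))
    if have >= total_size:
        return UploadReply(True, f"https://media.example.invalid/d/{digest}", total_size)
    return UploadReply(False, f"https://media.example.invalid/u/{digest}", have)


def resume_upload(received: Dict[str, bytes], data: bytes) -> UploadReply:
    """Client side: negotiate, send only the missing tail, then re-check."""
    digest = file_hash(data)
    reply = negotiate_upload(received, digest, len(data))
    if not reply.complete:
        # Stand-in for an HTTPS upload of the remaining bytes.
        received[digest] = received.get(digest, b"") + data[reply.offset:]
        reply = negotiate_upload(received, digest, len(data))
    return reply
```

    The useful property is that a dropped connection only costs the bytes in flight: the next negotiation returns the offset already stored, so a 100KB file over 2G never restarts from zero.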