Comment by tptacek

3 years ago

In fairness, I don't know if we kept the default. I'm responding to two independent things at this point: first, there are definitely systems where 200ms delays have rippling impacts, and second, leader elections aren't always benign.

(Consul would, I'm sure, converge eventually regardless of the election frequency, but that doesn't mean everything that relies on Consul will tolerate those delays.)

I don't have much of a take here, beyond that I don't think you can extrapolate as much from what's on the 6.824 pages as you might have done here. Certainly, in a system where 200ms is the difference between "healthy" and "not healthy" status on a peer relationship, I'd think you'd want Nagle disabled. But I haven't thought carefully about this, or looked that closely at the typical packet flow between Consul nodes. I could be wrong about all of this; more reason not to give me any money.
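For reference, "Nagle disabled" means setting the `TCP_NODELAY` option on the socket. A minimal sketch in Python, assuming a plain TCP socket; the helper name `disable_nagle` is hypothetical, and nothing here is taken from Consul's code or shows what Consul actually does:

```python
import socket


def disable_nagle(sock: socket.socket) -> bool:
    """Turn off Nagle's algorithm on a TCP socket.

    Returns True if the kernel reports TCP_NODELAY as set afterwards.
    `disable_nagle` is an illustrative helper, not part of any library.
    """
    # With TCP_NODELAY set, small writes are sent immediately instead of
    # being held back while earlier data is still unacknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("TCP_NODELAY enabled:", disable_nagle(s))
```

The option can be set before the socket is connected, so a client would typically call this right after creating the socket.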

Later

Per the comment upthread, I haven't bothered to check which parts of this packet flow are even TCP to begin with.

4 comments

pclmulqdq 3 years ago

I've never directly used Consul's internals, but I'm guessing it uses Stubby, which is built on top of TCP.

tptacek 3 years ago

It does Serf over UDP, but I get fuzzy on the integration of Serf and Consul.
- jen20 3 years ago
  
  Raft and the Consul RPC API use TCP, Serf uses both TCP and UDP.
  While the Consul RPC API may have grown options to use gRPC (I forget now), Raft uses length-prefixed msgpack PDUs.
- pclmulqdq 3 years ago
  
  Whoops, I thought this was a Google product, given the discussion. Stubby is basically gRPC internal to Google.
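To illustrate what "length-prefixed PDUs" over TCP means in jen20's reply: TCP delivers a byte stream with no message boundaries, so each message is preceded by its length. A rough sketch in Python, assuming a 4-byte big-endian length header (the header size and the names `encode_frame` and `decode_frames` are assumptions for illustration, not Raft's or Consul's actual wire format). The payload is treated as opaque bytes; in the system described it would be msgpack-encoded.

```python
import struct
from typing import List, Tuple

# Assumed header: 4-byte unsigned big-endian payload length.
HEADER = struct.Struct(">I")


def encode_frame(payload: bytes) -> bytes:
    """Prefix a payload with its length so the receiver can find its end."""
    return HEADER.pack(len(payload)) + payload


def decode_frames(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Split a received byte buffer into complete payloads.

    Returns the complete payloads plus any trailing bytes that belong to
    a frame that has not fully arrived yet.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # partial frame: wait for more bytes from the socket
        frames.append(buffer[offset + HEADER.size:end])
        offset = end
    return frames, buffer[offset:]
```

A receiver would append each `recv()` result to the leftover bytes and call `decode_frames` again, which is why a small message held back by Nagle delays the whole PDU behind it.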