Comment by RKearney

10 years ago

I wouldn't call this obscure by any means. You'll find Ethernet flow control enabled on just about every datacenter network, especially those that have a combined network and storage fabric.

Some datacenters enable Priority Flow Control (PFC), which is different in that it pauses only the traffic with a specific PCP (the Priority Code Point in the 802.1Q VLAN tag). They assign storage traffic a specific VLAN priority and treat it as lossless with flow control, but the rest of the traffic is unaffected.

The mechanism here, Pause, is an abomination which should never be enabled.
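To make the difference between plain Pause and PFC concrete, here is a rough sketch of how the two frames are laid out on the wire. It assumes the usual MAC Control layout (reserved destination address 01:80:C2:00:00:01, EtherType 0x8808, opcode 0x0001 for Pause and 0x0101 for PFC). The function names and the example source MAC are made up for illustration, and the frames are built without the trailing FCS.

```python
import struct

# Assumed MAC Control constants (802.3x Pause and 802.1Qbb PFC).
MAC_CONTROL_DST = bytes.fromhex("0180c2000001")  # reserved multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
OPCODE_PAUSE = 0x0001
OPCODE_PFC = 0x0101
MIN_FRAME_NO_FCS = 60  # minimum Ethernet frame length, excluding the 4-byte FCS


def _finish(src_mac: bytes, payload: bytes) -> bytes:
    """Prepend the Ethernet header and zero-pad to the minimum frame size."""
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    header = MAC_CONTROL_DST + src_mac + struct.pack("!H", MAC_CONTROL_ETHERTYPE)
    return (header + payload).ljust(MIN_FRAME_NO_FCS, b"\x00")


def build_pause(src_mac: bytes, quanta: int) -> bytes:
    """Plain Pause: one timer that stops ALL traffic on the link."""
    return _finish(src_mac, struct.pack("!HH", OPCODE_PAUSE, quanta))


def build_pfc(src_mac: bytes, quanta_by_priority: dict) -> bytes:
    """PFC: an enable bitmap plus eight timers, one per 802.1Q priority."""
    enable_vector = 0
    timers = [0] * 8
    for priority, quanta in quanta_by_priority.items():
        if not 0 <= priority <= 7:
            raise ValueError("priority must be in 0..7")
        enable_vector |= 1 << priority
        timers[priority] = quanta
    return _finish(src_mac, struct.pack("!HH8H", OPCODE_PFC, enable_vector, *timers))


# Hypothetical source address, for illustration only.
SRC = bytes.fromhex("020000000001")
pause_frame = build_pause(SRC, 0xFFFF)      # pause everything
pfc_frame = build_pfc(SRC, {3: 0xFFFF})     # pause only priority 3
```

The point of the comparison: the Pause frame carries a single timer for the whole link, while the PFC frame can leave seven of the eight priorities untouched.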

  • > Some datacenters enable Priority Flow Control (PFC), which is different in that it pauses only the traffic with a specific PCP (the Priority Code Point in the 802.1Q VLAN tag). They assign storage traffic a specific VLAN priority and treat it as lossless with flow control, but the rest of the traffic is unaffected.

    I don't think he had his TV in a datacenter.

    From the article:

    > After some clever deductive reasoning, a.k.a randomly unplugging cables from the router, I determined that my TV was sending these mystery frames (yes, my TV — I have a Sony X805D Android TV).

    > The mechanism here, Pause, is an abomination which should never be enabled.

    What? You can't be serious. I think you have no idea what that would cause in almost every Ethernet network. Let me tell you: a lot of packet loss that messes with TCP streams, etc.

    L2 pause frames are used by practically all Ethernet devices, and for a really good reason. Pause frames are a perfectly good way to do flow control in most networks. Not having them means a lot of lost frames and general pain in most networks... except, of course, datacenters.

    Sure, it's not a standard. But it's good enough for 99.9% of use cases. Just maybe not in a datacenter.

It's gotta be obscure if the Chrome Windows process stops responding (as the article reports), don't you think?

Or is that just the Chrome team following the letter and the spirit of the law? "if they want a PAUSE by God they'll get one..."

  • Chrome or other applications won't be aware of what is happening way down below at layer 2. To layer 4 (TCP), the pause is indistinguishable from severe network congestion.

    Just speculating, but the stop/start oscillation in traffic rate could cause code running in Chrome, such as a video codec, to exercise parts of its re-buffering code in a way that exposes a bug.

    • mplayer does the same thing on streams when the other side abruptly goes away: it hangs for ~5-10 seconds before closing.

Is there a concern that the mechanism allows for DoS? How do they mitigate the situation that the author describes?

  • If you have standards-compliant hardware, pause is point to point, not broadcast. You can configure hosts to ignore pause and also to not generate it, although that may be difficult on an embedded device, so you probably need to fix the switch or replace it with something that works.

  • Accidental pause frame DoS has been observed in the wild and AFAIK there's no known solution.
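For a sense of scale on the DoS question, here is a back-of-the-envelope sketch of how long one pause frame can hold a link. It assumes the standard definition of a pause quantum as 512 bit times and a 16-bit timer field. The function names are illustrative, not from any library.

```python
PAUSE_QUANTUM_BITS = 512   # one pause quantum = 512 bit times (assumed per 802.3x)
MAX_QUANTA = 0xFFFF        # the timer is a 16-bit field


def pause_seconds(quanta: int, link_bps: int) -> float:
    """Duration a single pause frame asks the peer to stay silent."""
    return quanta * PAUSE_QUANTUM_BITS / link_bps


def frames_per_second_to_hold(link_bps: int) -> float:
    """Rate of max-timer pause frames needed to keep the peer paused."""
    return 1.0 / pause_seconds(MAX_QUANTA, link_bps)


# Longest pause a single frame can request:
on_gigabit = pause_seconds(MAX_QUANTA, 1_000_000_000)   # about 33.6 ms
on_fast_eth = pause_seconds(MAX_QUANTA, 100_000_000)    # about 335.5 ms
```

So a device stuck sending roughly 30 max-timer pause frames per second is enough to keep a gigabit peer silent indefinitely, which is consistent with a misbehaving TV being able to stall traffic.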