2011-01-14
bufferbloat vs. wireless networks, and other stories

It seems my previous article on bufferbloat struck a chord with some people. Most people, quite rightly, didn't know what to do with it and thought it was super confusing. (It was. It was rambly and poorly edited, and reading the responses mostly reminded me how disorganized all my thoughts on the topic were. Oops. It turns out disappearing into a hole and studying something for a month straight can quickly make you lose perspective on what the average person does/doesn't know about TCP/IP and traffic shaping. Sorry about that.)

On the other hand, some people actually think there's useful information hidden in there, which there is. Other people thought I was just wrong, which I'm not; that article, confusing as it was, isn't conjecture. It's a disjointed, disorganized, overly-summarized report of my actual careful experiments and observations.

Now, vague inexact reports of my careful observations are just as useful to you as totally wrong conjectures, so I apologize for that. Perhaps I'll try again later.

Meanwhile, I do want to correct a couple of misconceptions in particular, just in case people did believe me: there are some things I wasn't claiming, either because they aren't true or because I don't know whether they're true.

Wireless networks have problems, but they are not these problems

My article was specifically about a very simple network topology that looks like this:

  computers -> wires -> router -> DSL/cable/etc -> internet.

I didn't talk about wireless anywhere in the article, somewhat on purpose, because wireless has its own problems and those new problems confound your ability to do straightforward experiments to see the original router problems I was talking about.

Now, it's safe to say that if your wireless speed is much much more than your router speed, and all your wireless traffic is internet traffic (ie. not talking to other wireless devices on your LAN), then everything in my article remains true. Any wireless problems are dwarfed by problems caused by your router's crappy queue, and can be solved by doing traffic shaping correctly.

TCP is smart, so it knows not to send data any faster than your router can handle. Since that's much slower than your local wireless network, the only queue that ever fills up is the one at the bottleneck: at your router or the ISP's router on the other end. All the boxes on your WLAN will have essentially empty queues, so it doesn't matter how their maximum queue length is configured.
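(To put numbers on that, here's the back-of-the-envelope arithmetic in Python. The link speeds and queue sizes are made-up but plausible examples.)

  # Queue delay at the bottleneck: a full queue of N packets of S bytes
  # drains at the bottleneck rate R bits/sec, adding N*S*8/R seconds of
  # latency. All the example numbers below are hypothetical.

  def queue_delay_s(packets, pkt_bytes, rate_bps):
      return packets * pkt_bytes * 8.0 / rate_bps

  # 100 full-size packets queued at a 256 kbit/sec DSL uplink:
  print(queue_delay_s(100, 1500, 256e3))  # ~4.7 seconds(!)

  # The same queue draining at 54 Mbit/sec wireless speeds barely registers:
  print(queue_delay_s(100, 1500, 54e6))   # ~0.022 seconds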

However, let's imagine that this isn't the case on your network; nowadays you might have 20 megabit cable, or 50 megabit FiOS, or whatever, which means your Internet connection is actually faster than your wireless network. Or maybe your Internet connection is a long-range shared wireless network, in which case your wireless speed is obviously not much much greater than your router's speed, as we had been assuming. Or maybe you're just connecting two computers together on your LAN, with no router involved at all. What happens then?

What happens then is, well, a whole different story than I had been talking about. And so of course the solution will have to be totally different.

Now, as it happens, I've been thinking about creating a new startup to build the ultimate wireless router - one with built-in traffic shaping that works out of the box. So while I don't have as much concrete test data for wireless as I do for wired routers, I do have quite a bit of information that comes from studying the situation in theory and a bit of practice.

Here are some things you need to know:

First, various operating systems and drivers have oversized network buffers on their wireless card drivers (and on their wired ethernet drivers too, in some cases). Since your internet router is no longer the bottleneck, suddenly the client machine's buffers are what matter, and those buffers are oversized. So yeah, if you start a big transfer, your interactive performance will suck. Solution: reduce the buffer size. Or install traffic shaping on your client machine, but that's a bit silly since it's your own LAN, so you should just fix the buffer size.
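How small? Here's a sketch of the sizing logic. The 50ms latency target and 1500-byte packet size are assumptions you'd tune, not magic numbers:

  # Pick a tx queue length that bounds worst-case queue latency:
  # qlen = rate * target_latency / packet_size.

  def sane_txqueuelen(rate_bps, target_latency_s=0.05, pkt_bytes=1500):
      return max(2, int(rate_bps * target_latency_s / (pkt_bytes * 8)))

  print(sane_txqueuelen(54e6))  # 54 Mbit/sec wireless: ~225 packets
  print(sane_txqueuelen(1e6))   # 1 Mbit/sec link: just 4 packets

On Linux you'd then apply the result with something like "ip link set dev wlan0 txqueuelen 225" (substitute your own device name, obviously).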

If you're downloading from a really dumb local server that has oversized transmit queues, same problem. You should get the server admin to lower the queue size. I guess your admin might be mean and refuse to do that, in which case you can improve your personal interactive performance when communicating with that server by installing traffic policing (downstream) on your client, but that's gross and fiddly.

Incidentally, traffic shaping/policing on a wireless network is totally disgusting because the available bandwidth is variable. How do you set the rates to 99%/80% of the bandwidth when the bandwidth keeps changing? You can't, at least not without having a direct line to the network driver to find out the current bitrate. And anyway, it's all stupid, because the right thing to do is simply lower the driver's tx queue size. Do that.
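(If you insisted on trying anyway, it would look something like the sketch below: poll the driver for its current transmit bitrate and keep re-tuning the shaper. Everything in it is an assumption - the iw output parsing, the bare tbf shaper, and especially the hope that polling once a second is fast enough to chase a radio.)

  import subprocess, time

  def current_bitrate_bps(dev):
      # Parse "tx bitrate: 54.0 MBit/s ..." out of `iw dev <dev> link`.
      out = subprocess.check_output(["iw", "dev", dev, "link"], text=True)
      for line in out.splitlines():
          if "tx bitrate" in line:
              return float(line.split(":")[1].split()[0]) * 1e6
      raise RuntimeError("no bitrate found; link down?")

  def retune_shaper(dev, fraction=0.8):
      rate = int(current_bitrate_bps(dev) * fraction)
      # A bare token-bucket shaper; a real setup needs a full qdisc tree.
      subprocess.check_call(["tc", "qdisc", "replace", "dev", dev, "root",
                             "tbf", "rate", "%dbit" % rate,
                             "burst", "32kbit", "latency", "50ms"])

  while True:  # chase the moving target, badly
      retune_shaper("wlan0")
      time.sleep(1)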

Next, rumour has it that if one machine starts blasting data all over your wireless network at top speed, this will hurt performance of other machines on the same WLAN. I personally have never experienced this, so warning: I'm now talking theory, not experimental evidence. But if it were to happen, I claim that this can't possibly have anything to do with the queue size of any of your devices. Why? Because no matter how big its queue, the wireless device on any computer can't send faster than its maximum transmit rate. A shorter or longer queue affects latency on the machine with the queue, and on any machine forwarding through that machine, but to everyone else, it just looks like the same packets coming out at the same speed.

(If your problem is that one machine on your wireless network starts blasting tons of data and overwhelms your router, well, yeah. Your router queue is too big. But if it overwhelms your router, we're in the case we talked about last time, not the current case where your router is too fast to ever be overwhelmed.)

Now, just because the problem isn't caused by your queue length doesn't mean I'm claiming it doesn't exist. There are lots of reasons the problem might exist, especially on a wireless network.

First of all, 802.11 wireless networks retransmit packets when they get dropped. This sounds obvious - doesn't everybody just retransmit stuff? - but no, it's very strange. In fact, ethernet doesn't retransmit stuff. If it sends a packet and the packet is lost due to noise or because the receiver has no buffer space, then tough. The packet is lost. It's TCP's job to retransmit it.

802.11 wireless networks don't work the same way. The reason is this: TCP depends on information about how many packets are lost and when. TCP assumes it's on a wired network, which is generally reliable, so when a packet is dropped, it's almost always because the receiver (or actually, any router along the path) has no more buffers. If that's true, it means the router is overwhelmed, so TCP needs to slow down... which it does. And that's how the whole internet works! That's why you don't need to know the speed of my DSL modem in order to send data to my DSL modem at exactly the speed my DSL modem can tolerate; no more, and no less. It's very cool.
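(The feedback loop is simple enough to sketch. This is a cartoon of TCP's additive-increase/multiplicative-decrease behaviour, not any real stack's code.)

  # Cartoon of TCP congestion avoidance: grow the window slowly,
  # halve it whenever the network signals loss.

  def next_cwnd(cwnd, packet_was_lost):
      if packet_was_lost:
          return max(1.0, cwnd / 2)  # multiplicative decrease
      return cwnd + 1.0              # additive increase (per round trip)

  cwnd = 1.0
  for lost in [False] * 10 + [True] + [False] * 5:
      cwnd = next_cwnd(cwnd, lost)
  print(cwnd)  # climbed to 11, halved to 5.5, climbed back to 10.5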

Anyway, wireless networks are way less reliable than wired ones, because wireless networks are much more susceptible to random noise. So packets get dropped all the time, and it's not because you're sending too fast, it's because you're unlucky. Thus, the 802.11 standard calls for devices to retransmit packets that were lost at the physical layer. They can still be dropped by the receiving device, but only after they've been acknowledged by 802.11, which means the transmitter knows you got them. So in 802.11, there are two kinds of packet loss: the kind you're supposed to fix (noise), and the kind you aren't (buffer full).

Generally it's the driver's job to do the 802.11 retransmits. This raises some awkward questions: how many times do you retry transmitting? How long do you wait between retries? What if the noise was caused by someone else's retransmits, and we get into lockstep, constantly ruining each other's packets? And on the receiving end, how and when do we report which kind of receive failure occurred?

Answer: it's complicated. But it's obviously easy to do wrong. I heard a rumour that some Linux drivers will retry the same packet up to 256 times in a row if it doesn't work; that seems rather desperate to me. I can also imagine bugs where a driver, finding the kernel's receive queue is full, reports the 802.11 packet as corrupted ("please retransmit") rather than as delivered-but-then-dropped. If it did that, it would utterly destroy TCP's behaviour and potentially clog the network really badly. Maybe this bug exists; I have no idea, I've only heard rumours. If it does, then it would surely be a problem. It would also have nothing to do with queue length, and could not be solved by any form of traffic shaping.
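(To make that rumoured bug concrete, here's a cartoon of the receive-path decision in Python. The function names are invented; no real driver looks like this.)

  import queue

  def send_80211_ack(frame):
      pass  # stand-in for the link-layer ack; invented for this sketch

  def handle_incoming_frame(frame, rx_queue):
      # The correct behaviour: ack at the 802.11 layer first...
      send_80211_ack(frame)
      if rx_queue.full():
          return  # ...then drop. TCP sees real congestion and backs off.
      rx_queue.put(frame)
      # The rumoured bug would be to skip the ack when rx_queue is full,
      # so the sender retries a frame that actually arrived fine, and TCP
      # never gets the "slow down" signal it depends on.

  q = queue.Queue(maxsize=2)
  for frame in ["a", "b", "c"]:
      handle_incoming_frame(frame, q)
  print(q.qsize())  # 2: the third frame was acked, then dropped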

Which brings up another really important point: the whole Internet actually only works at all because of a truly amazing level of voluntary cooperation. Anybody anywhere could deliberately mis-tune their TCP layer and get faster downloads and uploads than anybody else, at the cost of messing things up for everybody else. Similarly, a WLAN driver that mis-reported dropped packets could get much higher throughput than anybody else on the WLAN, at the cost of screwing over everybody else.

TCP is kind of like the "prisoner's dilemma" - as long as we all cooperate, we all win, but if any of us defects, we might as well all defect, because the party is over. And as with experimental tests of the "iterated prisoner's dilemma" in real life (where you play the game with your partners over and over, rather than just once), we have mostly converged on not cheating. Because if we didn't, someone else would cheat too, and the party would be over.

Overly long transmit queues are not defection, by the way; they're just stupid. You don't get better performance at other people's expense by using overly long queues. You just give yourself worse performance at your own expense. Nice work, genius.

So that's not what's going on here. And to repeat myself, queue length is also not what would cause one wireless device to steal bandwidth from another wireless device. 802.11 retransmit problems? Yeah, that could cause problems all right.

Now, even if we assume our 802.11 drivers don't have bugs (other than queue length, which doesn't screw anybody but the perpetrator), which might even be the case, there are still some obscure reasons that one station's WLAN traffic might stomp on another's. Here are a couple that I know about; there may be more.

1) Ethernet-style collision detection (which is used, sort of, sometimes, in 802.11) is prone to timing problems; the more people you have on your network, and the faster the network, the more collisions you'll have. Background: in original 10MBit ethernet, wire lengths were limited so that the first byte of a packet A would reach all nodes by the time the last byte had been sent out. It was still possible for a second node to start transmitting its own packet B between the first node starting to send A and the second node starting to receive A; however, the timing was arranged so that both A and B would know by the end that they had stomped on each other, so they could each wait a random delay and then retransmit.

Unfortunately, with 100MBit and faster networks, this collision detection got a lot worse, because the same packet was 1/10th the "length" (if you think of a packet travelling through a wire as a wave moving at the speed of light, the distance from start to end is the "length" of the packet), and that length was so short that it wasn't really reasonable to require nodes to be physically so close together. That's why hubs were sensible in the 10MBit days, but rapidly fell into disuse in the 100MBit days in favour of switches, which don't need collision detection because they just give every node its own wire.
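(You can compute that "length" directly. Signal speed in copper is roughly two-thirds of the speed of light; the rest is arithmetic.)

  # "Length" of a packet on the wire: how far the first bit has travelled
  # by the time the last bit leaves the transmitter.
  SIGNAL_SPEED = 2e8  # metres/sec, roughly 2/3 of c in copper

  def packet_length_m(pkt_bytes, rate_bps):
      return pkt_bytes * 8 * SIGNAL_SPEED / rate_bps

  print(packet_length_m(64, 10e6))   # minimum frame at 10 MBit: ~10 km
  print(packet_length_m(64, 100e6))  # same frame at 100 MBit: ~1 km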

With wireless, you have no choice: everybody is on the same "wire", ie. the airwaves. You can sort of help by using multiple channels, which 802.11 does, half-heartedly. (There are only 3 non-overlapping channels in 802.11's North American standard, at least, which means three packets at a time in an optimistically configured topology.) There are interesting mathematical improvements on this (eg. frequency hopping and ultrawideband transmission), but 802.11 doesn't use them; cell phone networks do, nowadays, because they have so darned many people using them at once that nothing else would suffice.

Anyway, with 802.11, you're back to old-fashioned collision detection, only now you have outside noise, uncontrolled distances, random signal reflection, and fast transmit rates. It's awful, and frankly, I'm amazed it works at all.

The collision detection algorithms have improved somewhat since the ethernet CSMA/CD (carrier sense multiple access with collision detection) days, but not much. And the various 802.11 standards - I haven't read them all yet - seem to define multiple different ways around the problem. One of them is actually, ha ha, a token-ring-like scheme: the wireless access point assigns timeslots to each node, and each node is only allowed to transmit during its timeslot, thus avoiding the possibility of collisions (while also bringing back all the bad stuff about token ring that killed it in the first place). I'm a bit fuzzy on when this mode is enabled and when it's not; I think it's mostly not, so this is more academic than useful.

So anyway, collision detection is hard. And the more you transmit, the more collisions you'll have. The more different nodes you have transmitting, the more collisions you have. The more your driver's auto-retransmit algorithm is broken, the vastly more collisions you'll have. And so on. One busy station with a wrong retransmit algorithm could really ruin it for everyone. Of course, like I said, I have no evidence that such a thing exists; I've just heard rumours.

2) This one is even more insidious. It's called the "hidden node problem." It works like this: station A can talk to the access point, and station B can talk to the access point, but A and B are on opposite ends of the building, so they're too far away to see each other. So let's say they both decide to start blasting data at the access point at the same time at full speed.

No matter what collision detection method they try to use, they'll never know that every single one of their packets is colliding with the other guy's! As far as the access point is concerned, it's seeing a nonsense combination of signals from A and B. As far as A and B are concerned, all is well,[1] except for some reason the access point isn't acknowledging any of their packets, so they have to retransmit.

If you're running a long-range wireless network, your access point probably has a really great well-positioned antenna, and the other stations probably don't. That means you probably have the hidden node problem in a big way. That means tons of horrible unexplained packet loss, which means tons of retransmits. And, rumour has it, those retransmits might be compounded by buggy retransmission algorithms. As the operator of such a network, your life, in short, sucks big time.

I don't have any experience running long-range multi-user wireless networks, although I have fantasized about doing so, in what now appears to be a rather obsessive-compulsive level of detail. And the hidden node problem is where my fantasies turn into nightmares.

I don't know the solution to this one. I do know of three things that might work, but they'd need to be thoroughly tested:

a) That 802.11 timeslot mechanism I mentioned above, if it were actually turned on and supported. (I read about it in the spec, and got the distinct impression that it was a great idea that wouldn't do me any good because I couldn't use it. But now I don't remember why. I think it's part of the 802.11 QoS spec, and even if routers support that, none of the standard workstation drivers do.)

b) Some projects (frottle, WiCCP) that actually do timeslicing at the user level. At least frottle has some user reports that say it dramatically improved results. However, it requires everyone on your network to run special software; if anybody isn't, they'll probably stomp on all the packets from everybody else.

c) A slightly insane specialized application of traffic policing at the access point. This requires that everybody talk only to the access point directly, never to other nodes on the WLAN, which is a pretty safe assumption since the hidden node problem means they can't see each other anyhow. Basically, if you do things exactly right, you should be able to slow down everyone's traffic so they each get 1/n of the total bandwidth, and you should be able to reduce traffic burstiness so those packets mostly, statistically, end up not colliding. If it worked, it would be a totally amazing hack, because you wouldn't have to fix the software on any of the stations, and you wouldn't have to use explicit timeslots; the timeslots would be implicit in your traffic policing algorithm.
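Here's a sketch of that idea. The fair per-station split, the token-bucket mechanism, and every number in it are assumptions, not a tested design:

  import time

  class StationPolicer:
      # One token bucket per station: each gets bandwidth/n, with a small
      # burst allowance so packets stay spread out in time.
      def __init__(self, rate_bps, burst_bytes):
          self.rate = rate_bps / 8.0  # bytes/sec
          self.burst = burst_bytes
          self.tokens = burst_bytes
          self.last = time.monotonic()

      def allow(self, pkt_bytes):
          now = time.monotonic()
          self.tokens = min(self.burst,
                            self.tokens + (now - self.last) * self.rate)
          self.last = now
          if pkt_bytes <= self.tokens:
              self.tokens -= pkt_bytes
              return True
          return False  # drop it; TCP backs off and burstiness drops

  # Hypothetical: a 10 Mbit/sec WLAN split fairly among 5 stations.
  policers = {sta: StationPolicer(10e6 / 5, burst_bytes=6000)
              for sta in ["A", "B", "C", "D", "E"]}

The interesting knob is the burst size: too big and the bursts collide anyway; too small and you throttle everyone below their fair share.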

And no, the transmit queue length has no bearing on this problem. In fact, somewhat entertainingly, activating policing on the access point's WLAN link could actually force the transmit queues to be shorter (since TCP window sizes shrink when policing drops packets), thus solving that problem too.

Assorted other clarifications about my last article

While we're here, note that:

- The claims in my previous article don't depend on bandwidth; they're true at fast and slow speeds, you just use different constants when doing the math.

- 256 kbit/sec (with excessive buffers) is a fun example because it makes all the problems worse; you can't half-ass your traffic shaping or it just won't work. That means experiments are easier on that link than on a good one, which is why I took advantage of it during my sublet back in September.

- My claim of 250ms of latency under high load needs explanation: that 250ms is configurable and asymmetric. Basically, all the latency was downstream, because incoming traffic (policing) is much harder to control than outgoing traffic (shaping). You can lower the latency by lowering your downstream bandwidth limit, or raise it by raising the limit; it's a tradeoff. I talked about 250ms and 80% downstream because that was the tradeoff that worked for me with 10 sessions and Skype on a 256k link. (The arithmetic is sketched below.)

- I "dialed in" 250ms because that's about the limit of bearability for interactive ssh traffic. No other reason. I could have picked basically any number.
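(The arithmetic behind those numbers, for the curious; the 1500-byte packet size is an assumption.)

  # How much standing queue does 250ms represent on a 256 kbit/sec link?
  rate_bps = 256e3
  delay_s = 0.25
  queued_bytes = rate_bps * delay_s / 8
  print(queued_bytes)         # 8000 bytes of queue...
  print(queued_bytes / 1500)  # ...about 5 full-size packets of headroom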

I doubt all that is very helpful. But that's the best I can do for now.

Footnote

[1] You can simulate the hidden node problem at home while doing the dishes. Go stand by the kitchen sink and turn on the tap. Now talk at a normal volume to someone across the room, and have them talk back to you at a normal volume. To them, you'll be audible; what they hear is about a 1:1 ratio of tap noise to voice, which is good enough to understand. But to you, they'll be inaudible; by the time their voice reaches you, it has decayed a lot (ie. with the square of the distance between you), but the water noise is at full blast, totally overwhelming their voice.
