Je me souviens
Everything here is my opinion. I do not speak for your employer.
October 2009
November 2009

2009-10-29 »

Bittorrent, negative latency, and feedback control theory

Once upon a time, long long ago, I convinced mag to build a router-based automatic traffic shaper for Nitix based on control theory.

The basic idea was simple enough: the Linux kernel supports traffic shaping, which allows you to limit and control the amount of data you send/receive. Limiting the data you receive isn't all that useful, as it turns out, but limiting the send rate can be very useful.

If you transmit data at the maximum rate (say, 50k/sec), you'll end up filling your DSL modem's buffer, and then everything you transmit ends up having a multi-second delay, which results in horrendous latency.

If you transmit data at just slightly less than the maximum rate, say 49.9/sec, the buffer never fills up at all, and your latency is still the minimum. So it's not using your link that makes things unresponsive; it's overfilling the transmit buffer.

The problem: you don't actually know what your uplink rate is, so picking that 99% rate automatically isn't easy. That's why BitTorrent clients let you limit your uplink speed.

At NITI, we observed that latency creeps up right away when you exceed the maximum rate. So we ought to be able to detect that maximum rate by monitoring the latency and using that as feedback into a bandwidth limiter. Basically, a simple feedback control system.

This almost, but not quite, worked. It would in fact work great most of the time, but eventually it would always go into a crazy state in which it kept reducing the transmit rate without having any luck reducing the bandwidth... so it would reduce the transmit rate further out of desperation, and so on. The results made it basically unusable. Too bad. (We never had enough time to fully debug it... some other priority always got in the way.)

Moreover, it wasn't any use to you if you didn't have Nitix.

Anyway, all this is to say that the Bittorrent people have been thinking about the same problems lately, and have supposedly solved it as part of the uTorrent Transport Protocol (UTP). (There's also an IETF protocol called LEDBAT that seems to be related.)

Their approach is similar to what we were doing, but has a few changes that make it more likely to actually work.

First of all, they assume the "minimum achievable latency" is the lowest latency you've seen in the last 3 minutes. Rather than using averages, they observe that if the transmit buffer is always near-empty, then sooner or later you'll get a packet through without any buffer delay. The delay of that packet is the actual network latency; on top of that, anything extra is buffering delay.

Secondly, because they're coming up with a whole new protocol rather than throttling existing TCP sessions, they can add a timestamp to each packet. Basically, that means they can figure out the one-way latency without sending extra packets. Our system required sending out ping packets, which could only measure the full round-trip time (when really you need to measure each direction independently). They also know when they're transmitting at the maximum allowed rate and when they're mostly idle, so they can keep their statistics straight.

Furthermore, their approach focuses on the core of the problem: don't bother limiting overall upload throughput, just limit the rude part of the throughput. They've correctly noted that, almost always, when transmit buffers cause a problem, it's because of BitTorrent. Other than that, almost nobody uses much upload bandwidth at all. So they've limited their solution to only the BitTorrent protocol. That way they don't have to convince anyone else (router manufacturers, operating system kernels, etc) to support their standard.

Now, at last, BitTorrent can be polite. BitTorrent uploads are almost always the lowest-priority thing you could possibly be doing. So it's okay that it always loses out to the slightly-less-polite TCP. (Apparently TCP Vegas is a more polite version of TCP that would accomplish the same thing... if everybody used it. But it requires kernel support, and most kernels supposedly make you choose Vegas globally for all connections, not just for low-priority ones. Which you will never do, because it'll make your whole computer lower priority than everybody else's computers, and thus your personal Internet performance will suck.)

Negative latency and background transmissions

The ability to send data truly "in the background" without interfering with high-priority foreground communications is important. It allows you to implement what I call "negative latency" - transmission of data before anyone requests it.

Disks are getting bigger and bigger, and many computers spend a lot of time sitting idle on the Internet. During that time, they could be "pre-sending" data you might need later. If sending that data had no cost, then even if 99% of the data turned out to be useless, you'd still have a 1% improvement that is worthwhile. And personally, I think a much better than 1% success rate should be possible.

I'm looking forward to it.

I'm CEO at Tailscale, where we make network problems disappear.

Why would you follow me on twitter? Use RSS.

apenwarr on gmail.com