201604 - apenwarr

2016-04-01 »

So last night I was thinking, hey, I work at a big company. People at big companies make their own TCP congestion control algorithms. Therefore, I can make my own TCP congestion control algorithm. And boy was I right!(!)

My contribution is TCP THERMAL (in all caps because the best congestion algorithms are named in all caps). It's named after the fact that if you use it, it will set your network on fire. Also, its TCP traces (pictured here) look a bit like what I imagine thermal noise looks like, if you could visualize it, and your visualization tool involved feeding said thermal noise into the cwnd estimator of a TCP flow.

Anyway, check out this magnificent concoction. Not a single byte of rwnd wasted! (The green dots are SACKs, and the white x's are transmitted or retransmitted packets. The top and bottom lines show how the receive window progresses.)

...So okay, anyone familiar with TCP traces knows that this looks absolutely awful. But there's a method to my madness. What I actually did was I disabled congestion control entirely(!!), which is a stupid thing to do in the general case, but can actually help if you're on a local network, there are few traffic competitors, you don't want to compete fairly, you have a self-limited-rate stream (like video), and you have a horrible level of packet loss and latency. In this example, I made a bad 1x1 2.4 GHz wifi link with added 10% loss and 10 +/- 10ms of added latency. The idea is to make wireless TV work really well on really poor links.

In my test, THERMAL can still extract the benefit of about 60% of the link, while CUBIC collapses to less than 10%. (This picture is a worst-case from just running iperf at maximum speed. If you're streaming below the maximum, no, it doesn't look nearly this dumb.)

(!) I didn't say I could make a good TCP congestion control algorithm.

(!!) No matter what I do to the congestion control plugin, Linux's TCP still wants to back off the cwnd when it is retrying due to packet loss; I guess retries don't count as "congestion control" once you get past a certain desperation point. With high latency and loss, this results in very disappointing totally empty periods in the transmission (not shown here), where there is plenty of rwnd left and we could be blasting future data even while waiting for the last few retries of previously-sent data. I can't figure out a way to disable this behaviour without modifying the kernel proper rather than just adding a new congestion module. Did I miss something?
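For anyone who wants to poke at the plugin side of this from userspace: on Linux, the congestion control module can be picked per socket with the TCP_CONGESTION socket option. The sketch below is a minimal illustration, not the setup used for the traces above. The helper names are made up for this example, and the option simply reports failure if the named module isn't loaded or the platform doesn't support it. Note that this only selects the congestion module; it does nothing about the retry backoff described in the footnote.

```python
import socket

# TCP_CONGESTION is Linux-specific; fall back to None on other platforms.
TCP_CONGESTION = getattr(socket, "TCP_CONGESTION", None)


def current_congestion_control(sock):
    """Return the congestion control name used by sock, or None if unknown.

    Hypothetical helper for illustration only.
    """
    if TCP_CONGESTION is None:
        return None
    try:
        # The kernel returns a NUL-padded name; 16 bytes is enough room.
        raw = sock.getsockopt(socket.IPPROTO_TCP, TCP_CONGESTION, 16)
    except OSError:
        return None
    name = raw.split(b"\x00", 1)[0].decode("ascii", "replace")
    return name or None


def try_set_congestion_control(sock, name):
    """Ask the kernel to use the named module; return True on success.

    Hypothetical helper for illustration only.
    """
    if TCP_CONGESTION is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_CONGESTION, name.encode("ascii"))
    except OSError:
        # Module not available, not permitted, or option unsupported.
        return False
    return True
```

Typical use would be to create a TCP socket, call `try_set_congestion_control(sock, "cubic")` (or whatever module is actually loaded), and then confirm the result with `current_congestion_control(sock)` before connecting.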

2016-04-04 »

"When I think stability of the banking system, I think Avery. Just the kind of calm, measured thinking the financial industry needs."
– pmccurdy, possibly sarcastically

2016-04-05 »

"It's not the source of truth, it's the destination of truth."

2016-04-06 »

Some people would say that "Not breaking things often enough is a sign of not pushing hard enough." Judging by my performance this week, I guess that's one less problem I have. So I have that going for me, which is nice.

2016-04-07 »

Followup from the netdev1.1 conference in Seville: looks like we kicked some people into finally getting some of their low-latency patches going. Dave Taht analyzes Michal Kazior's work getting fq_codel working with per-station queues on ath10k, and the results are pretty great: http://blog.cerowrt.org/post/fq_codel_on_ath10k/

Still some latency trouble though, particularly when mixing low- and high-signal strength stations.

2016-04-08 »

I think I mostly-ish calculated this right-ish. Based on real-world gattaca test data. Problem is, I don't know how much to extend the x axis to the left now (where there was no data before because we couldn't hear anyone, but with a repeater we can). I could fake it by moving around our existing data, I guess.

2016-04-09 »

Sitting in a restaurant in New York.

Overheard: "It's amazing how many gigs he gets."

False alarm though. Turned out they weren't talking about Internet speeds.

2016-04-10 »

Using the Newton-Raphson method to calculate what my W-4 exemption amounts should be for 2016. That method makes little sense, but the official method didn't work in any previous year, so...
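For the curious, a minimal sketch of the Newton-Raphson iteration itself. Everything tax-related here is an assumption: `shortfall` is a made-up smooth stand-in with invented numbers, not the real withholding tables (which are bracketed and take whole-number allowances), and the slope is estimated numerically rather than derived.

```python
def newton_raphson(f, x0, tol=1e-9, max_iter=50, h=1e-6):
    """Find x with f(x) close to 0, starting from the guess x0.

    The slope is estimated with a central difference, so f only needs
    to be callable, not differentiable in closed form.
    """
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        slope = (f(x + h) - f(x - h)) / (2 * h)
        if slope == 0:
            raise ZeroDivisionError("flat slope; Newton-Raphson cannot proceed")
        # Step to where the tangent line crosses zero.
        x -= fx / slope
    raise RuntimeError("did not converge")


def shortfall(allowances):
    """Hypothetical model: tax owed minus tax withheld, in dollars.

    The figures are invented purely for illustration.
    """
    tax_owed = 7000.0
    withheld = 9000.0 - 350.0 * allowances
    return withheld - tax_owed


# Number of allowances at which the hypothetical shortfall hits zero.
allowances = newton_raphson(shortfall, x0=0.0)
```

Because the stand-in function is linear, the iteration lands on the answer almost immediately; a real step-shaped withholding table would have flat stretches with zero slope, where the update above gives up.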

2016-04-11 »

I learned next to nothing from this graph. And yet.

2016-04-12 »

When you google for "Linus Torvalds", the main picture in his funbox appears to be him sitting in a fighter jet. I approve.

2016-04-14 »

Today I realized that not only do I never get to say, "snatching victory from the jaws of defeat" anymore, it has actually started sounding awkward compared to the new cliche, "snatching defeat from the jaws of victory."

Modern tech really gives me a lot of opportunities to say the latter.

2016-04-17 »

40% packet loss and my toy TCP "semisonic", a port of someone's hack to our slightly older TCP stack, with some additional lobotomization to make the RTO timer back off less. Unlike my previous post about TCP THERMAL, this one doesn't fill up the entire receive window, which means it doesn't overfill queues and cause an unnecessary increase in latency. Interestingly, by keeping latencies lower, we can retry lost packets sooner, getting similar throughput with less socket buffer memory required on each end.

On the other hand, Linux's (and maybe everybody's) RTO backoff is ridiculously way too aggressive to work well in high-loss environments. Without my changes, the traces had multi-second periods with no data being sent at all, which is obviously not great for live video. People have spent a lot of time tuning congestion control in the "normal" TCP flow case, but seemingly not in the backoff case. I wonder if BBR fixes that code path too.
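To get a feel for why the backoff hurts so much at high loss, here is a toy model, not the semisonic code. It assumes a 200 ms starting RTO, a 120 s cap, independent 40% loss on every transmission, and that a lost segment is only ever recovered by timeout (real stacks recover most losses via SACK and fast retransmit, so this overstates things). The function names are invented for this sketch.

```python
def timeout_waits(initial_rto, attempts, factor=2.0, max_rto=120.0):
    """Seconds waited before each successive retransmission of one segment.

    factor=2.0 models the usual exponential backoff; factor=1.0 models
    a timer that does not back off at all.
    """
    waits = []
    rto = initial_rto
    for _ in range(attempts):
        waits.append(rto)
        rto = min(rto * factor, max_rto)
    return waits


def expected_stall(initial_rto, loss, attempts=30, factor=2.0, max_rto=120.0):
    """Expected seconds spent waiting on timeouts for one segment.

    The k-th wait (0-based) only happens if the original transmission
    and the first k retransmissions were all lost, which has
    probability loss ** (k + 1) under independent loss.
    """
    waits = timeout_waits(initial_rto, attempts, factor, max_rto)
    return sum(loss ** (k + 1) * w for k, w in enumerate(waits))


# Assumed parameters: 200 ms starting RTO, 40% loss.
doubling = expected_stall(0.2, 0.4)             # exponential backoff
constant = expected_stall(0.2, 0.4, factor=1.0)  # no backoff
four_timeouts = sum(timeout_waits(0.2, 4))       # 0.2 + 0.4 + 0.8 + 1.6 seconds
```

Under these assumptions, four timeouts in a row already add up to about three seconds of silence, and at 40% loss roughly 2.6% of segments (0.4 to the fourth power) would need at least that many. Comparing `doubling` with `constant` shows how much of the expected stall comes from the doubling rather than from the loss itself.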

2016-04-18 »

The bad news: I'm pretty sure nobody but me cares about, has ever cared about, or will ever care about the weird and insane details of this thing I'm trying to do.

The good news: maybe that's how you get promoted.

2016-04-21 »

"A bit of a tangent, but it reminds me of how Apple loves to say in their announcements that whatever product is 'the best X we've ever done.' Well of course it is, why would you make a new thing that's worse than last year's?"
– a commenter on ycombinator news

Ha. How many teams can claim this with a straight face?

2016-04-23 »

Edward Snowden has some suggestions for how guest wifi should work:

"""
Instead of changing your phone to change your persona — divorcing your journalist phone from your personal phone — you can use the systems that are surrounding us all of the time to move between personas. If you want to call a cab, the cab doesn’t need to know about who you are or your payment details.
You should be able to buy a bottle of Internet like you buy a bottle of water. There is the technical capacity to tokenize and to commoditize access in a way that we can divorce it from identity in such a way that we stop creating these trails. We have been creating these activity records of everything we do as we go about our daily business as a byproduct of living life. This is a form of pollution; just as during the Industrial Revolution, when a person in Pittsburgh couldn’t see from one corner to another because there was so much soot in the air. We can make data start working for us rather than against us. We just simply need to change the way we look at it.
"""
– Edward Snowden
in http://www.popsci.com/edward-snowden-internet-is-broken

"pretend to care"
"incessantly nagging people"
"Not getting fired is not Avery's career goal"
"Avery will file a bug so you can adjust your expectations"

Thanks! Y'all are the best, as always.

(As everyone here knows by now, I'm famous for my optimism. So the good news is I could only find four quotes worth taking out of context.)