It's a bird, it's a plane, it's

Everything here is my opinion. I do not speak for your employer.
August 2014
October 2014

2014-09-01 »

If you try to rotate the wifi encryption keys, you're screwed because of bugs.  If you don't try to rotate the wifi encryption keys, then you don't know if you're screwed.

2014-09-02 »

In celebration of fixing bugs in our platform so that it would pass trivial tests so that we could put those trivial tests in a new integration testing framework that we needed to design so that we could make sure wifi (and other) problems didn't recur so that we could make bigger changes to the wifi layers so that we could more safely enable inter-access-point coordination so that we could auto-disable wifi on the TV box when it was in range of the network box so that we could finally enable TV box wifi by default so that customers could have fast wifi in every room of their house,

<phew>

Denton presented us with these miniature Cthulhus made from actual yak hair.  I am in awe.

Just this once, I think it's safe to say, don't worry, the yak has already been shaved.

2014-09-05 »

java:14621: error: code too large for try statement

2014-09-06 »

"People who are too much like me should not be defining culture.  That's just not how these things are supposed to work."
   – me, trying to explain the Silicon Valley region

2014-09-09 »

Essentially, our "wifi lab" is now spilling out of our cubicle space officially, rather than unofficially.

2014-09-12 »

Years ago, I used to own one of those Casio calculator watches.  I was quite enamoured with it. While I owned it, I found many, many opportunities to do calculations.

Eventually it broke and I got a new, normal watch.  For a while I was sad.  But then I realized that somehow I didn't need to do so many calculations anymore.

2014-09-13 »

That sinking realization that there is, as usual, a crackpot at this meeting... but this time it's not you.

Followed quickly by the realization that for some people, this must happen all the time.

I have newfound sympathy for y'all.  Ouch.  What a way to live.

2014-09-17 »

"The average U.S. household spends some $5,000 
per year for their combination of cable, telephone, broadband, 
and cellular services. The typical family wireless bill often 
exceeds $200 per month."
  – http://www.wififirst.org/wififirst-ebook.pdf

That doesn't sound right to me.  The average U.S. family has a pretax income of $51k/year last time I checked.  They're spending $5000 of their after-tax income, roughly 10% of the pretax number, on connectivity?  I don't spend anything close to that and I feel like I'm doing pretty well for myself.

2014-09-18 »

Major Unix mistake #2 (so far a pretty small number, IMHO):

File descriptors really, really should have defaulted to close_on_exec.
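
To make that concrete, here is a minimal sketch in Python (POSIX only, since it uses fcntl).  As far as I know, Python 3.4 and later already flipped this default for descriptors it creates itself, so the sketch checks the flag on a fresh pipe and then opts back in to the traditional inherit-everything behaviour.  The helper name is made up for illustration.

```python
import fcntl
import os


def is_close_on_exec(fd: int) -> bool:
    """Return True if the descriptor has FD_CLOEXEC set (POSIX only)."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


# Python >= 3.4 creates descriptors non-inheritable, i.e. close-on-exec.
r, w = os.pipe()
default_cloexec = is_close_on_exec(r)

# Opting in to the traditional Unix default: clear the flag, so the
# descriptor would now leak into any program this process exec()s.
os.set_inheritable(r, True)
after_opt_in = is_close_on_exec(r)

os.close(r)
os.close(w)
```

With the safe default, leaking a descriptor into a child is something you have to ask for, rather than something you have to remember to prevent.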

2014-09-19 »

Mentoring junior engineers is kind of like building a particle accelerator out of tin cans.  Clunky and unscientific, and mostly it doesn't work, but if you get them going fast enough sometimes they veer wildly off course and collide with something and there are explosions!  And then you learn something.

2014-09-23 »

I pushed for us to use only open source wifi drivers in our product because I had heard the proprietary drivers have highly questionable hacks, which improve results under ideal-world, isolation-chamber benchmark conditions at the cost of real-life performance.

Not too surprisingly, now that we're using open source drivers, we start getting patches from $VENDOR to do exactly those kinds of hacks to the open source drivers.

Today's awesome example: a patch that, after the first 1000 ACKs in a given TCP session, starts dropping all but every 3rd ACK.  This is because return TCP ACK traffic is a significant fraction of your airtime, so things can go quite a bit faster if you have fewer of them.

Sending fewer TCP ACKs is actually a real area of research.  But just blindly doing it after 1000 ACKs is, I'm sure, not what the TCP people had in mind.  This optimization is pretty much a pure speed improvement for iperf in great network conditions, but is likely to severely confuse TCP if there's any kind of loss or variable latency.
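
Here is a rough sketch of the rule as I described it, not the vendor's actual code.  The function names, and the choice of which ACK in each group of three survives, are my own assumptions.

```python
def ack_is_forwarded(ack_index: int, warmup: int = 1000, keep_every: int = 3) -> bool:
    """Decide whether the ack_index-th ACK (1-based) of a TCP session is sent.

    The first `warmup` ACKs pass untouched; after that only every
    `keep_every`-th ACK is forwarded and the rest are dropped.
    """
    if ack_index <= warmup:
        return True
    return (ack_index - warmup) % keep_every == 0


def forwarded_fraction(total_acks: int) -> float:
    """Fraction of a session's ACKs that actually make it onto the air."""
    sent = sum(ack_is_forwarded(i) for i in range(1, total_acks + 1))
    return sent / total_acks
```

Note that nothing in the rule looks at loss, reordering, or round-trip time.  It only counts, which is exactly why it looks great in iperf and is risky everywhere else.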

...of course, people don't test for loss and variable latency when they're choosing wifi vendors.  We sure didn't.

2014-09-24 »

Achievement unlocked: I'm tangentially featured in a Twitter thread that suddenly involved Marc Andreessen.

https://twitter.com/pmarca/status/514932141659271168

The quote he's referring to isn't mine, and he's making fun of it, but it's pretty cool when taken out of context the way he did: "because it's an impractical dream that's peripheral to where the important problems are."

Personally I usually aim toward impractical dreams centered right on top of the important problems.  But I can also see the argument that today's peripheral areas are tomorrow's revolutions.

2014-09-25 »

Pretty sure I dodged that nasty bash CGI security hole on my web server.

BY USING PHP.

2014-09-26 »

I really, really liked this article.  It introduces a particular counterintuitive coding style which is rarely useful, but when it is useful, it makes a huge difference in debuggability and cleanliness.

There is no one style that is right for every situation.

http://number-none.com/blow/john_carmack_on_inlined_code.html

2014-09-27 »

This one line from Wikipedia pretty much says everything about my day so far.

"As soon as the PTK is obtained it is divided into five separate keys:"

2014-09-28 »

Hey, I finally just figured out why there is no 802.11 "pre-association" feature for normal WPA2-PSK, even though this exists if you're using "WPA2 Enterprise" aka EAP.

It's because the latter might take a long time and so it's worth doing in advance.  WPA2-PSK takes about 20ms, so nobody cares.  (It takes several seconds to scan all the channels to see who you want to associate with, though.)

2014-09-29 »

I was also willing to serve on a demotion committee, but it seems nobody self-nominated this year.

2014-09-30 »

It was time to learn how 802.11 authentication works, so I started reading (well... heavily skimming) the standards.  Here's what I've got so far.

WPA2-PSK does not actually use EAP (Extensible Authentication Protocol).  WPA2-Enterprise does; in that case it uses EAP to do some arbitrary configurable thing to produce the PMK ("Pairwise Master Key"). For example, EAP is the place where you might contact a RADIUS server.  Or alternatively, with WPA2-PSK, the PMK is trivially constructed by hashing your passphrase and SSID.  (The MAC addresses only get mixed in later, when deriving the session keys.)  No packets are sent at all for this, and you have the PMK already.
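
To show just how trivial the PSK case is, here is a minimal sketch.  As far as I can tell the standard derivation is PBKDF2 with HMAC-SHA1, using the SSID as the salt, 4096 iterations, and a 256-bit output.  The function name and the example passphrase and SSIDs are made up.

```python
import hashlib


def wpa2_psk_pmk(passphrase: str, ssid: str) -> bytes:
    """Derive the 32-byte PMK for WPA2-PSK from the passphrase and SSID.

    Everything here is local computation: no packets, no EAP, no RADIUS.
    """
    return hashlib.pbkdf2_hmac(
        "sha1",
        passphrase.encode("utf-8"),
        ssid.encode("utf-8"),   # the SSID acts as the salt
        4096,                   # iteration count
        32,                     # 256-bit key
    )


# Made-up example values.
pmk_home = wpa2_psk_pmk("correct horse battery staple", "HomeNet")
pmk_other = wpa2_psk_pmk("correct horse battery staple", "OtherNet")
```

Every AP sharing an SSID and passphrase ends up with the same PMK, which is why there is nothing for a client to pre-compute when it moves between them.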

Once you have the PMK, you do the infamous wifi "4-way handshake" to produce and exchange transient session keys.  The handshake is the same regardless of what EAP method you use, or if you don't use EAP at all.  Confusingly, the 4-way handshake seems to use packets called EAPOL (EAP over LAN) packets, but that's actually just a container which in this case happens to not contain EAP.  Ha!  Fooled you!  Also, there is a thing called EAP-PSK which is intimately unrelated to all this.

The 4-way handshake seems to take about 20ms if all is well, which it usually is.

Occasionally, you want to rotate session keys inside a session.  To do that, you just re-run the 4-way handshake without getting a new PMK.  Notably, EAP and thus RADIUS do not ever get involved here.  (It's good to know when RADIUS gets involved because it's typically annoyingly slow and unscalable.)
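
Here is a sketch of why a rekey never needs a new PMK.  As I understand it, the session keys (the PTK) come from the PMK plus both MAC addresses and a fresh nonce from each side, run through an HMAC-SHA1 based PRF.  This assumes the 48-byte CCMP layout (KCK, KEK, TK); the function names, MAC addresses, and the all-zero PMK are placeholders for illustration.

```python
import hashlib
import hmac
import os
from typing import NamedTuple


class SessionKeys(NamedTuple):
    kck: bytes  # key confirmation key: MICs on the handshake frames
    kek: bytes  # key encryption key: wraps the group key
    tk: bytes   # temporal key: encrypts the actual data frames


def prf(key: bytes, label: bytes, data: bytes, n_bytes: int) -> bytes:
    """HMAC-SHA1 in counter mode, stretched to n_bytes of output."""
    out = b""
    counter = 0
    while len(out) < n_bytes:
        msg = label + b"\x00" + data + bytes([counter])
        out += hmac.new(key, msg, hashlib.sha1).digest()
        counter += 1
    return out[:n_bytes]


def derive_session_keys(pmk: bytes, ap_mac: bytes, sta_mac: bytes,
                        anonce: bytes, snonce: bytes) -> SessionKeys:
    """Derive the pairwise session keys from the PMK and handshake values."""
    # Sorting means both ends compute the same bytes, whoever goes "first".
    data = (min(ap_mac, sta_mac) + max(ap_mac, sta_mac)
            + min(anonce, snonce) + max(anonce, snonce))
    ptk = prf(pmk, b"Pairwise key expansion", data, 48)
    return SessionKeys(ptk[0:16], ptk[16:32], ptk[32:48])


# Placeholder values: a real PMK comes from the passphrase or from EAP.
pmk = bytes(32)
ap_mac = bytes.fromhex("020000000001")
sta_mac = bytes.fromhex("020000000002")

# Same PMK, fresh nonces each time: that is all a rekey needs.
first = derive_session_keys(pmk, ap_mac, sta_mac, os.urandom(32), os.urandom(32))
rekeyed = derive_session_keys(pmk, ap_mac, sta_mac, os.urandom(32), os.urandom(32))
```

The only inputs that change between `first` and `rekeyed` are the nonces, which the two ends exchange in the handshake itself.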

So okay, roaming.  In general, wifi clients are responsible for choosing which AP they will talk to, and not the other way around.  Occasionally a client will decide it's getting a crappy signal, and see about finding another AP with the same SSID that it can switch to instead.  At that time, it disconnects from the old AP and re-runs all of the above, potentially including EAP and RADIUS if that's what you're using.  This can be slow: EAP can take several seconds.  To try to hide the slowness, they invented 802.11r.  The purpose of 802.11r is simply to let you do an EAP transaction for AP#2 while still connected to AP#1.  Basically you use it to calculate a new PMK.  After that, you can disconnect from AP#1 and connect to AP#2 and do the (20ms or so) 4-way handshake, all in a very short time.  Great!  But this doesn't work for WPA2-PSK, for the simple reason that there is no work to do to get PMK#2. So 802.11r has nothing to do with that.

There are some other standards that also come up when talking about roaming:

  • 802.11h: used for measuring signal strength and controlling maximum transmit power.  Apparently needed if you want to comply with DFS (radar avoidance, essentially) regulations on some channels.  Contains some interesting signal strength reporting tools, though, so it could be used to help inform roaming decisions.
  • 802.11F (why a capital letter? why not?): "inter-access point protocol."  Nobody ever used it, so it was cancelled.
  • 802.11v: a new thing that appears to be mainly useful for helping APs fix up their bridging tables more quickly.
  • 802.11k: "radio resource measurement."  The only thing out of all the above that could possibly allow you to steer a client device to a particular AP.

802.11k is pretty special.  First of all, it's not supported by hostapd, so most APs don't support it, and most Linux devices don't support it.  I see a bunch of announcements about iOS supporting it, but not Mac OS X, which seems weird, but there you go.  Also, there is no command in 802.11k to actually push a client device to a particular AP; instead, the AP is supposed to wait until the station asks, and then send it a "list of eligible neighbour access points" that the station might want to connect to.  There's no guarantee the station will take the advice.  Also, there's no good way to poke a station to make it consider moving right away; we rely on the device to do something intelligent.  Which is, of course, way too optimistic, when it comes to most devices.

That gives me a better clue of what people mean when they say the standards are "not that useful" for assisted roaming.  Few people bothered to implement 802.11k.

So what do people do instead?  It seems mostly they have APs forcibly disconnect a client device, then the "wrong" AP will refuse to let the client reconnect to that AP, thus forcing it to connect to the "right" AP.  The problem with this method is it's slow: you can't use 802.11r EAP pre-authentication, because you're not connected to any AP at that point.  The client also has to scan all the channels, which can take several seconds.  During all that time, the wifi has been disconnected and your sessions are all frozen.  None of this slowness has anything to do with problems that can be fixed with 802.11r.  It can't even particularly be fixed with 802.11k, since that can't force a roam either.


I'm CEO at Tailscale, where we make network problems disappear.

Why would you follow me on twitter? Use RSS.

apenwarr on gmail.com