100% Pure

accept no imitations

Everything here is my personal opinion. I do not speak for my employer.
Back: November 2006
Next: January 2007

2006-12-01 »

Novel, yes; widespread, no

That's right, kids, November is over and while my novel isn't "done," it did indeed pass 50000 words. I have a tiny bit of work left to completely wrap up the story, and a little bit more tweaking before it's up to my usual questionable quality standards (as opposed to its current "unquestionable" quality). That should only take a few more days.

And after that, you, yes, you, the one and only, will still not have access to read it, because I'm not currently planning to post it publicly anywhere, lest Google reveal to all the world that I'm an idiot. But I'd probably send you a private copy if you asked.

Old news, new acronyms: REST vs. SOAP

I've been seeing news in random places around the web lately announcing the official death of Web Services and SOAP in favour of REST, or was it XML Schemas in favour of RelaxNG, or whatever bad things the people in question don't like in favour of whatever good things the people in question do like. You can find many of these links by following links from Tim Bray's Ongoing.

And then comes the inevitable reply, They Can't Hear You, explaining that these so-called "deaths" are rather overstated, because all of the supposedly "bad" technologies are still very much in use in big companies everywhere, and moreover, most people in those companies haven't even heard of the so-called "good" technologies.

Now, of course I'm always totally in favour of a good flamewar, but if the flamewar is good enough, then I prefer to confuse the issue rather than resolving it. Thus, I bring you my considered opinion:

Yes, it seems those bad things really are dead. But they're not. Except where they are.

You see, declaring SOAP dead in favour of REST is equivalent to declaring Windows dead in favour of Unix. No, no, not "just as stupid." I mean equivalent. If one was true, the other would probably also be true. And you may have noticed that Unix has totally prevailed. But it hasn't. Except in the places where it has.

A while ago at NITI I did a presentation about functionality vs. elegance. Basically, I argued that people often treat them as a single continuum, sacrificing features for beauty or vice versa according to their priorities. But really they're two independent dimensions, and only a precious few things are both functional and elegant.

In my example at the time, I showed how Windows isn't elegant (well duh) but it is highly functional; you can make it do anything. Unix, on the other hand, was an amazing achievement in its heyday because it was highly functional and highly elegant. Because this was originally true, people experience the beauty of Unix and then tend to believe that "the Unix way" is the best way to solve all their problems.

Unfortunately, original Unix was designed for processing text on a command line. Its elegance came from clever, well-executed simplifications like the unified filesystem, devices-as-files, pipelines, and everything-as-text. But none of those things have anything to do with graphical user interfaces. X11 is anything but elegant, and none of the various Unix widget toolkits are earth-shatteringly great like Unix was. Furthermore, trying to map elegant command-line stuff into GUIs has been a consistent, horrible failure; compare a Windows-based IDE debugger to any GUI wrapper on top of gdb, and just try to tell me I'm wrong.

To make something both highly functional and highly elegant, you need what we call a simplifying assumption. In the command-line world, the credit typically goes to the invention of Unix pipes, which let you easily link small tools together to accomplish a big job. Windows has no such magic (well it does, but they forgot to support it in the shell), and so it predictably sucks at text processing. But Unix has no simplifying assumption in its GUI. So they have to trade off between functionality and elegance. In typical Unix fashion, this means there are a zillion half-baked alternatives, each at different points along the tradeoff continuum.
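The pipeline trick is easy to demonstrate from any language. Here's a minimal sketch in Python that wires `sort` and `uniq` together through real Unix pipes (assuming a Unix-like system with those tools on the PATH), equivalent to the shell pipeline `printf ... | sort | uniq -c`:

```python
import subprocess

# Two small tools, composed through pipes, doing a job neither does
# alone: count how many times each line occurs.
text = "banana\napple\nbanana\n"

sort = subprocess.Popen(["sort"], stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE, text=True)
uniq = subprocess.Popen(["uniq", "-c"], stdin=sort.stdout,
                        stdout=subprocess.PIPE, text=True)
sort.stdout.close()          # so uniq sees EOF when sort finishes

sort.stdin.write(text)
sort.stdin.close()
out, _ = uniq.communicate()
sort.wait()
print(out)
```

Neither tool knows the other exists; the only contract between them is "lines of text," which is exactly the simplifying assumption at work.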

But there are places where Unix's traditional elegance is still a win, and that happens to be the world of Internet servers. The web works because it mostly just paraphrases Unix's cleverness. It has a unified filesystem (er, URI space), devices-as-files (er, CGI scripts generating content), and everything-as-text (html). Sadly, its UI mostly sucks, except for the AJAX stuff where we basically hack the hell out of Javascript and CSS to mangle it into a decent UI, at the cost of almost all the operational elegance. That's because what's missing from the web model (and the Unix GUI, for that matter) is an analogue for Unix's pipelines - I can't easily connect one little thing to another little thing to generate a high-quality big thing.

The fancy modern name for that is a component system. Unix pipelines are one kind of component system, although people who talk about Component Systems as if they had Capital Letters would probably gasp at my saying so. That's because those people design the second kind of component system: things like COM. You know: the way Windows does it.

So here's the thing. In the world of the web, there are two competing ideas for component systems: REST and SOAP. And these correspond almost exactly to Unix and Windows, or in other words, pipelines and COM. Pipelines contain squishy (nowadays Web 2.0 people would say "mashable") text in which the format is flexible and implicit and probably being parsed only 99% correctly. That sounds to me exactly like REST and microformats and RSS and friends.

Meanwhile, there's SOAP and SOA and WS-Whatever and XML Schemas. They're just like COM: useless unless your syntax and semantics are exactly right, semi-easily checkable for valid syntax, and purportedly self-describing in an essentially useless way (except for syntax checking purposes). Self-describing or not, Google will never be able to index your SOAP service without a special plugin, just like Google Desktop can't index any of your binary document formats without special COM plugins, one per format.

Phew. Okay, so that was the background information. Here's my point: yes, the whole Internet runs on Unix philosophy. But businesses sure don't. The big problem comes up in my description of pipelines up above: they only parse about 99% correctly, which is fine for your idiotic comments about YouTube videos, but pretty nasty when you mangle critical business data. And when the business dudes get involved, they'd rather do anything than mangle their critical business data. You hear me? Anything!

If it keeps the data from getting mangled, they'd happily sacrifice searchability. Or developer hours. Or the ability to use off-the-shelf software. Or millions of dollars in licensing fees (because at least they'll accurately know how many millions).

Pay attention, because that attitude is exactly why Windows is so strong, why the vast majority of developers prefer to develop on Windows, and why the vast majority of users prefer to use Windows. It may be gross, and adding new features or libraries may be a lot like stabbing yourself repeatedly with a fork (which, tragically, is not itself included in Win32), but Windows works consistently. So does SOAP. But not Unix or microformats or REST.

It would be awesome if someone could find a way to satisfy both camps (the Internet people and the Enterprise people) at once. Then maybe one or the other set of technologies could finally die. But I'm not counting on it. Until then, one or the other technology is effectively dead, but which one it is depends on who you are.

December 2, 2006 03:55

2006-12-05 »

Not a Java rant

Today I used J2ME ("Micro Edition", for handheld computers, etc) for the first time.

I have a whole rant in me about how stupid it is. But writing it won't help, because more than enough people have already ranted about Java, and it didn't help them either.

So I'll just say this: the class library that comes with J2ME is *significantly* less useful than the ANSI C standard library. And that's saying something, because my Crackberry wastes 64 megs of flash doing it.

Not a C# rant

I have a similarly short-winded statement for C#:

Yes, it looks almost exactly like Java. But the difference is *huge*. If anyone ever wants to do a study about how "the devil is in the details," this should be your sample data.

Balance of Implicity

I've decided that your ability to quickly communicate(*) depends entirely on how much of the communication can be made safely implicit. The compromise tends to be between making things implicit (which I'm pretty good at) and making things safe (which "typical" engineers like to do).

It's easy to see why saying things implicitly instead of explicitly makes it go faster: implicit things don't actually have to be said, so they're zero work compared to non-zero work. And it's also easy to see why that makes things less safe: if you don't say it explicitly, how do you know the other person (or the computer, for that matter) has assumed what you hoped they would assume?

Once you're thinking in terms of safe vs. implicit, it becomes easier to talk about various aspects of design. For example, stable, long-lived public APIs (eg. Win32) tend to be highly explicit, because they don't assume anything about what exactly you'll want to do. That way, they seldom have to change. On the other hand, internal APIs can assume much more about how they'll be used, because if the assumptions are wrong, you can just fix them on a whim; and you should do it that way, if not too many people depend on your APIs, because it'll save you vast amounts of time in the long run.
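Here's a contrived little contrast (both functions invented for this post) of what that looks like in code. The "public" parser takes no chances and makes the caller state every assumption; the "internal" one bakes the assumptions in, which is fine because it's cheap to change later:

```python
def parse_public(text, *, separator, strip_whitespace, skip_empty):
    # Explicit: every behaviour is a decision the caller must make,
    # so this signature rarely has to change.
    parts = text.split(separator)
    if strip_whitespace:
        parts = [p.strip() for p in parts]
    if skip_empty:
        parts = [p for p in parts if p]
    return parts

def parse_internal(text):
    # Implicit: comma-separated, whitespace stripped, empties dropped.
    # Wrong assumptions here can be fixed on a whim.
    return parse_public(text, separator=",",
                        strip_whitespace=True, skip_empty=True)

print(parse_internal(" a, b,, c "))   # → ['a', 'b', 'c']
```

The internal version is faster to call precisely because it says less; the public one is safer precisely because it says everything.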

On a completely different note, legalese has the same problem for a different reason. Lawyers are paid to deliberately misunderstand anything that's left implicit in a contract, which is how they can help you weasel out of one. To be implicit requires understanding; if it's not in people's best interests to understand, then you'll have to just make everything explicit. And that's why a simple one-page agreement can bloat to 200+ pages if you pay enough lawyers to look at it.

I think the best answer, if you can pull it off, is to be safely implicit: don't state the obvious, because it's obvious. But you have to be really good to know what's obvious, and if you screw it up, you would have been better off just saying it outright.

(The really expensive lawyers know what they can leave out and still have it be legally indisputable; they write shorter, more readable contracts that are just as enforceable. But will that make you feel more safe, or less safe?)

Specialized jargon is a very interesting example of "safe implicity." When you use a particular unusual term ("Feature Pusher" like we used to say at NITI) instead of the more usual one ("Project Manager" like we now say at NITI), it makes people stop and examine their assumptions. If they don't know what a Feature Pusher is, they have to go look it up somewhere, and the answer suddenly becomes explicit. If they did already know what it was, they don't need to look it up, and you've included a huge amount of information implicitly.

But be careful! You usually have to invent whole new terms if you want that to work, because when people see a term they've seen before, it has a whole lot of baggage attached to it. A Feature Pusher at NITI wasn't the same as a Project Manager; now, after the name change, suddenly it is, which is why it works so much more poorly.

When lawyers write contracts, they solve this conundrum by using Capitalized Words like "Person." (Quite often, a Person in a contract isn't actually a real person at all, but the capital letters are a signal to other lawyers that they need to look back in the definitions section to find out.)

* Communication also includes all of programming, since programming is just communicating to a computer and/or any human who reads your code later.



December 6, 2006 02:03

2006-12-06 »

So, how was your day?

Today I volunteered to save a $50mil deal by solving a math problem. Banks are surreal. Having worked at a software company that I deliberately tried to make as insane as possible, I really didn't think I'd ever find myself saying that.

Say hello to the "real/actual expert!" Oh yes, my Elite Management Science skills are rusty, but not yet gone.

REST is searchable

pphaneuf doesn't believe me when I say that REST interfaces are indexable without special plugins. I currently think they are, although it wouldn't have occurred to me until I read STREST (Service-Trampled REST) by Duncan Cragg.

Basically, the idea is this: REST is a design model that uses URLs in a particular way. The key thing about it: browsing the contents of a service should always be harmless and doable using just GET; if that's true, then the GET requests will respond with URLs that point to other GETtable data, and so on. Even if it can't understand exactly what the content is, Google will be able to index it. And that's very cool.
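A toy model makes the point concrete. The `pages` dict and link format below are invented for illustration; a real spider would issue HTTP GETs and parse hrefs, but the principle is identical: harmless GETs yield URLs to more harmless GETs, so a generic indexer needs no service-specific plugin.

```python
# An in-memory stand-in for a RESTful service: every resource is
# reachable by GET, and bodies contain the URLs of other resources.
pages = {
    "/files": "a directory: see /files/a and /files/b",
    "/files/a": "contents of a",
    "/files/b": "contents of b",
}

def GET(url):
    return pages[url]        # stand-in for an HTTP GET; always harmless

def crawl(start):
    """Index everything reachable from `start` using only GETs."""
    index, todo = {}, [start]
    while todo:
        url = todo.pop()
        if url in index:
            continue
        body = GET(url)
        index[url] = body
        # follow anything in the body that looks like a local URL
        todo += [word for word in body.split() if word.startswith("/")]
    return index

print(sorted(crawl("/files")))   # → ['/files', '/files/a', '/files/b']
```

The crawler understands nothing about what a "file" is, and it doesn't need to.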

Contrast that with a SOAP service that has a getListOfFiles() function you need to call; Google doesn't stand a chance. (Though now that I've said it, watch them try to prove me wrong. You know who you are, former NITIites. Trying to make me look bad!)

December 7, 2006 03:48

2006-12-08 »

Dedication

I think I would like to work for a company where people actually make the effort to come in to work even when the rest of the city gives up completely because they received half the average annual snowfall in a single night.

Hmm, it seems that I do. And there's lots of room in the parking garage today, because of all those other, lamer companies.

December 8, 2006 15:39

2006-12-09 »

Dueling Ambiguities

A sign in the lobby of my apartment building today said, "Elevator will be shut down for servicing tomorrow from 11am to 2pm."

And someone had scrawled underneath, "Is tomorrow today or tomorrow?"

This struck me as interesting because the latter sentence would have been completely meaningless except in the meaningless context it was in. But in that context, it's a perfectly reasonable question. Sensical nonsense is one of my favourite literary forms.

December 10, 2006 09:08

2006-12-13 »

Banks mining personal information?

pcolijn wrote about his suspicions that banks are "mining the data and no doubt selling it to all kinds of sketchy advertising companies" (referring specifically to CIBC in Canada).

As a newly-minted representative of the banking industry(*), I can tell you that this is actually nothing to worry about. First of all, Canada has pretty serious privacy laws that prevent people from doing various kinds of underhanded things without your permission. Of course, you've probably signed away that permission by now. But secondly, banks are especially tightly regulated, and they just can't do that sort of thing, period.

Banks are only allowed to collect information about you that they need to run their business: in the case of credit cards, that means where you made a purchase and for how much, but not what you bought. And their own technology prevents them from collecting that information: they deliberately separate the credit card reader machines from the cash registers. A cashier enters the amount from the cash register into the reader, then you swipe your card and pay that amount. That means the bank never gets more information than the price, and the cash register never gets to see who you are or what your card number is. So the bank can't mine your product preferences, and the store can't even correlate one sale to the next. The store computer simply doesn't know who you are.

The bank could mine your store preferences, but they're not allowed to. And every single thing a bank does is examined carefully by government regulators, so it just doesn't happen. And even if it did happen, if they ever sold that information to someone else, the regulators would certainly have a heart attack. Banks only mine this information for security and fraud detection purposes. Or, in the case of CIBC, what seems to be an honest feature they created to actually help their customers. I know, I can't believe it either.

For stores, on the other hand, there are various ways to get around the mining restrictions. Any time you give a store your personal information, it becomes an information free-for-all. For example, loyalty/points cards do go through the cash register, precisely because the store wants to correlate your purchase habits. And cross-store cards like Air Miles mean advertisers can, and very much do, correlate your purchasing habits across stores. You didn't think all those stores just wanted to give you free stuff for fun, did you? That (combined with my laziness) is one reason I avoid loyalty cards.

I have no idea whether any of the above is true in the U.S. I do know they have rather, er, lax or nonexistent privacy laws. So watch out.

(*) Disclaimer: Everything I write anywhere is, at most, my own personal opinion, and sometimes not even that. I don't represent anyone or anything in the banking industry, least of all my employer, past employers, future employers, or clients, none of whom I expect to have any opinion on any topic whatsoever. But I don't represent them when I say that either. Please don't sue me. Thanks.

December 13, 2006 14:52

2006-12-15 »

Notoriety

I don't have a point. I just needed to link to this cartoon, because it's astonishingly true. Don't forget to read the mouseover.



December 15, 2006 14:04

2006-12-18 »

Alumnit

The first annual Alumnit Christmas party seems to have been a success, with almost nobody killed or MIA. And they even had special fireworks just for us!

Reality

During the aforementioned party someone from next door came to visit to ask us to keep the door closed, because when it's open noise will leak into the hallway and then into other people's rooms. They said this was because they were "filming a movie."

I've now gotten some clarification: two apartments on my floor are actually being rented to a Quebecois reality TV show which has been going on for several weeks. This explains in a much more understandable, yet somehow disappointing way, why I've met so many astonishingly beautiful French women wandering the halls around there lately.

December 18, 2006 14:57

2006-12-24 »

Developer salary negotiations

Here's an excellent article about why developers shouldn't have to negotiate about salary. Short answer: they're bad at it and that's one of the reasons you hired them. But the article says it all much more eloquently than that.

December 24, 2006 21:32

2006-12-27 »

Please, please, steal my idea!

Now that I'm once again spending my time in the outside world, which is mostly Weaverless (and generally happy about it), there are a few things that grate on me endlessly.

The #1 worst one, by far, is everybody else's DHCP servers. In Weaver, we spent rather heroic amounts of effort making sure the DHCP server could do two things reliably: first, your workstation will always get assigned the same address every time it makes a request (as long as that address is still available). And second, if you moved a workstation from a static IP to a DHCP-assigned IP, the DHCP server would keep handing you that old static IP you used to use, as long as it was in the DHCP pool.

Everybody else's DHCP server software just gives out a random address from the pool every time. If you release/renew, you get a different random address. And if you switch a node from static to dynamic, it gets a different random address, just like everyone else.

The "everybody else" behaviour leads to a feature requirement that everyone seems to implement, called "DHCP reservations," in which you can (manually) tell the server that a particular MAC address should always get a particular IP address. If you do this for every workstation, you can get reasonable release/renew behaviour, because your dynamic addresses aren't dynamic anymore. Well, good for you. Weaver went for years without this feature, because the way it hands out DHCP addresses makes the feature perfectly unnecessary. Just set your node to the IP you want it to use, ping something, and switch it back to DHCP, and it'll keep getting assigned the same IP forever. (Eventually we implemented reservations because it's easier to give people what they want than to convince them they don't need it.)

A related Weaver feature that I sorely miss is auto-registration of hostnames in the DNS based on their WINS broadcasts. You're not supposed to do this, of course; WINS broadcasts are for WINS, not DNS. But it's so handy, because just starting up samba on a Linux machine suddenly makes its name show up in DNS. Windows machines don't care about DNS, of course, so Windows users never cared about this feature, but if you run Linux on your network, the feature reduces your DNS name maintenance (and the need to refer to local hosts by IP address) down to approximately zero.

Now, really, if I had just one of these features - a sensible DHCP server or an auto-registering DNS - my life would be okay, because I'd plug in a new machine, and with at most a one-time admin effort, it would have a name and an effectively static IP. But with neither - which is what all non-Weaver networks have - you have machines that hop IP addresses all the time and no way to register those machines in DNS! So you can't refer to them by name or by number. Just shoot me.

So here's my plea: someone, anyone, please steal the DHCP server idea and put it into your DHCP server software. Its basic form is trivially easy: when someone does a DHCPRELEASE, ignore them. (It's effectively a "please be stupid" request, and there's no law requiring you to be stupid.) When they do a DHCPDISCOVER or DHCPREQUEST, even without asking for a particular IP, look in your lease table and see if you have any (even released or expired) leases for that MAC address already; if so, give out the same one again. If not, now you can choose that random address, exactly once per MAC address. And if you're forced to choose an address but you've run out (because the above algorithm never expires a lease), reuse the oldest one first. You could even reuse DHCPRELEASEd ones before non-released ones, which maintains the spirit of DHCPRELEASE without being stupid.
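The whole policy fits in a page of code. Here's a sketch of it in Python; all the class and method names are mine, and no real DHCP server exposes this interface, but the algorithm is exactly the one described above: ignore DHCPRELEASE, always re-offer a MAC's previous address, and reclaim the least recently used lease only when the pool is truly exhausted.

```python
import itertools

class StickyLeasePool:
    def __init__(self, addresses):
        self.pool = list(addresses)
        self.lease = {}            # mac -> ip; never proactively expired
        self.last_used = {}        # ip -> logical timestamp
        self.clock = itertools.count()

    def release(self, mac):
        # DHCPRELEASE is effectively a "please be stupid" request,
        # and there's no law requiring us to be stupid: ignore it.
        pass

    def request(self, mac):
        ip = self.lease.get(mac)
        if ip is None:
            free = [a for a in self.pool if a not in self.last_used]
            if free:
                ip = free[0]       # genuinely new client: pick once
            else:
                # pool exhausted: reclaim the least recently used lease
                ip = min(self.last_used, key=self.last_used.get)
                self.lease = {m: a for m, a in self.lease.items()
                              if a != ip}
            self.lease[mac] = ip
        self.last_used[ip] = next(self.clock)
        return ip

# The behaviour I'm pleading for: release/renew gives you the same
# address back, forever, with no manual "reservation" required.
pool = StickyLeasePool(["10.0.0.2", "10.0.0.3"])
ip = pool.request("00:11:22:33:44:55")
pool.release("00:11:22:33:44:55")
assert pool.request("00:11:22:33:44:55") == ip
```

Note that nothing here is harder than handing out random addresses; the only extra state is a lease table you'd need anyway.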

If you think about it, the above isn't actually any harder at all than the boneheaded algorithm everyone uses. But it's so much better it makes my head explode. Ouch. There it goes now.

December 28, 2006 10:45

2006-12-28 »

Influence

Here's a very concise summary of six factors people can use to manipulate you. The explanations are great, but the suggested self-defense techniques are pretty useless. Somehow that doesn't surprise me. Still, just being aware of these techniques can make you a bit more aware when people are using them on you.

December 28, 2006 20:42

2006-12-31 »

Okay, this is the very last entry from back when I was having my kidney vacation. My mindset has changed a bit since I wrote this in my notebook, but I'll paste it here anyway for posterity, and because the Crazy Pills were a pretty good source of unusual thinking.

Delirium 4: Isolation

(Written around September 1st, 2006)

Before, I wrote about my precious instability that comes from keeping all my options open, constantly re-evaluating which direction I'll choose, but using a generally convergent decision algorithm such that it seems like I'm consistent.

Given my subsequent career adjustments, I think I can claim that I at least wasn't totally lying about the whole thing.

But two recent events outside of my control have reminded me how, despite my best efforts, I've been "stabilized" more by the system than I had realized. Luckily for me, fate was conspiring to unlock my PLL, so to speak, and I'm more free now than I had ever realized I wasn't.

The two particular events I'm speaking of are my kidney vacation and the theft of my laptop.

First the laptop. The whole way it was configured - Linux, ion, text files, mutt, and so on, was based on the way I did my work before - in other words, my prior career. I have a feeling that my new work will be a lot more cross-platform and a lot more high-level, involving a lot more modern tools (like .NET for example). And I probably won't be developing operating systems for a while... so why not rethink it all from scratch? I think maybe I'll get a Mac this time. (Update: I did.)

Secondly, my kidney vacation really got me thinking. No, no, not about the fragility of life and the meaninglessness of it all and that kind of fluff. Come on, it was a kidney vacation, not a kidney failure.

No, it just gave me a lot of time to think about things I never think about because I'm too busy. And because of the absurd coincidence of having my laptop stolen right before I entered the hospital, I was really disconnected from all that stuff, probably more than I have been for 10 years. I spent a lot of time just staring at the walls and thinking, and I rediscovered something very important - I still like doing that. As much as we can talk about zone time and the genius of obsession, sometimes just sitting and thinking about stuff is really what's needed.

I think the most important thing that came to me during this time is just how little actually matters to the final result. I alluded to this in Imbalance and The Genius of Obsession, but it deserves its own statement outright. Designers - like I try to be - tend to obsess about every little detail of their design and implementation, and woe betide you if you stand in their way. And this obsession with details is, I think, a critical part of the genius creative process itself.

But I believe it's caring about the details that matters. The details themselves, the vast, vast majority of them, actually don't matter at all. That's why so many different creative, ingenious things are created so differently by so many different kinds of people and teams in so many different places. All you need to do is assemble the essential ingredients and get the essential details right.

But which details are the essential ones? Well... I'm out of the hospital now. Perhaps I'll never know.

Attribution: this note is at least partially inspired by Stephen Wolfram's A New Kind of Science, which has an excellent discussion of how evolution actually reduces complexity rather than increasing it as most people think. His theory is that in fact, only a small number of factors need to be optimized for the system to survive - the rest are mostly just random. The argument is quite clever and I believe it can apply to all sorts of other systems, like the genius-creative systems I've been discussing.

January 2, 2007 03:29

apenwarr-on-gmail.com