Me: We wouldn't be so *presumptuous* as to assume we could just make a product without talking to actual users in the market first. What do you think we would do, just show up and *tell* you what you want?
Co-worker: You can't actually say that with a straight face, can you?
August 7, 2010 23:58
These three programming environment "features" all have one characteristic in common that makes them bad: non-repeatability. If you run the same program more than once, and it uses any of those three things, then chances are it won't run identically every time. Of course, if your program is written correctly, it ought to produce the same effective results every time, but the steps to produce those results might be different, and the output itself might be different, if effectively identical.
For example, in a threaded map/reduce operation, the output of each parallelized map() will reach the reduce() phase at different times. Supposedly, the output of reduce() should be the same regardless of the ordering, but it doesn't mean reduce() is performing the same calculations.
Imagine you're running a map/reduce on an original Pentium processor with the infamous FDIV bug, and your reduce() includes a division operation. Depending on the order of inputs, you might or might not trigger the bug, and your result might be different, and you'd be left wondering why. That's the problem with non-repeatability. Even without the FDIV bug, maybe your code is buggy, or maybe you're just introducing rounding errors or int overflows; the ordering can change the result, and debugging it is hard.
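You don't need a buggy Pentium to see this. Here's a minimal sketch in Python, assuming a reduce() that just adds floating-point numbers in whatever order they arrive. The values in `map_outputs` are made up for illustration, and each permutation stands in for one possible arrival order from the threads.

```python
from functools import reduce
from itertools import permutations
import operator

# Hypothetical map() outputs, chosen so that rounding becomes visible.
map_outputs = [1e16, 1.0, -1e16]

def reduce_sum(values):
    """A reduce() step that adds values left to right, in arrival order."""
    return reduce(operator.add, values, 0.0)

# Each permutation stands in for one possible arrival order.
results = {order: reduce_sum(order) for order in permutations(map_outputs)}
distinct_results = sorted(set(results.values()))

for order, total in results.items():
    print(order, "->", total)
print("distinct answers:", distinct_results)
```

Same inputs, same reduce(), and the answer depends on nothing but the ordering: 1e16 + 1.0 rounds back to 1e16 in double precision, so the 1.0 only survives if the two big numbers cancel first.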
A more common problem is well-known to anyone using threads; if you don't put your locks in the right places, then your program won't be "correct", even if it seems to act correctly 99.999% of the time. Sooner or later, one of those race conditions will strike, and you'll get the wrong answer. And heaven help you if you're the poor sap who has to debug it then, because non-reproducible bugs are the very worst kind of bugs.
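A rough sketch of the classic version of this, assuming nothing fancier than a shared counter. The function name and the thread and iteration counts are arbitrary choices for the example. Whether the unlocked run actually loses updates depends on your interpreter version and your luck, which is exactly the point.

```python
import threading

def count_with_threads(n_threads, n_increments, lock=None):
    """Increment a shared counter from several threads; return the final value."""
    state = {"count": 0}

    def worker():
        for _ in range(n_increments):
            if lock is not None:
                with lock:
                    state["count"] += 1
            else:
                # Unprotected read-modify-write: another thread may run
                # between the read and the write, losing an update.
                state["count"] += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]

expected = 4 * 50_000
locked_total = count_with_threads(4, 50_000, lock=threading.Lock())
unlocked_total = count_with_threads(4, 50_000)
print("expected:     ", expected)
print("with lock:    ", locked_total)
print("without lock: ", unlocked_total, "(may or may not match)")
```

The locked version always matches. The unlocked version might match on every run you ever try on your own machine, and then come up short in production.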
Garbage collection and non-determinism
But everyone knows that locking is the main problem with threads. What about the others?
Non-reproducible results from garbage collection go hand-in-hand with non-deterministic destructors. Just having a garbage collector thread run at random times can cause your program's GC delays to move around a bit, but those delays aren't too important. (Except in real-time systems, where GC is usually a pretty awful idea. But most of us aren't writing those.)
But non-deterministic destructors are much worse. What's non-deterministic destruction? It's when the destructor (or finalizer) of an object is guaranteed to run at some point - but you don't know what point. Of course, the point when it runs is generally the point when the GC decides to collect the garbage.
And that's when the non-repeatability starts to really become a problem. A destructor can do anything - it can poke at other objects, add or remove things from lists or freelists, send messages on sockets, close database connections. Anything at all, happening at a random time.
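To make that concrete, here's a small sketch in Python. It leans on CPython-specific behaviour: an object caught in a reference cycle is only freed by the cycle collector, so its `__del__` runs whenever that collector runs. The `Connection` class and its log are hypothetical, and `gc.disable()` is just a stand-in for "the GC hasn't gotten around to it yet".

```python
import gc

class Connection:
    """Hypothetical resource whose cleanup happens in a finalizer."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.myself = self  # reference cycle: only the cycle collector frees this

    def __del__(self):
        # A side effect that happens whenever the collector happens to run.
        self.log.append("closed " + self.name)

def finalizer_timing():
    log = []
    was_enabled = gc.isenabled()
    gc.disable()              # pretend the collector is busy elsewhere
    try:
        conn = Connection("db", log)
        del conn              # last reference dropped, but nothing happens yet
        before = list(log)
        gc.collect()          # the collector finally runs
        after = list(log)
    finally:
        if was_enabled:
            gc.enable()
    return before, after

before, after = finalizer_timing()
print("after del:       ", before)
print("after collection:", after)
```

The "close" happens at the collection, not at the `del`. In a real program you don't get to choose when that collection is.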
Smart people will tell you that, of course, a destructor can do anything, but because you don't know when it'll run, you should do as little in the destructor as possible. In fact, most objects don't even need destructors if you have garbage collection! Those people are right - mostly. Except "as little as possible" is still too much, and as soon as you have anything at all in your destructor, it starts spreading like a cancer.
In the .NET world, you can see this problem being hacked around every time you see a "using" statement. Because destructors in .NET are non-deterministic, some kinds of objects need to be "disposed" by hand - back to manual memory management. The most common example seems to be database handles, because some rather lame kinds of databases slurp huge amounts of RAM per handle, and your web app will grind to a halt if it produces too many queries in too short a time without explicitly freeing them up.
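For illustration, here's roughly the same pattern in Python rather than C#: the `with` statement plays the role of "using", and `__exit__` plays the role of dispose. The `DbHandle` class and its counter are hypothetical, made up to show that the handle is released at a known point rather than whenever the collector feels like it.

```python
class DbHandle:
    """Hypothetical handle that holds an expensive resource until closed."""

    open_handles = 0  # how many handles are currently holding resources

    def __init__(self):
        DbHandle.open_handles += 1
        self.closed = False

    def close(self):
        if not self.closed:
            self.closed = True
            DbHandle.open_handles -= 1

    # Context-manager protocol: roughly what dispose is to a C# "using".
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions

def run_queries(n):
    """Open and release n handles one after another; return the peak count."""
    peak = 0
    for _ in range(n):
        with DbHandle():
            peak = max(peak, DbHandle.open_handles)
    return peak

peak_open = run_queries(100)
print("peak open handles:", peak_open)
print("still open afterwards:", DbHandle.open_handles)
```

A hundred queries, never more than one handle open at a time. But note who is doing the work: you are, by hand, at every call site.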
But no problem, right? You can just get into the habit of using using() everywhere. Well, sort of. Unfortunately, objects tend to get included into other objects (either using inheritance or just by including member objects). What if one of those member objects should be dispose()d explicitly when your container object is destroyed? Well, the containing object now needs to implement its own dispose() that calls its member objects' dispose(). But not all of them; only the members that actually have a dispose(). Which breaks encapsulation, actually, because if someone adds a dispose() to one of those member objects later, you'll have to go through all your containing objects and get them to call it. And if you have a List
- In another dispute, over which province should own what is now the
northwest of Ontario, the [British] Judicial Committee sided with Ontario
against Manitoba and Ottawa. The ruling still makes no sense. The people of
the region still treat Winnipeg as their national capital. Why? Because it
is their geographical capital. Toronto and the rest of Ontario belong to a
distant, different world.
-- John Ralston Saul, A Fair Country (p.162)
That was in the early 1900's, apparently.
I always wondered how the screwy shape of Ontario had come about; it figures. The actual Canadian federal government (Ottawa) thought it would make sense to lump us in with Manitoba, but somehow the British overseers thought otherwise.
If you're from Northwestern Ontario, here's a fun game you can play with your friends from Southern Ontario. First, take a paper map of Ontario. (I know, paper maps? What are those?) They're generally printed with Southern Ontario on one side and Northern Ontario on the other. In Southern Ontario, find two towns that are about an hour apart, and point them out on the map.
Now flip over the map and find two towns that look about the same distance apart, and ask your friend to estimate how far apart they are. See if they remember to check the map scale - most people don't realize that the Northern Ontario side is drawn much smaller, because the land is absolutely huge by comparison and has a much lower population density.
Now imagine you're working for the Ontario government - down in Toronto -
and you still haven't realized this.
August 10, 2010 14:00
- "Look, theory says that a JIT can run as fast as, or maybe faster than, a
statically compiled language. It might be slow right now, but it'll be much
better when we get a real/better JIT. Plus, the new version is already a
lot faster, and I'm looking forward to the next version, which they promise
will have huge speed improvements."
-- Every Java user since 1996
If you've been saying the above about your Android phone (or Blackberry), then you, too, have become part of the decade-and-a-half-long train wreck of computer science that is Java.
I'm often mystified at the rejection of reality displayed by the proponents of Java-like virtual machines. It seems a simple statement of fact: even after 14 years, Java is still much slower than native code, and you can see it clearly just by looking at any app for 10 seconds. And yet the excuses above keep coming. 14 years.
But then I think, I know how this delusion works. I've been guilty of it myself. At my first company, I pushed to have all our data interchange sent through an API that I designed - UniConf - which was unfortunately slower in almost all cases than not using it. The idea was that if only all our code could be 100% pure UniConf then we'd suddenly be able to realize tons of wonderful advantages.
But despite herculean efforts, the advantages never materialized. What materialized was a lot of slowness, a lot of excessive memory usage, and a lot of weird bugs that forced us to backtrack through seven layers of overly-generalized code to diagnose.
Luckily for me, lack of resources prevented my own madness from spreading too far. I'm much better now.1
But what would it be like if the madness had been successful? What if I had been responsible for a system that spread to millions of users worldwide, which in nearly every case made things visibly and obviously worse? What would that do to my psyche? I think it would be unbearable.
Which brings us to Java-like VMs on cell phones. I have a lot of sympathy here, because:
Java used to be a good idea. Really.
Java on cell phones has not always been obviously a bad idea. To see why, you have to understand a bit about how these systems evolved.
First of all, we have little visibility into Java's original reason for being. We know what people said, but we don't know if they said that for marketing or retroactive justification. What we do know is that the original sales push behind Java was applets for your web browser. Rich, client-side web applications.
Client-side web applications have exactly one super difficult critical requirement: security. You're downloading random apps from the Internet automatically and you want to run them automatically, and some of these apps will definitely be written by evil people and try to screw you, so you need a defense mechanism. Moreover, most people doing this will be doing it on Windows, which at the time meant Windows 95, which had no actual security whatsoever. Any native code could do anything it wanted. This situation persisted, mostly, up to and including Windows XP. (NT-based kernels have security, but the average person just ran everything as an administrator, negating literally all of it.)
So the typical user's operating system provided no strict memory protection or any other security features. This is where Java made perfect sense: if you can provably enforce security at the application layer, you can make a virtual machine that actually includes these missing security features, thus making it safe to run random applications on the Internet, and propelling us into the Internet Age. Sweet.
Unfortunately, nowadays the vast majority of Java apps never use any of Java's security features; they run as apps with full user rights, either on the client or on the server. So that advantage of the VM is gone... and the Java VM has no other advantages.3 But people, having been fooled once, kept going on the path they were already on.
Now ironically, the real problem was not natively compiled languages, but Windows (or to be generous to Microsoft, "the operating systems at the time"). Anybody who has studied computer science knows that modern processors capable of virtual memory were designed around the idea of keeping untrusted apps under control. Once upon a time, people used to actually share time on Unix machines. Lots of people on a few machines. And they were largely prevented from stomping on each other. The exceptions were security holes - fixable mistakes - and VMs have those too.
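Here's a tiny sketch of the most basic version of that protection, assuming an ordinary desktop or server OS with protected memory. The child process deliberately reads address 0 through `ctypes`, the way buggy or hostile native code might, and the parent just watches it die. The function name is made up for the example, and this only shows crash containment, not a full sandbox.

```python
import subprocess
import sys

# Child program: deliberately reads address 0 via ctypes, i.e. it misbehaves
# the way buggy or hostile native code might.
CHILD_CODE = "import ctypes; ctypes.string_at(0)"

def run_misbehaving_child():
    """Run the bad child in its own process; return its exit status and our data."""
    parent_data = [1, 2, 3]
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # The child failed, but it had no way to reach into this process's memory.
    return result.returncode, parent_data

returncode, parent_data = run_misbehaving_child()
print("child exit status:", returncode)
print("parent data intact:", parent_data)
```

The child gets a nonzero exit status (the exact value depends on the platform) and the parent carries on. No VM was involved in stopping it; the processor and the kernel did that.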
It is really not that hard to lock an application into a protected environment when your processor includes security features. Just google for chroot, BSD jail, AppArmor, SELinux. Yes, some of them are a little complex, but security is complex; nobody ever claimed Java's security architecture was simple.
Of course, if I had said that five years ago, you might not have believed me; you might have said those systems weren't secure enough, and that Java was somehow more secure in ways you couldn't quantify, but that application-level VM security is just somehow better somehow, I mean look at the virus situation on Windows. And I wouldn't be able to argue with you, because that's not even a logical argument, but it sounds vaguely convincing. And so the world went.
Then Apple came along and made the iPhone and its App Store and all the apps are native and the thing is still secure and apps can't stomp all over the system. Again, modulo security holes - fixable mistakes - which VMs don't eliminate. Here everybody was, going along with the above illogical argument in favour of VM security because they couldn't argue with it, and Apple just ignored them and showed it was all wrong. You can make native code secure. Of course you can. People did it in the 1980's. What on earth were we thinking?
But I'm getting ahead of the story a bit. Now I've told you why Android's use of a Java-like VM was demonstrably wrong (Apple demonstrated it) from the beginning, but first I wanted to tell you why Blackberries use Java, and lots of old cell phones used Java, and that wasn't obviously wrong.
The reason, of course, is that when Java was first applied to mobile phones, mobile phones didn't have processors capable of protected memory. Those processors were really low powered; security was impossible. Before Java, you could write custom native apps for a Blackberry... as long as you gave your source code to RIM to have them review it. Because native code could do anything, and there was physically no way to stop it once it got onto the device. Other phone manufacturers didn't even bother.
At the time, the first inexpensive embedded processors supporting protected memory were years in the future. If you could have a way to safely load third-party apps onto your phone... well, wow. You'd rule the world. You wouldn't just have a phone, you'd have a platform. This was not silliness, not at all. A Java VM was the first serious possibility of making a mobile phone into a serious, flexible, reconfigurable application platform.
It didn't work out very well, mostly because of Java's slowness and crappiness and licensing and (in the case of Java ME) horrendous lack of standardization. But GMail and Google Maps worked on my Blackberry, and millions of enterprise Blackberries are deployed running thousands of custom legacy enterprise apps you've never heard of that will make transitioning big established companies from Blackberry to iPhone virtually impossible for many years. In this case, pure thickheaded brute force did manage to win the day.
So okay, for the same reason that Java VMs started out as a good idea on Windows - namely, the platform itself lacked any security features - Java VMs made sense on phones. At first.
But embedded processors don't have those limitations anymore. They're serious processors now, with protected memory and everything. Most importantly, these processors were available and being used from the first day the first Google Phone was released. You no longer need a VM for security... but that means the VM doesn't provide any advantage at all.3
The fact that an Android phone has tolerable performance is, again, a triumph of pure thickheaded brute force. If you throw enough geniuses at a difficult technical problem, you might eventually solve that problem, even if the problem was stupid, and in this case, they mostly did.
But every step of the way, they're going to have this giant anchor of
UniConf Dalvik tied around their neck, and Apple won't, and
Apple's native apps will always run faster. It's going to be
Maybe the speed won't matter. Maybe computers will get so fast that you just won't care anymore.
Java users have been saying that, too, since 1996.
1 I hope
3 To pre-emptively refute a few common claims: "Write once run
anywhere" doesn't actually work because the compiler was never the main
problem; differences in OS semantics are the main problem, and you have to
solve those equally for your apps in any language, even Java. Garbage
collection can be and is frequently done in natively compiled languages.
Introspection can be done in natively compiled languages. Digital signing
of shared libraries can be implemented by any native shared library loader.
Cross-language integration can be and is done all the time in native
languages; in fact, VMs make this much harder, not easier, since now you
have to rewrite all your languages. Sensible threading primitives (which
some would say Java lacks anyway) can be implemented in any sensible
language, natively compiled or not. Profile-driven optimization can be done
in compiled languages. Support for multiple hardware architectures is just
a recompile away - just ask any Mac developer. Provable memory protection
(including prevention of all attempted null pointer dereferences) is doable
and has been done in statically compiled languages. And before anyone asks,
no, C/C++ does not do all
these things; you need a good language. My point is that the good
language needn't run in a VM; the VM is a red herring, a distraction.
August 26, 2010 19:23