Three bad things: threads, garbage collection, and nondeterministic destructors
These three programming environment "features" all have one characteristic
in common that makes them bad: non-repeatability. If you run the same
program more than once, and it uses any of those three things, then chances
are it won't run identically every time. Of course, if your program is
written correctly, it ought to produce the same effective results
every time, but the steps to produce those results might be different, and
the output itself might be different, even if effectively equivalent.
For example, in a threaded map/reduce operation, the output of each
parallelized map() will reach the reduce() phase at different times.
In theory, the output of reduce() should be the same regardless of the
ordering, but that doesn't mean reduce() is performing the same calculations.
Imagine you're running a map/reduce on an original Pentium processor with
the infamous FDIV bug, and your reduce() includes a division operation.
Depending on the order of inputs, you might or might not trigger the bug,
and your result might be different, and you'd be left wondering why. That's
the problem with non-repeatability. Even without the FDIV bug, maybe your
code is buggy, or maybe you're just introducing rounding errors or int
overflows; the ordering can change the result, and debugging it is hard.
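Even without any hardware bug, reduction order alone can change a floating-point result, because floating-point addition is not associative. A minimal sketch (plain Python, no actual parallelism needed to show the effect):

```python
from functools import reduce

# The same three values, reduced in two different arrival orders.
# Floating-point addition is not associative, so the order in which
# parallel map() outputs reach reduce() can change the answer.
values_one_order = [1e16, 1.0, -1e16]
values_other_order = [1e16, -1e16, 1.0]

left_to_right = reduce(lambda a, b: a + b, values_one_order)
reordered = reduce(lambda a, b: a + b, values_other_order)

print(left_to_right)  # 0.0 -- the 1.0 is absorbed into 1e16 and lost
print(reordered)      # 1.0 -- the big values cancel first, so 1.0 survives
```

Both runs "reduce the same values," yet one silently loses a term to rounding. If thread scheduling picks the order, the result is non-repeatable.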
A more common problem is well known to anyone using threads: if you don't
put your locks in the right places, then your program won't be "correct",
even if it seems to act correctly 99.999% of the time. Sooner or later, one
of those race conditions will strike, and you'll get the wrong answer. And
heaven help you if you're the poor sap who has to debug it then, because
non-reproducible bugs are the very worst kind of bugs.
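The classic instance of this is an unsynchronized counter: `counter += 1` is a read-modify-write, so two threads can read the same old value and one update gets lost - sometimes. A sketch of both versions (the racy function is shown but deliberately not run, since whether it loses updates on any given run is exactly the non-repeatable part):

```python
import threading

counter = 0
lock = threading.Lock()

def unsafe_increment(n):
    # Racy: counter += 1 is read-modify-write. Two threads can read the
    # same old value, and one increment vanishes -- but only sometimes,
    # depending on scheduling. Non-repeatable by nature.
    global counter
    for _ in range(n):
        counter += 1

def safe_increment(n):
    global counter
    for _ in range(n):
        with lock:        # the lock makes the read-modify-write atomic
            counter += 1

threads = [threading.Thread(target=safe_increment, args=(100_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # always 400000 with the lock; without it, possibly less
```

With the lock, every run prints the same number. Swap in `unsafe_increment` and the answer becomes a roll of the scheduler's dice - the 99.999%-correct kind of bug described above.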
Garbage collection and non-determinism
But everyone knows that locking is the main problem with threads. What about
the others?
Non-reproducible results from garbage collection go hand-in-hand with
non-deterministic destructors. Just having a garbage collector thread run
at random times can cause your program's GC delays to move around a bit, but
those delays aren't too important. (Except in real-time systems, where GC
is usually a pretty awful idea. But most of us aren't writing those.)
But non-deterministic destructors are much worse. What's non-deterministic
destruction? It's when the destructor (or finalizer) of an object is
guaranteed to run at some point - but you don't know what point. Of course,
the point when it runs is generally the point when the GC decides to collect
the garbage.
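You can watch this happen in CPython. Reference counting frees most objects promptly, but a reference cycle has to wait for the cycle collector - which is exactly the "guaranteed to run eventually, but you don't know when" behavior described above. A minimal sketch, with automatic collection disabled so the timing is visible:

```python
import gc

gc.disable()  # make the collector's timing explicit for the demo

log = []

class Handle:
    def __del__(self):
        # Runs whenever the collector gets around to this object --
        # not at any point the surrounding code can predict.
        log.append("finalized")

a = Handle()
b = Handle()
a.partner, b.partner = b, a   # reference cycle: refcounting can't free it
del a, b

print(log)        # [] -- both objects are unreachable, destructors NOT run
gc.collect()      # only now does the cycle collector finalize them
print(log)        # ['finalized', 'finalized']
```

Between `del` and `gc.collect()`, the objects are dead but their destructors haven't run. In a real program, that gap closes whenever the GC feels like it.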
And that's when the non-repeatability starts to really become a problem. A
destructor can do anything - it can poke at other objects, add or
remove things from lists or freelists, send messages on sockets, close
database connections. Anything at all, happening at a random time.
Smart people will tell you that, of course, a destructor can do anything,
but because you don't know when it'll run, you should do as little in the
destructor as possible. In fact, most objects don't even need
destructors if you have garbage collection! Those people are right -
mostly. Except "as little as possible" is still too much, and as soon as
you have anything at all in your destructor, it starts spreading like a
cancer.
In the .NET world, you can see this problem being hacked around every time
you see a "using" statement. Because destructors in .NET are
non-deterministic, some kinds of objects need to be "disposed" by hand -
back to manual memory management. The most common example seems to be
database handles, because some rather lame kinds of databases slurp huge
amounts of RAM per handle, and your web app will grind to a halt if it
produces too many queries in too short a time without explicitly freeing
them up.
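Python's with statement plays the same role as .NET's using: it calls a cleanup method at a known point, deterministically, instead of waiting for the GC. A sketch with a hypothetical DbHandle class (the name and its close() method are made up for illustration):

```python
class DbHandle:
    """Hypothetical handle holding an expensive resource (e.g. a DB connection)."""
    def __init__(self):
        self.open = True

    def close(self):
        self.open = False

    # Context-manager protocol: Python's analog of .NET's using/IDisposable.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()      # runs at block exit, exception or not
        return False      # don't swallow exceptions

with DbHandle() as h:
    assert h.open         # resource is live inside the block

# The handle is closed here, deterministically -- no waiting for the GC.
print(h.open)  # False
```

Note what this really is: manual resource management dressed up in nicer syntax. The programmer, not the collector, still has to remember to wrap every handle.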
But no problem, right? You can just get into the habit of using using()
everywhere. Well, sort of. Unfortunately, objects tend to get included
into other objects (either using inheritance or just by including member
objects). What if one of those member objects should be dispose()d
explicitly when your container object is destroyed? Well, the containing
object now needs to implement its own dispose() that calls its member
objects' dispose(). But not all of them; only the members that actually
have a dispose(). Which breaks encapsulation, actually, because if
someone adds a dispose() to one of those member objects later, you'll have
to go through all your containing objects and get them to call it. And if
you have a List