Newlines and other special characters in Unix filenames
Did you ever wonder why Unix allows newline characters in filenames? Did you ever run into filenames that start with "-" and cause problems with commands like rm? Have you ever created a directory full of files whose names were entirely just strings of " " (space) characters of varying lengths?
Unix allows filenames to be any arbitrary sequence of bytes, except '\0' and '/'. Fixing Unix/Linux/POSIX Filenames, by David A. Wheeler, is an essay about why it shouldn't.
To be honest, when I first followed a link to that essay, I was expecting to disagree. After all, the extreme flexibility of Unix filenames is a feature, not a bug! Writing programs and doing experiments is so much easier when your operating system doesn't make arbitrary restrictions!
But after reading the whole thing, I'm convinced. At the very least, disallowing characters 0x01-0x1f in filenames is completely sane, and would solve all kinds of problems with Unix scripting. It would make programming easier.
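To make the scripting problem concrete, here is a minimal sketch in Python. It assumes a Unix-like filesystem that accepts these names. The helper is_risky_name, the demo function, and the three sample filenames are all hypothetical, made up for illustration. The helper flags names that contain a character in the 0x01-0x1f range or that start with "-", and the demo shows how a script that reads one filename per line miscounts the files.

```python
import os
import tempfile


def is_risky_name(name: str) -> bool:
    """Hypothetical check: control characters 0x01-0x1f, or a leading '-'."""
    return name.startswith("-") or any(0x01 <= ord(ch) <= 0x1F for ch in name)


def demo():
    """Create three sample files and compare real vs. newline-split counts."""
    with tempfile.TemporaryDirectory() as d:
        # Sample names are illustrative only; all are legal on Unix.
        for name in ["plain.txt", "two\nlines.txt", "-rf"]:
            open(os.path.join(d, name), "w").close()
        names = sorted(os.listdir(d))
    # What a line-oriented script sees if it reads one name per line.
    naive = "\n".join(names).split("\n")
    risky = [n for n in names if is_risky_name(n)]
    return len(names), len(naive), risky


if __name__ == "__main__":
    real, seen, risky = demo()
    print(real, "files, but a newline-split listing sees", seen)
    print("risky names:", risky)
```

With these sample names there are 3 files, but the newline-split listing yields 4 entries, because the embedded newline cuts one name in two. If control characters were disallowed, the line-based approach would simply be correct.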
- Restricting filenames: "Programs assume it, standards permit it,
operating systems already do it."
-- David A. Wheeler
But no, prohibiting files from being named "com1.txt" for historical reasons (which Windows does - try it!) is still not okay.
The value of professional copy-editing
Working with a "real live" publisher is really interesting. This is the second time I've done it,(2) and both times it's been a very positive experience.
After hanging out on the Internet for a while, you almost stop realizing how low-quality the content is. I'm generalizing here, but I include this very journal, and in fact, this very article, in my generalization. I think I'm a pretty decent writer, but there's no doubt in my mind: a professional editor can help you say the same thing, only better.
Once upon a time, I imagined that working with an editor would be annoying; that I'd waste a lot of time fighting over how to phrase something and how my way is obviously better and gets the point across more clearly, or how your way doesn't even say the same thing. That's not really how it works. A really great editor fixes the words without changing the meaning.
In fact, thinking about how good this article could be, but isn't, makes me feel self-conscious about writing it.
It's not just me. I've been a long-time reader of Robert X. Cringely, who used to write columns for PBS and who now writes his own blog instead. (Follow the links and compare.) You can tell it's the same guy with the same clever insights, but the quality drop from then to now is tangible and saddening. His insights just don't seem as clever; he exaggerates and he blows his own horn a bit too much, which presumably PBS wouldn't let him do. He was better off with the restrictions.
A while ago I wrote about the Harvard Business Review, which contains, in every issue I have ever read, multiple articles with more insight than I have seen anywhere on the Internet. That's a serious claim, and I make it seriously. In their case it's not just editors, but peer review and very high publishing standards, that make sure the quality is unreasonably high.
I don't have any particular love for the dead tree format of publishing. But much worse than the change in medium is the associated change in editorial standards. It used to be so expensive to publish that it was worth paying someone to first make sure it was good; nowadays, it's so cheap to publish that the cost of hiring a copyeditor and fact checker outweighs all the other costs.(3) Thus the quality of published work has degraded badly over time.
The end result is odd: reduced publishing costs should leave more money for editing and fact checking. Instead, people think those costs should drop at the same speed, which is unreasonable unless you cut quality.
Would you pay more for quality, edited work? Really? I like to think I would... but aside from the occasional purchase of an issue of HBR, my actual behaviour says otherwise.
(1) The book is finally available for pre-ordering on amazon.com! In case you're curious, my main contributions were the introductory sections on git blobs, trees, and commits, and the later chapters about git-svn integration and git submodules. (Sadly, the book was written before I made git subtree, so I didn't get a chance for any free advertising.) You'll find my name in the Acknowledgements section that nobody ever reads.
(2) The first time was back in 2004, an article in Wired Magazine about Autonomic Computing and Nitix. The online version doesn't really do it justice; in the print version, my article was on the page beside an illustration from The Fantastic Voyage. Woo hoo! My conclusion at the time: the reason everything in Wired sounds so cool (regardless of reality) is that their editors are capable of taking anything and making it sound cool.
(3) This reminds me of an analogy to selling software: more expensive software is expected to have more expensive support costs. If you buy software for $69 in a store, the tech support had better be free. But if you pay $100,000 for "enterprise" software, they're going to charge you at least $100/hour for support... and who cares? $100 compared to $100,000 is not even worth negotiating about. Or similarly, Microsoft has to drop the price of Windows in order to sell it on Netbooks, because Netbook hardware is cheaper. It doesn't actually make sense, but people expect it anyway.
Update (2009/05/06): Fixed a typo, after several people reported it. Okay, I guess the alternative to professional copyediting is to just post your stuff on the 'net and hope your friends correct you before your enemies do.
A few days ago someone mentioned to me a project that they'd worked on during their second co-op work term at the University of Waterloo. That reminded me of my second work term, which was just over 11 years ago (!!) now.
That job was in Montreal, at a company designing PowerPC-based videoconferencing systems. My job was to get their new (and very buggy) PowerPC motherboard booting with the embedded OS they had licensed, called pSOS.
I learned a bunch of things in order to do that job, notably how to use a logic analyzer to examine transactions on the PCI bus. Logic analyzers are tons of fun. But that's not the point of this story.
The point of this story is the PowerPC assembly language, specifically my favourite instruction of all time, eieio. It's short for "enforce in-order execution of I/O." A milder company might have named that instruction "barrier" or "sync" or something. But not Motorola.
(My first ever assembly language experience was on Motorola's 8-bit 6809 processor. They had a funny instruction there too: sex. It was short for "sign extend." Probably every processor ever made has a sign extend instruction, but nobody else was willing to call theirs "sex.")
Anyway, eieio. It's an awesome instruction because almost nobody ever needs to use it, but the people who do need it need it a lot. And those people, being low-level software developers, have a fun sense of humor. So you'd get code like this:
    something something
    something something
    eieio
    something something
    something something
    eieio
You could sing your source code to the tune of Old MacDonald Had a Farm. It was awesome.
Larry Smith on contract work
- I tell students, "You should be looking for work, not a job." It
applies to everyone. This economy will accelerate the move towards contract
work. The expectation of permanent employment is for minds from the 1950s.
To really be useful to your organizations - you need to build flexibility
into your mindset and your skill
-- Larry Smith
git-subtree v0.2 is released
The new version has:
- real documentation!
- a new "--squash" option that helps avoid polluting your project history
It's been tested with git 1.6, but may work with earlier versions.
You can get it from git-subtree on github. Click the "download" button or clone the repository.