An

a day keeps the doctor away

Everything here is my personal opinion. I do not speak for my employer.
Back: October 2011
Next: February 2012

2011-11-02 »

bup.options

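import sys
from bup import options  # or "import options" if you copied options.py into your own project

optspec = """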
bup save [-tc] [-n name] <filenames...>
--
r,remote=  hostname:/path/to/repo of remote repository
t,tree     output a tree id
c,commit   output a commit id
n,name=    name of backup set to update (if any)
d,date=    date for the commit (seconds since the epoch)
v,verbose  increase log output (can be used more than once)
q,quiet    don't show progress meter
smaller=   only back up files smaller than n bytes
bwlimit=   maximum bytes/sec to transmit to server
f,indexfile=  the name of the index file (normally BUP_DIR/bupindex)
strip      strips the path to every filename given
strip-path= path-prefix to be stripped when saving
graft=     a graft point *old_path*=*new_path* (can be used more than once)
"""
o = options.Options(optspec)
(opt, flags, extra) = o.parse(sys.argv[1:])

I'm proud of many of the design decisions in bup, but so far the one with the most widespread reusability has been the standalone command-line argument parsing module, options.py (aka bup.options). The above looks like a typical program --help usage message, right? Sure. But it's not just that: it's also the code that tells options.py how to parse your command line!

As with most of the best things I've done lately, this was not my idea. I blatantly stole the optspec format from git's little-known "git rev-parse --parseopt" feature. The reimplementation in Python is my own doing and includes some extra bits like [default] values in square brackets and the "--no-" prefix for disabling stuff, plus it wordwraps the help output to fit your screen. And it all fits in 233 lines of code.

I really love the idea of an input file that's machine-readable, but really looks like what a human expects to see. There's just something elegant about it. And it's *much* more elegant than what you see with most option parsing libraries, where you have to make a separate function call or data structure by hand to represent each and every option. Tons of extra punctuation, tons of boilerplate, every time you want to write a new quick command-line tool. Yuck.

options.py (and the git code it's blatantly stolen from) is designed for people who are tired of boilerplate. It parses your argv and gives you three things: opt, a magic (I'll get to that) dictionary of options; flags, a sequence of (flag,parameter) tuples; and extra, a list of non-flag parameters.

So let's say I used the optspec that started this post, and gave it a command line like "-tcn foo -vv --smaller=10000 hello --bwlimit 10k". flags would contain a list like -t, -c, -n foo, -v, -v, --smaller 10000, --bwlimit 10k. extra would contain just ["hello"]. And opt would be a dictionary that can be accessed like opt.tree (1 because -t was given), opt.commit (1 because -c was given), opt.verbose (2 because -v was given twice), opt.name ('foo' because '-n foo' was given and the 'name' option in optspec ends in an =, which means it takes a parameter), and so on.
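To make the flags and extra results concrete, here is a minimal sketch that feeds the same command line to Python's standard getopt.gnu_getopt, which follows the same conventions. It is a stand-in, not options.py itself: the hand-written short and long option strings are exactly the boilerplate an optspec saves you from, and only the options used in this example are declared.

```python
import getopt

argv = "-tcn foo -vv --smaller=10000 hello --bwlimit 10k".split()

# Hand-written equivalents of a few optspec lines. A trailing ':' (short)
# or '=' (long) marks an option that takes a parameter.
short_opts = "tcn:v"
long_opts = ["smaller=", "bwlimit="]

# flags keeps every option in command-line order; extra collects the
# non-option words, even when they appear between options.
flags, extra = getopt.gnu_getopt(argv, short_opts, long_opts)

for flag, value in flags:
    print(flag, repr(value))
print("extra:", extra)
```

In this stand-in, an option that takes no parameter is paired with an empty string.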

The "magic" of the opt dictionary relates to synonyms: for example, the same option might have both short and long forms, or even multiple long forms, or a --no-whatever form. opt contains them all. If you say --no-whatever, it sets opt.no_whatever to 1 and opt.whatever to None. If you have an optspec like "w,whatever,thingy" and specify --thingy --whatever, then opt.w, opt.whatever, and opt.thingy are all 2 (because the synonyms occurred twice). Because Python is great, 2 means true, so there's no reason to *not* just make all flags counters.
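The synonym counting can be sketched in a few lines. This is only an illustration of the behaviour described above, not the code in options.py; count_synonyms and its arguments are hypothetical names, and it uses a plain dict instead of the attribute-style opt object.

```python
def count_synonyms(groups, seen):
    """Count options so that every synonym in a group shares one total.

    groups: list of tuples of synonymous option names (hypothetical input)
    seen:   option names in the order they appeared on the command line
    """
    # Options that were never given stay None rather than 0.
    opt = {name: None for group in groups for name in group}
    for name in seen:
        for group in groups:
            if name in group:
                for alias in group:
                    opt[alias] = (opt[alias] or 0) + 1
    return opt

# "w,whatever,thingy" with --thingy --whatever on the command line
opt = count_synonyms([("w", "whatever", "thingy")], ["thingy", "whatever"])
print(opt)

if opt["w"]:
    print("a count of 2 is still true")
```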

If you write the optspec to have an option called "no-hacky", then that means the default is opt.hacky==1, and opt.no_hacky==None. If the user specifies --no-hacky, then opt.no_hacky==1 and opt.hacky==None. Seems needlessly confusing? I don't think so: I think it actually reduces confusion. The reason is it helps you write your conditions without having double negatives. "hacky" is a positive term; an option --hacky isn't confusing, you would expect it to make your program hacky. But if the default should be hacky - and let's face it, that's often true - then you want to let the user turn it off. You could have an option --perfectly-sane that's turned off by default, but that's a bit unnatural and overstates it a bit. So we write the option as --no-hacky, which is perfectly clear to users, but write the *program* to look at opt.hacky, which keeps your code straightforward and away from double negatives, while letting you use the word that naturally describes what you're doing. And all this is implicit. It's obvious to a human what --no-hacky means, and obvious to a programmer what opt.hacky means, and that's all that matters.
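Here is a small sketch of how that reads in a program. parse_hacky is a hypothetical stand-in for what the parser would hand back for an optspec containing only "no-hacky"; it is not how options.py is implemented.

```python
from types import SimpleNamespace

def parse_hacky(argv):
    """Hypothetical stand-in for an optspec whose only option is 'no-hacky'."""
    turned_off = "--no-hacky" in argv
    return SimpleNamespace(
        hacky=None if turned_off else 1,     # on by default
        no_hacky=1 if turned_off else None,  # set only when the user asks
    )

def describe(opt):
    # The program tests the positive name, so there is no double negative.
    return "hacky mode" if opt.hacky else "sane mode"

print(describe(parse_hacky([])))
print(describe(parse_hacky(["--no-hacky"])))
```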

What about --verbose (-v) versus --quiet (-q)? No problem! "-vvv -qq" means opt.verbose==3 and opt.quiet==2. The total verbosity is just always "(opt.verbose or 0) - (opt.quiet or 0)". (If an option isn't specified, it's "None" rather than 0, so you can tell the difference with options that take arguments. That's why we need the "or 0" trick to convert None to 0.)
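As a quick worked example of that arithmetic (total_verbosity is just an illustrative helper name, and the arguments stand in for opt.verbose and opt.quiet):

```python
def total_verbosity(verbose, quiet):
    """Combine the -v and -q counters; None means the flag was never given."""
    return (verbose or 0) - (quiet or 0)

print(total_verbosity(3, 2))        # -vvv -qq
print(total_verbosity(None, None))  # neither flag given
print(total_verbosity(None, 1))     # -q only
```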

Sometimes you want to provide the same option more than once and not just have it override or count previous instances. For example, if you want to have --include and --exclude options, you might want each --include to extend, rather than overwrite, the previous one. That's where the flags list comes in; it contains all the stuff in opt, but it stays in sequence, so you can do your own tricks. And you can keep using opt for all the options that don't need this special behaviour, resorting to the flags array only where needed. See a flag you don't recognize? Just ignore it, it's in opt anyway.
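For instance, an ordered include/exclude filter might walk the flags list like this. The flags value is written out by hand here as an assumption about what a parser would return (with an empty string for options that take no parameter), and the "last matching rule wins" policy is just one possible choice.

```python
from fnmatch import fnmatch

# Hypothetical parse result for:
#   --include '*.py' -v --exclude 'test_*' --include test_main.py
flags = [
    ("--include", "*.py"),
    ("-v", ""),
    ("--exclude", "test_*"),
    ("--include", "test_main.py"),
]

# Build an ordered rule list; each --include/--exclude extends it.
rules = []
for flag, value in flags:
    if flag == "--include":
        rules.append((True, value))
    elif flag == "--exclude":
        rules.append((False, value))
    # Any other flag is skipped here; it is still available through opt.

def wanted(filename):
    """Last matching rule wins; files matching no rule are skipped."""
    result = False
    for include, pattern in rules:
        if fnmatch(filename, pattern):
            result = include
    return result

for name in ["app.py", "test_util.py", "test_main.py", "notes.txt"]:
    print(name, wanted(name))
```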

Options that *don't* show up in the optspec will give a KeyError when you try to look them up in opt, whether they're set or not. So given the --no-hacky option above, if you tried to look for opt.hackyy (typo!) it would crash when you try checking for the option, not just silently always return False or something.
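A minimal sketch of that fail-fast behaviour, assuming an attribute-style wrapper around a plain dict. This OptDict is written for illustration only and is not the class from options.py.

```python
class OptDict:
    """Illustrative stand-in: attribute access backed by a plain dict."""

    def __init__(self, **known):
        self._opts = dict(known)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so _opts itself
        # is found the usual way. Unknown option names raise KeyError.
        return self._opts[name]

# Every option named in the optspec gets an entry, set or not.
opt = OptDict(hacky=1, no_hacky=None)
print(opt.hacky)

try:
    opt.hackyy  # typo!
except KeyError as e:
    print("unknown option:", e)
```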

Oh yeah, and *of course* options.py handles clusters of short options (-abcd means -a -b -c -d), equals or space (--name=this is the same as --name this), doubledash to end option parsing (-x -- -y doesn't parse the -y as an option), and smooshing of arguments into short options (-xynfoo means -x -y -n foo, if -n takes an argument and -x and -y don't).
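Each of those conventions can be checked with Python's standard getopt.gnu_getopt, which follows the same rules. This is a sketch using the standard library rather than options.py, and the option letters and the parse helper are made up for the example.

```python
import getopt

def parse(args):
    # Hypothetical options: -a -b -c -d -x -y are plain flags,
    # and -n/--name takes a parameter.
    return getopt.gnu_getopt(args, "abcdxyn:", ["name="])

# Clusters of short options
assert parse(["-abcd"]) == parse(["-a", "-b", "-c", "-d"])

# Equals or space
assert parse(["--name=this"]) == parse(["--name", "this"])

# Doubledash ends option parsing, so -y is left as a plain argument
assert parse(["-x", "--", "-y"]) == ([("-x", "")], ["-y"])

# Arguments smooshed into short options
assert parse(["-xynfoo"]) == parse(["-x", "-y", "-n", "foo"])

print("all four conventions hold")
```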

Best of all, though, it just makes your programs more beautiful. It's carefully designed to not rely on any other source files. Please steal it for your own programs with the joy of copy-and-paste (leaving the copyright notice please) and make the world a better place!

Update 2011/11/04: The license has been updated from LGPL (like the rest of bup) to 2-clause BSD (options.py only), in order to ease copy-and-pasting into any project you want. Thanks to the people who suggested this.

November 4, 2011 13:35

2011-11-04 »

Avery @ StartupWeekend Kansas City, Nov 12-13

I'm planning to hang out next weekend at StartupWeekend Kansas City. I won't be starting any startups this time around, but if you're a faithful reader of my diary who hasn't unsubscribed out of boredom and you live in Kansas, let me know and maybe we can say hello while I'm in town.

Kansas City, by the way, is the site of the first major installation of Google Fiber, a project that I've been occasionally contributing to in my copious spare time.

My wife hastens to point out that it is not, however, the setting of The Wizard of Oz. Who knew Kansas City was in Missouri?

November 4, 2011 13:49

2011-11-16 »

Stuff I said at Kansas City StartupWeekend that sounded smart

I rarely get the chance to try out words of wisdom on real people before I present them to you here. So when I post something, it might turn out to be a dud, or pure gold, and I never know which. Not this time! This time you get pure, unadulterated, gold-coloured brilliance.

1. People miss the point of the "minimum viable product" (MVP... no, the *other* MVP) for startups. It does *not* mean, "release the first version with fewer features and then add more features later." No, we want a *minimum* viable product. The absolutely smallest set of features needed in order to get useful market information. How many features is that? Usually... zero. An MVP can be just a slide presentation, a sales pitch, a web site, a Google ad, or a customer conversation. The best MVPs let you objectively measure customer response *fast* and then tweak. One quick way to start is to make a web site that *claims* to offer the product you'd eventually want to build, and then gives a signup form, and then (oops!) crashes when people try to buy it (or sign up). Then make some web ads to send people there based on certain keywords. No, *not* a page that says "Coming Soon!" and asks for an email address. You want a real, live, signup page for what looks like a real, live product. You can add the "it works" feature later. In the meantime, since your MVP is so cheap and fast to build, you can try lots of different ones, add and remove advertised features, and see how that changes user responses. Once you have some input like that, you can make something slightly less minimal. Doing an MVP this way requires incredible self-control. Most people fail.

2. Speaking of terminology, "pivot" is misused too. People seem to think pivot is a happy-sounding word for "give up and do something different." But it's not. It has a very specific meaning based on very specific imagery. If you're running down the street, you have momentum. If you then plant one foot hard in the ground in front of you and turn, you can actually redirect that momentum in a new direction. *That* is what we mean by "pivot." When you give up and start over, you lose your position and all your momentum. But when you pivot, you keep all the stuff that's working, and you keep going from where you were before, but in a new direction. You have the same team, the same money, the same corporation, the same already-built features, and (hopefully) the same users as you did before. You use what you already have in order to head somewhere new. Most importantly, the energy lost during a pivot is proportional to the angle of your pivot. If you only rotate by a little, you only waste a bit of your momentum. If you turn around 180 degrees, then your progress so far is actually an impediment - like when you've gone way into debt working on one idea, then start to pursue a totally different one. Pivoting is the art of choosing small rotations that let you maintain most of your speed and take advantage of your current position, while still admitting you've been running in the wrong direction.

3. No startup ever actually does what they thought they would do on day 1. Everybody pivots. "Except [company x]," said one person, "They're doing exactly what they planned." "Are they profitable?" I asked. "No." "Oh, then they just haven't pivoted *yet*."

4. The definition of a market niche. This is one of the most important lessons I learned from reading "Crossing the Chasm." It has a somewhat complicated definition of a niche, but since then I've had a lot of luck just taking the gist, roughly: If you can name a conference attended by a particular group of people, that group is a market niche. If there isn't such a conference, it's almost certainly not a niche. For example, let's say you were making a web site to help people find a lawyer. "People looking for lawyers" is a market segment, right? Wrong. There's no "I'm looking for a lawyer" conference. Lawyers are probably a market segment (although arguably, not *all* types of lawyers go to the same conferences). But *everybody* needs a lawyer eventually, and that's not a niche, that's everybody. "Startups who need lawyers" (lots of startups need lawyers and go to the same conferences, e.g. StartupWeekend) are a market segment, as are building contractors and organized crime lords. Maybe you can help *them* find lawyers.

5. Your competition is whatever customers would do if you didn't exist. Let's say you're making software for producing cool graphs of statistical data. There's already really powerful software that does this, but nobody in your market segment uses it for some reason; maybe it's too hard to use or too expensive. That software is your competitor, right? Wrong! That software is irrelevant. Your customers don't want it, so even if it's competing with you, it's already lost. Your customers are probably using either Microsoft Excel's horrible chart features, or giving up and just not making charts at all. So your competitors are Microsoft and apathy, respectively. Apathy is probably going to be the tougher one. To find your list of competitors, just ask yourself what options your customers think they're choosing between. Ignore everything else.

Bonus: When presenting at a StartupWeekend-type conference... remember that the judges see a lot of businesses, and they're expecting you to have a business plan (or at least an idea of your target market and where you'll get revenue from). However, like I said in #3 above, no startup ever actually does what they originally set out to do. The judges all know that too. So your business plan is kind of a farce, and they know it, but if you don't have one, you look unprepared. So I suggest this: have a "grand scheme" and an "ideal first customer." Present them both, and where revenue comes from in both cases. Admit outright that your grand scheme will probably turn out to be wrong, and your real first customer might not be exactly like your ideal one. Basically, prove that you care about business, but you know you have to be flexible, and you're not scared of it. For a team two days into a new startup, that's all anyone can hope for.

November 8, 2011 10:03

apenwarr-on-gmail.com