port.py, portsh, and py-remoteexec

port.py, portsh, and py-remoteexec

Although I haven't been posting a lot here in the last few months, and my various open source projects have been a little quiet (sorry), I am certainly not idle. I've been working on lots of cool stuff, and some of it I think you'll actually get to see eventually.

In the meantime, here's some work that's based on some of my already-released work. Back in September 2010, I wrote about traffic shaping on a Sheevaplug and introduced my little toy script for talking to serial ports, port.py. It's like minicom, but it doesn't try to re-emulate vt100 (yikes!) and you can understand the source code. Since then I've improved it a bit, adding lockfile support and transmit rate limiting, and making it a bit more modular. Oh, and it supports sending BREAK signals now. I used minicom for years, but well, now I don't!

By rearranging those modules, a couple weeks ago I wrote portsh, a program for automatically running arbitrary commands on a remote machine via a serial port. Let's say you've got a Sheevaplug, or one of the many random embedded devices available nowadays, and you want to run some automated tests (for example) on the device. Part of the test is to reconfigure the network interfaces, and if you do that, an ssh session (for example) would be disconnected, so the most reliable way to control the test is via the serial port. Sadly, using a serial port has a lot of problems that don't matter for interactive use, but which matter a lot when trying to automate stuff:

It's slow (usually 115kbps or less).
It doesn't separate stdout and stderr.
It isn't 8-bit binary clean by default (for example, \r is translated automatically to \n in the default tty "cooked" mode).
When you do set it to a binary-clean tty mode, you get into trouble if your script ever aborts halfway and you want to restart it (ctrl-c is disabled!)
You can only have one session at a time.

portsh is designed to solve most of those problems. (Currently it doesn't really try to solve the "one session at a time" problem, but it's one step toward a possible solution.) How does it work?

Well, the idea is to start a program on the remote end of the connection that runs a non-binary protocol capable of carrying binary data. For these purposes, we use base64 encoding, which is a bit wasteful but well-defined. To offset some of the wasted bytes, we also gzip all the data before sending it through, so for some kinds of data (copies of log files, long 'ps' or file lists) the net result is faster than a raw serial port.

The portsh command line is intended to work like ssh with a command line:

   portsh ttyUSB0 ps ax

...which gives a clue about why the separation of stdout and stderr is so important. Programs like bup and sshuttle rely on that separation in order to work.

Now, as it happens, so far I haven't tried to run bup or sshuttle over a portsh connection (but I think it would work). I did, however, send binary data successfully:

   tar -cf - . | portsh ttyS0 'cd /tmp && mkdir -p foo && cd foo && tar -xf -'

(Yes, the command is just passed verbatim to system(), and thus you can't avoid the shell mangling its escape sequences. That's pretty bad for security/predictability - especially with filenames containing spaces - but I wanted to be really compatible with ssh, and sadly that's what ssh does.)

I've also successfully used David Anderson's py-remoteexec (that's Mercurial; see my py-remoteexec clone on github) through portsh to upload and run arbitrary python files on the remote end, which is where things can really get interesting.

Historical trivia: py-remoteexec is actually based on my upload yourself for fun and profit code from sshuttle, but cleaned up and generalized so it's easy to use in your own projects. Then I stole back the py-remoteexec 1st and 2nd stage assemblers, modifying them for non-binary-clean serial port behaviour, and that's what became portsh. So if you run py-remoteexec over portsh, you're actually running two levels of python remote script assembly, both of which are derived from sshuttle.

Confused yet? Don't fret. Exactly how it works isn't that important, unless you're amused by such things, in which case you're best to just view the portsh.py source. It's pretty readable, if you're crazy. But if you, like most people, don't care how it works, all you need to know is you clone my repository, and run portsh, and it gives you something like ssh-over-serial-port semantics. And because it uploads itself, all you need on the remote machine is python - you don't need to install any other tools.

Enjoy!

2012-07-06 »