Synergy

30 july 2009

Concept

I bet anyone who has used more than one machine at a time has faced the problem of syncing settings/data between 'em. What I've missed most is IM and browser history, contacts, bookmarks and settings.

Available solutions seemed quite scarce and incomplete: sure, some firefox plugin can sync its stuff, opera has a built-in service for that, but what about other software? Besides, it seems quite impractical to have a syncer for every app (which usually doesn't sync all the stuff, anyway) when all you actually need is an rsync of a few chosen paths over ssh.

Another problem is that you forget to sync, resulting in bits of history or other data written here and there. These usually can't be synchronised by conventional means like the aforementioned rsync, since they're written to a log-file, sqlite db or even some binary format, so even a VCS won't be of any use for merging this stuff.

But, there are bright sides. *nix apps write their stuff in nice and compact form into home, without littering all over fs, polluting some registry etc, like old windoze apps seem to do, so all you really need to sync that stuff is to transfer one single path, and with always-available ssh that's a piece of cake. And the not-to-forget issue can be solved by mangling the apps' binaries - so when you type in "firefox" you get a sync, then firefox.

Operation

Python, as always. Configuration is YAML:

nodes:
 - coercion
 - sacrilege

path_ext: ~/bin

control:
 firefox: ~/.mozilla
 opera: ~/.opera
 gajim: ~/.gajim
 claws:
  bin: claws-mail
  paths:
   - ~/.clawz
   - ~/.claws-mail
 audacious:
  bin: audacious2
  paths: ~/.config/audacious
 secure:
  bin:
  paths:
   - ~/media/secure
   - ~/.signature

Here nodes are hostnames of all the synced machines involved.

path_ext is a list (or a single string, as in my example) of extensions to the PATH variable. I found it necessary for non-interactive ssh logins, which fail to apply the default shell rc-settings (dunno why). Besides, PATH breaks far too often, so it's easier to specify it explicitly than to depend on a shell invocation.
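
A minimal sketch of how path_ext could be applied, assuming a modern Python 3 and that the extra entries are appended to (not prepended to) the existing PATH. The function name extend_path is hypothetical, not taken from the script:

```python
import os


def extend_path(path_ext, environ=None):
    """Return a PATH string with the path_ext entries appended.

    path_ext may be a single string or a list of strings, as in the config.
    """
    environ = os.environ if environ is None else environ
    if isinstance(path_ext, str):
        path_ext = [path_ext]
    current = [p for p in environ.get("PATH", "").split(os.pathsep) if p]
    for entry in path_ext:
        entry = os.path.expanduser(entry)  # "~/bin" -> "/home/user/bin"
        if entry not in current:  # assumption: skip entries already present
            current.append(entry)
    return os.pathsep.join(current)
```

The result would then be assigned to os.environ["PATH"] before any ssh or rsync child process is started.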

And then there's the control section, with all the synchronized apps. Each app can be specified either by a name:path pair (like firefox, opera or gajim here) or by a full-fledged mapping with a binary (bin) and a list of to-be-rsync'ed paths (paths). If the short syntax is used, the binary is located via PATH (by a "which <binary>" invocation), so it's usually sufficient for simple apps. If bin is explicitly specified as empty (like in the secure section), no binaries will be faked for these paths.
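
The two syntaxes can be folded into one shape right after the YAML is loaded. The sketch below works on the already-parsed control mapping (a plain dict, as PyYAML would return it); normalize_control and resolve_binary are hypothetical names, and treating a mapping without a bin key like the short syntax is my assumption, since the text only covers the explicitly-empty case:

```python
import shutil


def normalize_control(control):
    """Map each app name to a (bin, paths) pair.

    bin is None when the config explicitly leaves it empty, meaning
    no binary should be faked for those paths.
    """
    apps = {}
    for name, spec in control.items():
        if isinstance(spec, dict):
            # Assumption: a missing "bin" key falls back to the app name,
            # while an explicitly empty one ("bin:") disables faking.
            binary = spec.get("bin", name) or None
            paths = spec.get("paths") or []
        else:
            # Short syntax, "name: path": the binary is named after the app
            binary, paths = name, spec
        if isinstance(paths, str):
            paths = [paths]
        apps[name] = (binary, list(paths))
    return apps


def resolve_binary(name):
    """Locate a binary in PATH, like a "which <binary>" call (None if absent)."""
    return shutil.which(name)
```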

Usual mode of operation is fairly simple:

  • First of all, invocation: the script can be started either by its real path or via any binary from the control section, by means of a symlink in place of that binary (like /usr/bin/firefox). That determines how the script will act in the end - either it'll just exit or exec the original binary (starting firefox, in my example).
  • Then the app checks its status, which can be one of the following:
    • master - this node has the latest data, all the binaries are real (e.g. /usr/bin/firefox is not the same file as the realpath of sys.argv[0]). Note that the status is actually determined by checking whether all the binaries' paths are the same file as the script (being its symlinks) or none of them are. There can be only one master.
    • minion - any normal non-master node with faked binaries, for example (with my configuration): /usr/bin/firefox is a symlink to this script. Note that to be considered a minion, a node must have all the specified binaries faked.
    • undefined - that's the state when some of the binaries specified in the control section are faked and some are not. The script won't perform any sync-ops while any node, including the local one, is in this state. It shouldn't happen at all, but sometimes does, when a software update wipes out the symlinks, for example, so user intervention is necessary to explicitly specify the master node (actually by marking all the minion nodes as such).
    • conflict - one of the specified apps is running on the node, so its data might be in an inconsistent state. It means pretty much the same thing as undefined, might happen on the master node and can be fixed by a simple pkill (although firefox would consider it a crash).
  • After that, it will ssh-link to every other node in the list (self is determined by hostname), launching synergy -s on the remote end, getting the node status to determine which one is master and ensure that the rest are minions.
  • Then it's simple rsync over ssh (few of them in parallel, actually) of all the specified paths from master (invocation is hard-coded as "rsync -HaAxXz --delete"), followed by replacing all the binaries' symlinks with real ones (backed-up with ".syn_bak" suffix), making it a master node.
  • One more ssh to the master node, with synergy -l command, which instructs it to fake all the binaries, effectively making it a minion node.
  • And the start of the real app, if the script was invoked via a symlink in place of one of the binaries. Something like firefox appears... ta da! ;)
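
The status check from the list above can be sketched roughly like this, assuming a modern Python 3. The function names are hypothetical, and checking for a running app before looking at the symlinks is my assumption (the text doesn't say which check comes first):

```python
import os


def is_faked(binary, script):
    """True if binary resolves to the same file as the sync script."""
    return os.path.exists(binary) and os.path.samefile(binary, script)


def node_status(binaries, script, app_running=False):
    """Classify the local node from the state of its controlled binaries."""
    if app_running:
        return "conflict"
    faked = [is_faked(binary, script) for binary in binaries]
    if all(faked):
        return "minion"
    if not any(faked):
        return "master"
    return "undefined"  # some faked, some not: needs user intervention


def find_master(statuses):
    """Given {node: status}, return the single master or raise ValueError."""
    bad = sorted(n for n, s in statuses.items() if s not in ("master", "minion"))
    if bad:
        raise ValueError("nodes need attention: " + ", ".join(bad))
    masters = sorted(n for n, s in statuses.items() if s == "master")
    if len(masters) != 1:
        raise ValueError("expected exactly one master, got %d" % len(masters))
    return masters[0]
```

Here statuses would be filled with the local node_status() result plus whatever each remote "synergy -s" call reports.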

Even with wireless link, it's usually fast, since rsync is damn good at transferring only the changed bits, and that with compression as well.
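
For illustration, here's roughly how the rsync step could be assembled. Only the "-HaAxXz --delete" flags come from the script; the rest is an assumption: paths are expected to start with "~/", the remote side is addressed relative to the login directory (where rsync resolves relative remote paths), and the helper names are hypothetical:

```python
import os
import subprocess

RSYNC_FLAGS = ["-HaAxXz", "--delete"]  # invocation quoted in the text


def rsync_command(master, path):
    """Build an rsync-over-ssh command pulling one synced path from master."""
    if not path.startswith("~/"):
        raise ValueError("sketch only handles paths under home: %r" % path)
    remote = path[2:].rstrip("/")  # relative to the remote login directory
    local = os.path.expanduser(path).rstrip("/")
    # No trailing slash on the source: rsync recreates the file or directory
    # itself inside its local parent, so both kinds of path work.
    dest_dir = os.path.dirname(local) + "/"
    return ["rsync"] + RSYNC_FLAGS + ["-e", "ssh", "%s:%s" % (master, remote), dest_dir]


def run_parallel(commands):
    """Start all commands at once and return their exit codes in order."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]
```

Something like run_parallel([rsync_command("coercion", p) for p in paths]) would then pull everything from the master at once.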

Ssh might require a password though, so I use the nice all-around pinentry tool to prompt for it. Pinentry can do that with a GTK, Qt3/4 or curses-based user-friendly interface, depending on how it's built and whether X is running or not. Of course, the password for each node is cached after the first prompt 'till the end of the process.
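
Pinentry speaks a simple line-based protocol (Assuan) on stdin/stdout: the client sends SETDESC, SETPROMPT and GETPIN, and the password comes back on a "D ..." line followed by "OK". A rough sketch of the prompt-and-cache logic, assuming a modern Python 3 and a "pinentry" binary in PATH; the function names and the cache dict are hypothetical, and the real script drives its subprocesses through pexpect instead:

```python
import subprocess
from urllib.parse import quote, unquote


def parse_getpin_reply(lines):
    """Extract the password from pinentry's reply to GETPIN.

    Returns None if the prompt was cancelled or no data line was sent.
    """
    data = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("D "):
            data.append(unquote(line[2:]))  # data lines are percent-escaped
        elif line.startswith("ERR"):
            return None
        elif line.startswith("OK"):
            break
    return "".join(data) if data else None


def ask_password(node, cache, binary="pinentry"):
    """Prompt for a node's ssh password once, then serve it from cache."""
    if node in cache:
        return cache[node]
    proc = subprocess.Popen([binary], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, universal_newlines=True)

    def command(text):
        # Send one command and collect reply lines up to the final OK/ERR
        proc.stdin.write(text + "\n")
        proc.stdin.flush()
        reply = []
        while True:
            line = proc.stdout.readline()
            if not line:
                break
            reply.append(line)
            if line.startswith(("OK", "ERR")):
                break
        return reply

    proc.stdout.readline()  # skip the initial "OK ..." greeting
    command("SETDESC " + quote("Password for %s" % node, safe=" "))
    command("SETPROMPT Password:")
    password = parse_getpin_reply(command("GETPIN"))
    proc.stdin.close()
    proc.wait()
    if password is not None:
        cache[node] = password
    return password
```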

Another thing with ssh is the hosts' public key check. It shouldn't fail, or synergy will crash with a warning about it rather than send a password to a faked host, which also means that all the nodes' public keys should be known to the local ssh.
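
One way to get the same "never talk to an unknown host" behaviour is OpenSSH's StrictHostKeyChecking=yes option, which refuses hosts with missing or changed keys. This is a swapped-in technique for illustration, not necessarily what the script does (it watches ssh output through pexpect). The helper names are hypothetical, and the known_hosts parser below deliberately ignores hashed entries, which can't be read back:

```python
def remote_status_command(node):
    """ssh command asking a node for its status, refusing unknown host keys."""
    return ["ssh", "-o", "StrictHostKeyChecking=yes", node, "synergy", "-s"]


def known_hostnames(known_hosts_text):
    """Collect plain hostnames from the contents of a known_hosts file."""
    names = set()
    for line in known_hosts_text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        if fields[0].startswith("@"):  # @cert-authority / @revoked marker
            fields = fields[1:]
        if not fields or fields[0].startswith("|"):  # hashed entry, skip
            continue
        names.update(fields[0].split(","))
    return names


def unknown_nodes(nodes, known_hosts_text):
    """Nodes from the config whose keys aren't listed in known_hosts."""
    known = known_hostnames(known_hosts_text)
    return [node for node in nodes if node not in known]
```

Running unknown_nodes() over the nodes list before the first sync gives an early hint about which hosts still need their keys accepted by hand.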

In case of bogus half-faked state, use --lock option to fake all non-faked binaries. Note that the script will never remove any backed-up binary that might be already in place.
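
What --lock has to do can be sketched like this, assuming a modern Python 3. The ".syn_bak" suffix is the one mentioned above; the function names are hypothetical, and refusing to proceed when a backup already exists is just one way to honour the never-remove-a-backup rule:

```python
import os

BACKUP_SUFFIX = ".syn_bak"


def fake_binary(binary, script):
    """Replace binary with a symlink to script, keeping the original as backup.

    Returns True if the binary was faked now, False if it already was.
    """
    if os.path.exists(binary) and os.path.samefile(binary, script):
        return False
    backup = binary + BACKUP_SUFFIX
    if os.path.lexists(backup):
        # Assumption: bail out instead of overwriting an existing backup
        raise FileExistsError("backup already exists: " + backup)
    os.rename(binary, backup)
    os.symlink(os.path.abspath(script), binary)
    return True


def lock(binaries, script):
    """Fake every binary that isn't faked yet; return the ones changed."""
    return [binary for binary in binaries if fake_binary(binary, script)]
```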

And, obviously, you need either permissions to access the binaries' paths or a suid bit (which is quite the same thing) to fake 'em, so I use the script with a python suid-wrapper. It should be okay with all the checks the wrapper performs, but I plan to improve it with POSIX Capabilities someday.

And you can make the app give the full report on every action with --debug flag.

Other (useful) flags are (--help output):

Usage: synergy [options]

Sync application paths between several nodes

Options:
  -h, --help     show this help message and exit
  -l, --lock     lock current node
  -s, --status   get node status (master / minion)
  -q, --query    show status of all nodes
  -p, --prepare  pull changes from master, but dont become one
  --debug        print lots of debug info
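
For reference, an option set like the one above could be declared with the stdlib optparse module roughly as follows. This is a reconstruction from the help text only: the option names and descriptions are copied from it, everything else (build_parser, the store_true actions) is assumed:

```python
from optparse import OptionParser


def build_parser():
    """Declare the command-line options listed in the --help output."""
    parser = OptionParser(
        prog="synergy", usage="%prog [options]",
        description="Sync application paths between several nodes")
    flags = [
        ("-l", "--lock", "lock current node"),
        ("-s", "--status", "get node status (master / minion)"),
        ("-q", "--query", "show status of all nodes"),
        ("-p", "--prepare", "pull changes from master, but dont become one"),
    ]
    for short, long_name, text in flags:
        parser.add_option(short, long_name, action="store_true",
                          default=False, help=text)
    parser.add_option("--debug", action="store_true", default=False,
                      help="print lots of debug info")
    return parser
```

Calling build_parser().parse_args() returns an (options, args) pair, with each flag available as a boolean attribute such as options.lock.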
		

And with that, I'm a happy user of two fully-synchronized laptops, never having to worry about this stuff, which is great (trust me on that).

SH rewrite

At some later point I felt that the above script is a bit too complicated for the task and does some unnecessary work, so I rewrote the thing in sh with configuration embedded right into script:

## Sync with:
# coercion

## Synced paths:
# firefox: ~/.mozilla
# opera: ~/.opera
# claws-mail: ~/.clawz ~/.claws-mail
# skype: ~/.Skype
# misc: ~/media/secure ~/.signature
		

It's really slim, simple and does all the same tricks and checks, check it out here.

Links

Code

Deps

  • python - I use 2.6, but it should be 3.1-compatible; haven't tested with older ones
  • pexpect - used to interact with ssh and rsync
  • psutil - to check whether stuff is running (for conflict state)
  • PyYAML - great format to store configuration data
  • pinentry - great environment-friendly tool to prompt for password; usually comes with GnuPG
  • rsync - used to efficiently transfer changes between nodes
  • OpenSSH - transport layer for both script data and rsync