This post got delayed a bit due to a few unexpected complications. First, it turns out that you cannot connect GRE tunnels in Amazon’s EC2 over the instances’ private addresses; you must use the public addresses. Second, quantal removed the openvswitch-datapath-dkms package because the openvswitch kernel module is now available upstream. However, the upstream openvswitch module does not yet provide GRE tunnels configurable through the ovs database. The openvswitch-datapath-dkms package will hopefully be reintroduced soon; meanwhile, we will use it from the inestimable James Page’s “junk” ppa.
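For reference, the db-configurable GRE setup that the dkms module enables looks roughly like this (the bridge name, port name, and remote address below are placeholders; on EC2 the remote_ip must be the peer instance’s public address):

```shell
# Create an openvswitch bridge and attach a GRE port through the ovs
# database. br0, gre0 and 203.0.113.10 are placeholder values.
ovs-vsctl add-br br0
ovs-vsctl add-port br0 gre0 -- set interface gre0 type=gre \
    options:remote_ip=203.0.113.10
```

Each slave would run something like this pointing at the master’s public address, and the master would hold one such port per slave.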
Oh, but first things second. What are we doing today? We’re going to use juju to fire off a set of lxc compute nodes, pre-populated with LVM-backed pristine containers which can be cloned very quickly, and which will be able to communicate over a private openvswitch network no matter which compute node hosts them.
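For concreteness, cloning from an LVM-backed pristine container might look like this (the container names are invented, and -s asks lxc-clone for a snapshot copy rather than a full one):

```shell
# Snapshot-clone a pristine LVM-backed container and start the copy.
# quantal-base and work1 are placeholder names.
lxc-clone -s -o quantal-base -n work1
lxc-start -n work1 -d
```

Because the clone is an LVM snapshot, it takes seconds regardless of how large the pristine container is.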
My use case for this is to set up for a long, varied bug triage and replication session. Initial setup takes about 10-20 minutes (much longer on Amazon, though configuring a local mirror in /etc/default/lxc should speed that up there), after which starting a new container takes about 3 seconds.
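Pointing the lxc ubuntu template at a closer mirror is a one-line change in /etc/default/lxc (the URL below is only an example; substitute whatever mirror is near your instances):

```shell
# /etc/default/lxc -- example mirror setting used by the ubuntu template
MIRROR="http://us.archive.ubuntu.com/ubuntu"
```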
There are two bzr trees involved. The actual juju charm is at lp:~serge-hallyn/charms/quantal/ovs-lxc/trunk. It relates one master compute node to any number of slave nodes. The master node will be used just like the slaves, but is set apart as the central openvswitch hub: every slave has a GRE tunnel to the master, and slaves talk to each other over two GRE links (through the master). (You’ll want to check this out under ~/charms/quantal, i.e. “mkdir -p ~/charms/quantal; cd ~/charms/quantal; bzr branch lp:~serge-hallyn/charms/quantal/ovs-lxc/trunk ovs-lxc”.)
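Done by hand, deploying the charm and relating a couple of slaves to the master might look something like this (the service names and unit counts are illustrative, and the exact relation endpoints depend on the charm’s metadata):

```shell
# Deploy the same local charm as a master service and a slave service,
# then relate them. Service names here are invented examples.
juju bootstrap
juju deploy --repository ~/charms local:quantal/ovs-lxc ovs-lxc-master
juju deploy --repository ~/charms local:quantal/ovs-lxc ovs-lxc-slave
juju add-unit ovs-lxc-slave
juju add-relation ovs-lxc-master ovs-lxc-slave
```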
The other bzr tree is lp:~serge-hallyn/+junk/jujulxcscripts. The first script here is ‘juju-deploy-lxc’, which accepts the number of slaves to start, bootstraps juju, deploys the nodes, and relates each slave to the master. It finally runs ‘grabnodes’, which gathers information used by the other scripts.
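A typical session might then start like this (I’m assuming the slave count is the script’s only argument, as described above):

```shell
# Fetch the scripts and bring up a master plus three slaves.
bzr branch lp:~serge-hallyn/+junk/jujulxcscripts
cd jujulxcscripts
./juju-deploy-lxc 3
```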
Next, ‘startcontainer’ clones and starts a new container, rotating round-robin among the master and slaves on each invocation. With no arguments it starts an amd64 quantal container. It can also be called as
startcontainer quantal i386
for the obvious result.
Finally, ‘sshcontainer (n)’ will ssh into the (n)th container you’ve started, counting from 0. The scripts don’t get too fancy or try to do too much; if you want much more, you might actually want to deploy openstack 🙂
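The round-robin behavior of ‘startcontainer’ can be sketched in a few lines of shell (the node list and counter file here are invented stand-ins for whatever ‘grabnodes’ actually records):

```shell
# Sketch of round-robin node selection: each invocation picks the next
# node in the list, tracked by a simple counter file.
NODES="master slave1 slave2"        # placeholder node names
STATE=./container_count             # placeholder state file
COUNT=$(cat "$STATE" 2>/dev/null || echo 0)
set -- $NODES                       # word-split the list into $1..$N
IDX=$((COUNT % $#))                 # wrap around when all nodes used
shift "$IDX"
NODE=$1
echo $((COUNT + 1)) > "$STATE"      # remember for the next invocation
echo "next container will be cloned on: $NODE"
```

On the first run this picks the master, then slave1, then slave2, then the master again, and so on.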
I do hope at some point to expand this to use a (juju-deployed) ceph cluster for the container backing store. It is not as flexible as it ought to be: it expects /dev/vdb or /dev/xvdb to be a spare drive mounted on /mnt at instance startup. But that is good enough to work for me on Amazon EC2 as well as an openstack-based cloud, which is all I need to make this useful for myself.
It won’t work by default on a local (lxc-backed) juju config, but I will play with that as an exercise to investigate what sorts of site customizations we should support in juju-lxc. In particular, we’ll need to be able to (a) use lxc mount hooks (so cgroups can be mounted in the container) and (b) apply custom apparmor profiles.