I’ve mentioned user namespaces here before, and shown how to play a bit with them. When a task is cloned into a new user namespace, the uids in the namespace can be mapped (1-1, in blocks) to uids on the host – for instance uid 0 in the container could be uid 100000 on the host. The uids are translated at the kernel-userspace boundary (i.e. stat, etc), and capabilities for a namespaced task are only valid against objects owned by that namespace. The result is that root in a container is unprivileged on the host.
Eric has been making great progress in moving the kernel functionality upstream. With the newest 3.7 based ubuntu kernel, plus a few of his not yet merged patches, a milestone has been reached – it’s now possible to run a full ubuntu container in a user namespace!
First start up a fresh, uptodate quantal vm or instance. Install my user namespace ppa, install the kernel and nsexec packages from there, create a container, and convert it to be namespaced:
sudo add-apt-repository ppa:serge-hallyn/userns-natty
sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get install linux-image-3.7.0-0-generic nsexec lxc
sudo lxc-create -t ubuntu -n q1
sudo container-userns-convert q1 100000
The ‘container-userns-convert’ script just shifts the user and group ids of file owners in the container rootfs, and adds two lines to the container configuration file to tell lxc to clone the new user namespace and set up the uid/gid mappings.
Now you can start the container,
sudo lxc-start -n q1 -d
sudo lxc-console -n q1
Look around the container, sudo bash; notice that it looks like a normal system, with ubuntu as uid 1000, root as uid 0. But look from the host, and you see root tasks in the container are actually running as uid 100000, and ubuntu ones as uid 100000.
There are a few oddnesses (you can sudo on ttys 1-4, but sometimes it fails on /dev/console, and shutdown in the container does not kill init); the lxc package needs a few more changes (the cgroup setup needs to be moved to the container parent); and plenty of things are not yet allowed by the kernel (mounting an ext4 filesystem).
But this is a full Ubuntu image, confined by a private user namespace!
After working out some kinks, we’ll next want to look into container startup by unprivileged users.