Genoci and Lpack

Introduction

I’ve been working on a pair of tools for manipulating OCI images:

  • genoci, for GENerating OCI images, builds images according to a recipe in yaml format.
  • lpack, the layer unpacker, unpacks an OCI image’s layers onto either btrfs subvolumes or thinpool LVs.

See each tool’s README.md for more detailed usage.

The two can be used together to speed up genoci’s builds by reducing the number of root filesystem unpacks and repacks. (See genoci’s README.md for details)

Example

While the projects’ READMEs give examples, here is a somewhat silly one just to give an idea. Copy the following into recipe.yaml:

cirros:
  base: empty
  expand: https://download.cirros-cloud.net/0.3.5/cirros-0.3.5-i386-lxc.tar.gz
weird:
  base: cirros
  pre: mount -t proc proc %ROOT%/proc
  post: umount %ROOT%/proc
  run: ps -ef > /processlist
  run: |
    cat > /usr/bin/startup << EOF
    #!/bin/sh
    echo "Starting up"
    nc -l -4 9999
    EOF
    chmod 755 /usr/bin/startup
  entrypoint: /usr/bin/startup

Then run “./genoci recipe.yaml”. You should end up with a directory “oci”, which you can interrogate with

$ umoci ls --layout oci
empty
cirros
cirros-2017-11-13_1
weird
weird-2017-11-13_1

You can unpack one of the containers with:

$ umoci unpack --image oci:weird
$ ls -l weird/rootfs/usr/bin/startup
-rwxr-xr-x 1 root root 43 Nov 13 04:27 weird/rootfs/usr/bin/startup

Upcoming

I’m about to begin the work to replace both with a single tool, written in golang, and based on an API exported by umoci.

Disclaimer

The opinions expressed in this blog are my own views and not those of Cisco.


CNI for LXC

It’s now possible to use CNI (Container Network Interface) with lxc. Here is an example. This requires some recent upstream patches, so for simplicity let’s use the lxc packages for zesty in ppa:serge-hallyn/atom. Set up a zesty host with that ppa, i.e.

sudo add-apt-repository ppa:serge-hallyn/atom
sudo add-apt-repository ppa:projectatomic/ppa
sudo apt update
sudo apt -y install lxc1 skopeo skopeo-containers jq

(To run the oci template below, you’ll also need to install umoci from git://github.com/openSUSE/umoci. Alternatively, you can use any standard container; the oci template is not strictly needed, it just makes a nice point.)

Next, set up the CNI configuration, i.e.

cat << EOF | sudo tee /etc/lxc/simplebridge.cni
{
  "cniVersion": "0.3.1",
  "name": "simplenet",
  "type": "bridge",
  "bridge": "cnibr0",
  "isDefaultGateway": true,
  "forceAddress": false,
  "ipMasq": true,
  "hairpinMode": true,
  "ipam": {
    "type": "host-local",
    "subnet": "10.10.0.0/16"
  }
}
EOF

The way lxc will use CNI is to call out to it using a start-host hook, that is, a program (hook) which is called in the host namespaces right before the container starts. We create the hook using:

cat << 'EOF' | sudo tee /usr/share/lxc/hooks/cni
#!/bin/sh

CNIPATH=/usr/share/cni

CNI_COMMAND=ADD CNI_CONTAINERID=${LXC_NAME} CNI_NETNS=/proc/${LXC_PID}/ns/net CNI_IFNAME=eth0 CNI_PATH=${CNIPATH} ${CNIPATH}/bridge < /etc/lxc/simplebridge.cni
EOF
sudo chmod 755 /usr/share/lxc/hooks/cni

This tells the ‘bridge’ CNI program our container name and the network namespace in which the container is running, and sends it the contents of the configuration file which we wrote above.
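To make that contract a bit more explicit, here is a minimal sketch of a helper that prints the environment the hook passes to the plugin. The function name cni_env is hypothetical and not part of lxc or CNI; the variable names and the ADD/DEL commands come from the CNI specification, and /usr/share/cni is the same plugin directory assumed in the hook above.

```shell
# cni_env COMMAND NAME PID (hypothetical helper, for illustration only)
# Prints the environment variables a CNI plugin reads, one per line.
cni_env() {
    command=$1   # ADD or DEL
    name=$2      # container name (LXC_NAME inside the hook)
    pid=$3       # container init pid (LXC_PID inside the hook)
    cat << EOF
CNI_COMMAND=$command
CNI_CONTAINERID=$name
CNI_NETNS=/proc/$pid/ns/net
CNI_IFNAME=eth0
CNI_PATH=/usr/share/cni
EOF
}
```

Something like `env $(cni_env ADD "$LXC_NAME" "$LXC_PID") /usr/share/cni/bridge < /etc/lxc/simplebridge.cni` should then behave like the one-line hook, as long as the container name contains no whitespace.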

Now create a container,

sudo lxc-create -t oci -n a1 -- -u docker://alpine

We need to edit the container configuration file, telling it to use our new hook,

sudo sed -i '/^lxc.net/d' /var/lib/lxc/a1/config
cat << EOF | sudo tee -a /var/lib/lxc/a1/config
lxc.net.0.type = empty
lxc.hook.start-host = /usr/share/lxc/hooks/cni
EOF

Now we’re ready! Just start the container with

sudo lxc-execute -n a1

and you’ll get a shell in the alpine container with networking configured.


Namespaced File Capabilities

As of this past week, namespaced file capabilities are available in the upstream kernel. (Thanks to Eric Biederman for many review cycles and for the final pull request)

TL;DR

Some packages install binaries with file capabilities, and fail to install if you cannot set the file capabilities. Such packages could not be installed from inside a user namespace. With this feature, that problem is fixed.

Yay!

What are they?

POSIX capabilities are pieces of root’s privilege which can be individually used.

File capabilities are POSIX capability sets attached to files. When files with associated capabilities are executed, the resulting task may end up with privilege even if the calling user was unprivileged.

What’s the problem

In single-user-namespace days, POSIX capabilities were completely orthogonal to userids. You can be a non-root user with CAP_SYS_ADMIN, for instance. This can happen by starting as root, setting PR_SET_KEEPCAPS through prctl(2), dropping the capabilities you don’t want, and changing your uid. Or it can happen by a non-root user executing a file with file capabilities. In order to attach such a capability to a file, you need the CAP_SETFCAP capability.
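Whether a task actually holds a given capability can be read from its effective capability mask in /proc. Here is a minimal sketch, assuming Linux and a shell with 64-bit arithmetic. The helper names has_cap and effective_caps are hypothetical; the bit numbers 31 (CAP_SETFCAP) and 21 (CAP_SYS_ADMIN) are the values from linux/capability.h.

```shell
# has_cap MASK BIT: succeed if bit BIT is set in the hex capability mask MASK.
has_cap() {
    mask=$1
    bit=$2
    [ $(( (0x$mask >> bit) & 1 )) -eq 1 ]
}

# effective_caps [PID]: print the CapEff mask of a task (default: this shell).
effective_caps() {
    awk '$1 == "CapEff:" { print $2 }' "/proc/${1:-$$}/status"
}
```

For example, `has_cap "$(effective_caps)" 31 && echo "CAP_SETFCAP is in the effective set"` checks the current shell. Running it as an ordinary user and again as root shows the difference.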

User namespaces had several requirements, including:

  1. an unprivileged user should be able to create a user namespace
  2. root in a user namespace should be privileged against its resources
  3. root in a user namespace should be unprivileged against any resources which it does not own.

So in a post-user-namespace age, an unprivileged user can “have privilege” with respect to files they own. However, if we allowed them to write a file capability on one of their files, then they could execute that file as an unprivileged user on the host, thereby gaining that privilege there. This would violate the third user namespace requirement, and is therefore not allowed.

Unfortunately – and fortunately – some software wants to be installed with file capabilities. On the one hand that is great, but on the other hand, if the package installer isn’t able to handle the failure to set file capabilities, then package installs are broken. This was the case for some common packages – for instance httpd on centos.

With namespaced file capabilities, file capabilities continue to be orthogonal with respect to userids mapped into the namespace. However, the capabilities are tagged as belonging to the host uid mapped to the container’s root id (0). (If uid 0 is not mapped, then file capabilities cannot be assigned.) This prevents the namespace owner from gaining privilege in a namespace against which they should not be privileged.
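The host uid in question can be looked up from a task’s uid map. Here is a minimal sketch, assuming the three-column uid_map format described in user_namespaces(7) (id inside the namespace, id outside, range length) and that the file is read from the host. The helper name host_uid_of_root is hypothetical.

```shell
# host_uid_of_root [FILE]: print the id that namespace uid 0 maps to.
# Fails (non-zero exit, no output) when uid 0 is not mapped at all.
host_uid_of_root() {
    awk '$1 == 0 { print $2; found = 1; exit }
         END { exit !found }' "${1:-/proc/self/uid_map}"
}
```

Pointing it at /proc/PID/uid_map for a container’s init task, from the host, gives the uid that namespaced file capabilities written in that container would be tagged with. A failure means uid 0 is unmapped, which is the case where file capabilities cannot be assigned.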

 


Containers micro-conference

The deadline for the CFP for the containers microconference at Linux Plumbers is coming up next week. See https://discuss.linuxcontainers.org/t/containers-micro-conference-at-linux-plumbers-2017/262 for more information.


Outdoors laptop

I like to work outside, at a park, on the beach, etc. For years I’ve made do with regular laptops, but all those years I’ve really wanted an e-ink laptop to avoid the squinting and the headaches and the search for shade. The pixel-qi displays raised my hopes, but those were quickly dashed when they closed their doors. For a brief time there were two e-ink laptops for sale. They were quite underpowered and expensive, but more importantly they’re no longer around.

Maybe it’s time to build one. There are many ways one could go about it:

  • Get a toughbook with a transflective display
  • Get a rooted nook and run vncclient connected to a server on my laptop or in a vm
  • Get a dasung e-ink monitor connected to my laptop. Not cheap, and dubious linux support.
  • Actually it seems an external pixel-qi display may be available right now. Still pretty steep price.
  • Attach a keyboard to a nook and use that standalone
  • Get a used pixelqi, put it in some sort of case, and hook it up as a separate display
  • Get a small e-ink (2″) display, hook it up to a rpi or beaglebone black
  • Get a used pixelqi display and install it in something like a used lenovo s10-3
  • Get a freewrite and hack it to be an ssh terminal. Freewrite themselves don’t like that idea.
  • Get a used OLPC with pixelqi display.

So is there anyone in the community with similar goals? What are you using? How’s it working for you?


Whither cgmanager

A few years ago we started the cgmanager project to address two issues:

  1. Simplify code in callers.
  2. Support safe delegation of cgroups to nested container managers.

Historically, advice on where and how to mount cgroups was varied. As a result, programs which wanted to manipulate cgroups had quite a bit of work to do to find cgroup mountpoints for specific controllers and calculate the full paths to use. Cgmanager simplified this code by doing that work for callers. The ‘cgm’ command made the same task much simpler for users. Today, with the advent of the cgroup-lite package, now the cgroupfs-mount package, and systemd, all of which agree on mounting each controller separately under /sys/fs/cgroup/$controller, container manager code can be greatly simplified. This can be seen by comparing the older ‘cgfs’ cgroup driver in lxc to the newer ‘cgfsng’ cgroup driver, which benefits from assuming the simpler layout (falling back to the more complicated driver if needed).
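To give an idea of how little work is left once that layout can be assumed, here is a minimal sketch that computes a task’s cgroup directory for one controller. It assumes the /sys/fs/cgroup/$controller layout described above and the "id:controllers:path" line format of /proc/PID/cgroup from cgroups(7). The helper name cgroup_dir is hypothetical and this is not how cgfsng itself is written.

```shell
# cgroup_dir CONTROLLER [FILE]: print the cgroup directory for CONTROLLER,
# reading membership from FILE (default: /proc/self/cgroup).
# Fails when the controller is not listed.
cgroup_dir() {
    awk -F: -v want="$1" '
        {
            # Field 2 may list several co-mounted controllers: cpu,cpuacct
            n = split($2, names, ",")
            for (i = 1; i <= n; i++) {
                if (names[i] == want) {
                    # Everything after the second colon is the cgroup path.
                    path = substr($0, length($1) + length($2) + 3)
                    print "/sys/fs/cgroup/" want path
                    found = 1
                    exit
                }
            }
        }
        END { exit !found }
    ' "${2:-/proc/self/cgroup}"
}
```

For example, `cgroup_dir memory` prints the memory cgroup directory of the calling shell on a host using this layout.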

The other core motivating factor for cgmanager, safer support of cgroup delegation, is now – at last! – made obsolete by the availability of cgroup namespaces.

A few lessons?

When starting actual coding for cgmanager, one open question was what kind of communication it should support. We decided on using a dbus interface, implemented using libnih-dbus. I had strong reservations about that. In retrospect, it worked better at the time than I had expected, but had some severe problems. First, performance was pretty horrific. Every dbus connection required 4 round trips to get started. For the simple upstart based systems of the time this was ok, and we kept it from becoming worse by having a cgmanager proxy in a multiply-nested container talk straight to the host cgmanager. However, as Ubuntu switched to systemd, which made very heavy use of cgroups, we started seeing huge performance impacts. It also failed to satisfy the requirements of Google folks who might otherwise have been more interested.

Secondly, as Ubuntu switched from upstart to systemd, and upstart – and libnih – became unsupported, this started affecting cgmanager.

So, if cgmanager were still needed today, I would strongly consider rewriting it to have a simple interface over a unix socket. However, as described above, cgmanager has happily become unnecessary. A different sort of cgroup manager may in fact be needed – a modern day ‘libcgroup’ for simple administration of service resources on systems without systemd. However, that is not cgmanager’s role.

So, with that, the cgmanager project is effectively closed. It will continue to be supported by the lxc project (us) on legacy systems. But on the whole, migrating systems to a kernel supporting cgroup namespaces (or at least lxcfs-provided cgroup filesystems) is highly recommended.

PS

By the same token, the cgroup part of lxcfs is also effectively deprecated. I will probably move it under a deprecation build flag soon. lxcfs will continue to be used for procfs virtualization, and likely expanded to support some /sys information.


Two worthwhile books

This past summer, we went to the local bookstore and picked up a copy of Jason R. Briggs’ “Python for Kids” (No Starch Press). I had been looking for a fun kids’ programming book for a while, and decided on this one. My then 8-year-old basically worked through the book alone, slowly following the recipe toward building the game which is the book’s climax. I had wanted to do a detailed review of this book, but it recently had an unfortunate encounter with a whole raw chicken, so that may not happen. Suffice it to say I think it’s a great book for a child to work through.

(On a side note, there is a free program called ‘laby’, where kids program a robot ant to escape a labyrinth in several languages, which was great fun for the kids. Heck – it’s fun for adults.)

More recently, I was sent a review copy of “The Car Hacker’s Handbook” (also No Starch Press) by Craig Smith. I was excited about this one, as the subject matter is both fascinating and disturbing. New cars have some great features, but the fact that I can (for instance) click a link on a Google map to send directions to a car is a bit disconcerting. So, as Chris Evans says in the prologue,

“We’re all safer when the systems we depend upon are inspectable, auditable, and documented – and this definitely includes cars.”

This book starts by teaching about threat models. It goes into great detail describing various in-car networks, as these are a gateway to inspecting, modifying, and perhaps subverting the vehicle’s systems. It gives details about tools to retrieve diagnostic info, modify the ECU programming, and listen in on TPMS systems. It goes over cracked keyless entry systems. It (I think rightly) defers details about disassembling existing programs to other texts, but shows how to write a weaponized exploit. Overall, a wonderful – and motivating – book.
