LXC is great for starting up several containers on your laptop or on an EC2 host. But what if you want to fire up containers on multiple EC2 instances, and have them talk to each other?
An easy way to support that is using Open vSwitch. This script is a user-data script which you can use to fire up instances ready to connect containers. For instance, I would run
ami=$(ubuntu-cloudimg-query precise); ec2-run-instances -n 2 -f user-data-lxc-ovs.sh -k mykeypair "$ami"
This will fire off two Ubuntu precise instances which will run the script. Once the scripts are done (sudo status cloud-final will show it stopped), you can look at the Open vSwitch bridge with
sudo ovs-vsctl show
You want to connect the bridges on each instance by adding a GRE tunnel. On each host, do
sudo ovs-vsctl add-port br0 gre0 -- set interface gre0 type=gre options:remote_ip=x.x.x.x
where x.x.x.x is the public IP address of the other instance. Now the tunnel is set up. You can simply fire up container p1 on each instance
sudo lxc-start -n p1
Check out the /etc/lxc/lxc-ovs.conf file on each instance, which is the LXC configuration used to create the containers. It has two network sections (each started by lxc.network.type=veth). The first will be veth0, and will be connected to lxcbr0 to connect the container to the internet. The second will be veth1, which is attached to the Open vSwitch bridge that carries the GRE tunnel. So the containers can ssh to each other’s veth1 addresses.
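To make the two network sections concrete, here is a rough sketch of what such a configuration might look like. It is not the actual file written by the user-data script: the output path and the lxc.network.flags lines are my assumptions, and only the veth type, the lxcbr0 link and the /etc/lxc/ovsup hook path come from this page.

```shell
# Illustrative only: writes an example config to a scratch path (an assumption),
# it does not touch the real /etc/lxc/lxc-ovs.conf.
conf=/tmp/lxc-ovs-example.conf
cat > "$conf" << 'EOF'
# first interface: bridged to lxcbr0 for internet access
lxc.network.type=veth
lxc.network.link=lxcbr0
lxc.network.flags=up

# second interface: attached to the Open vSwitch bridge by the up hook
lxc.network.type=veth
lxc.network.flags=up
lxc.network.script.up=/etc/lxc/ovsup
EOF

# count the network sections in the example file
grep -c '^lxc.network.type=veth' "$conf"
```

The second section has no lxc.network.link line in this sketch because the hook script, not LXC itself, is assumed to attach the host-side veth to br0.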
Hi,
There is one point I don’t understand. Are you running Open vSwitch inside the containers as well? Where is br0 created: on the host or in the container?
Cheers,
Dimitris
br0 is on the host. The container just has two NICs, eth0 and eth1, which are the endpoints of two separate veth pairs. The other end of one pair is connected to the OVS bridge, and the other end of the second pair to the lxcbr0 bridge.
Thanks for the walk-through. I’m successfully able to communicate between LXC containers running on two VMs.
Any advice for further reading – I’m trying to figure out how one would add a 3rd VM.
Sorry, the references seem pretty scarce out there. I’m pretty sure that to add a 3rd VM you need to add all the links. In other words, the number of links would scale pretty badly: counting one tunnel port per host per peer, that is n(n-1), so 2 links for 2 VMs, 6 for 3 VMs, 12 for 4…
Please do let me know if you find some good documentation.
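The count above can be checked with a few lines of shell. This is only the arithmetic, assuming every host gets one GRE port for each of the other hosts; the function name is made up for the example.

```shell
# full_mesh_ports N: tunnel ports needed when each of N hosts
# has one GRE port per peer, i.e. N * (N - 1)
full_mesh_ports() {
    n=$1
    echo $(( n * (n - 1) ))
}

for n in 2 3 4; do
    echo "$n VMs: $(full_mesh_ports "$n") tunnel ports"
done
```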
While you can add all the links (and it *might* help the controller determine more efficient routes), I don’t think you have to add links between every pair, as in the math above.
I’m pretty certain that as long as a path between hosts exists, you can traverse it even if there are multiple hops. Example time:
Launch 3 VMs. In each VM do everything except creating the GRE tunnels. Let’s name our VMs A, B, C.
Then:
* create a single GRE tunnel on A with the remote_ip of B
* create a single GRE tunnel on C with the remote_ip of B
* create two GRE tunnels on B with remote_ip of A and C.
Then you should be able to lxc-console into the containers and ping between all pairs. Any packets going from container A to container C will have to travel through B, which on my setup means roughly double ping response time.
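The steps above can be sketched as a dry run that only prints the commands for each VM. The IP addresses are placeholders from the documentation range, and gre_cmd is a helper made up for this example; the ovs-vsctl invocation itself is the one from the post. Note that B needs two differently named ports (gre0 and gre1 here), since port names must be unique on a bridge.

```shell
# Dry run: prints the ovs-vsctl commands, does not execute them.
# Placeholder addresses (assumptions), replace with the real public IPs.
A_IP=203.0.113.10
B_IP=203.0.113.20
C_IP=203.0.113.30

# gre_cmd PORT REMOTE_IP: print the add-port command for one tunnel
gre_cmd() {
    echo "sudo ovs-vsctl add-port br0 $1 -- set interface $1 type=gre options:remote_ip=$2"
}

echo "# on A"
gre_cmd gre0 "$B_IP"
echo "# on C"
gre_cmd gre0 "$B_IP"
echo "# on B"
gre_cmd gre0 "$A_IP"
gre_cmd gre1 "$C_IP"
```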
Thanks, Jesse, good to know!
I have a question. Suppose I start an LXC container with a certain configuration. I then shut it down and want to start it with a different configuration. For example, in this post, we are assigning an IP address based on the AMI index. Suppose I want to change it later. How do I do it? I tried starting the container using lxc-start with the -f option but it didn’t work.
What do you mean by it didn’t work? I’d suggest sending an email with precisely what you did and what happened to the lxc-users mailing list, as what you want certainly should be possible.
Hey, although I know that the script is just for testing purposes, this would be a great addition.
Add the down line here (should be good beyond lxc 0.8.04 or something)
"""
lxc.network.script.up=/etc/lxc/ovsup
lxc.network.script.down=/etc/lxc/ovsdown
"""
and the script
"""
cat > /etc/lxc/ovsdown << EOF
#!/bin/bash
# \$5 is the host-side veth name that LXC passes to the hook
ifconfig \$5 0.0.0.0 down
ovs-vsctl del-port br0 \$5
EOF
"""
Hey, thanks for your comment. Now, by default, since one end of the veth pair is in the container’s namespace, what should happen is that the veth pair disappears when the container goes down. What you suggest could be worth including just as an illustration in case people want to add something else, and it won’t harm anything, but am I overlooking some reason why the particular lxc.network.script.down you have there would be needed?
Have a look at your ovs-vsctl output: the ports you add all pile up on that bridge. It is generally good practice to clean those ports up, especially if you plan to use the bridge for more than one container instance. The bridge is persistent across reboots (both lxc and container). Every time you boot a container, a virtual port with the same name as the “veth” interface gets added, and these don’t go away on their own (ovs-vsctl uses a database to store all of this). Luckily for us, the 6 random digits give us a fairly big operational space, but a clash could still happen if the next boot happens to pick a formerly used port name (the chance is very small, especially if you take down the bridge and recreate it every time). This becomes even more relevant in dynamic environments, where you would want to use GRE ports with new IP addresses without building up a huge number of grexxx ports.
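For ports that have already piled up, a cleanup along these lines might help. This is a sketch under assumptions, not something tested on the setup from the post: stale_veth_ports is a made-up helper, and it treats a port as stale when it is named veth* and has no matching device under /sys/class/net. Running the block only defines the function; the deleting pipeline is left commented out.

```shell
# stale_veth_ports [SYSFS_DIR]: read port names on stdin and print the
# veth* ports whose network device no longer exists under SYSFS_DIR.
# Other ports (for example gre0) are never reported.
stale_veth_ports() {
    sysfs=${1:-/sys/class/net}
    while read -r port; do
        case $port in
            veth*) [ -e "$sysfs/$port" ] || echo "$port" ;;
        esac
    done
}

# Usage on a host (uncomment to actually delete the stale ports):
# sudo ovs-vsctl list-ports br0 | stale_veth_ports |
#     while read -r p; do
#         sudo ovs-vsctl --if-exists del-port br0 "$p" < /dev/null
#     done
```

It is probably wise to run only the first two stages of the pipeline and review the list before deleting anything.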
Thanks – if it’s true that the virtual port doesn’t go away, then yeah, it definitely should be cleaned up. I’ll have to test (but can’t right now) – thanks!
Hi,
I am working in a cluster and I used OVS (1.10.0) to interconnect 10 physical nodes. On each node I did the following steps:
I created an OVS bridge.
I configured 9 GRE tunnels, one for each of the other nodes.
I created my 10 containers and attached them to the OVS bridge.
The problem is that I cannot ping the VMs, and the hosts even become slow and barely respond to ping. However, when I test this with just 2 physical nodes it works well.
I think the problem is caused by the number of GRE interfaces.
Please, is there a solution?