Dan Walsh's Blog

Got SELinux?

New Security Feature in Fedora 18 Part 7: Secure Linux Containers

Secure Linux Containers

In Fedora 18 we have enhanced the libvirt-sandbox package to allow for easy creation of Secure Containers.

Containers are a way of isolating one or more processes from the rest of the system.  Sometimes containers are described as lightweight virtualization.  Containers are really just a userspace concept.  The Linux kernel has no concept of a container.  The kernel implements namespaces and cgroups.  Userspace tools can combine these kernel services into a "container".


Namespaces are a way of making a process's view of its environment differ from that of its parent process.  For example, the file system (mount) namespace allows me to change a process's view of the file system hierarchy.  pam_namespace, introduced way back in Fedora 6/RHEL5, allowed a login program to create a namespace and mount file systems that would not be seen by the ancestor processes.  This means I could have multiple processes with different /tmp directories, and multiple home directories mounted on /home/dwalsh.

The kernel currently implements five namespaces:

  1. mount - mounting and unmounting file systems will not affect the rest of the system.
  2. UTS - setting the hostname or domain name will not affect the rest of the system.
  3. IPC - the process has an independent namespace for System V message queues, semaphore sets and shared memory segments.
  4. network - the process has independent IPv4 and IPv6 stacks, IP routing tables, firewall rules, the /proc/net and /sys/class/net directory trees, sockets, etc.
  5. pid - processes have PIDs that are independent of the rest of the system.  Each namespace can have its own PID 1.

Note: A UID namespace is being developed, but it is not ready to be used yet, and I have some concerns about how well it will work.  Our tools do not currently use the UID namespace.

pam_namespace, sandbox -X, unshare and systemd all allow you to take advantage of namespaces.
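
As a rough illustration (not part of the tools described here), you can see which namespaces a process lives in by reading the links under /proc/<pid>/ns.  This sketch assumes a kernel new enough to expose those entries as symbolic links; the helper names ns_ids and same_ns are made up for the example.

```shell
# Print "<namespace> <identifier>" for each namespace of a process.
# $1: PID to inspect (defaults to the current shell).
ns_ids() {
    for ns_link in /proc/"${1:-$$}"/ns/*; do
        printf '%s %s\n' "${ns_link##*/}" "$(readlink "$ns_link")"
    done
}

# Succeed when two PIDs share the given namespace (for example mnt, uts, net).
# Returns 2 if either link cannot be read, e.g. for lack of permission.
same_ns() {
    ns_a=$(readlink "/proc/$1/ns/$3") || return 2
    ns_b=$(readlink "/proc/$2/ns/$3") || return 2
    [ "$ns_a" = "$ns_b" ]
}

ns_ids
```

Two processes that print the same identifier for a namespace share it.  For example, `same_ns $$ $PPID uts` should succeed for an ordinary shell, while a process started under `unshare` with a new UTS namespace would report a different identifier than its parent.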


Wikipedia describes cgroups as:

  cgroups (control groups) is a Linux kernel feature to limit, account and isolate resource usage (CPU, memory, disk I/O, etc.) of process groups.

Basically you can use cgroups to control the amount of resources a process or groups of processes can get on a system. 
I put together a little screen-cast of cgroups to demonstrate their power.
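
To get a feel for how processes are grouped, you can look at which cgroups a process belongs to.  The sketch below assumes the usual /proc/<pid>/cgroup layout of one "hierarchy-ID:controller-list:path" entry per line; the function name cgroup_membership and the "unified" placeholder for an empty controller list are my own choices for the example.

```shell
# Print "<controllers><TAB><cgroup path>" for each line of a cgroup
# membership file.
# $1: file to read (defaults to the current shell's /proc entry).
cgroup_membership() {
    cg_file=${1:-/proc/$$/cgroup}
    while IFS=: read -r cg_id cg_controllers cg_path; do
        # An empty controller list is shown as "unified" (placeholder name).
        printf '%s\t%s\n' "${cg_controllers:-unified}" "$cg_path"
    done < "$cg_file"
}

# Show the membership of the current shell, if the file is available.
[ -r "/proc/$$/cgroup" ] && cgroup_membership
```

Every process started from that shell lands in the same groups unless something moves it, which is what lets a limit placed on a group apply to a whole service.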

Tools like LXC have existed for a while to allow users to create containers, but that tool set works at a very low level.


"Libvirt is a C toolkit to interact with the virtualization capabilities of recent versions of Linux (and other OSes). The main package includes the libvirtd server exporting the virtualization support."

libvirt-lxc was introduced in Fedora 16.  It enhanced the libvirt API to allow users to build containers using libvirt.  This allows you to manage your kvm/qemu virtualization along with your Linux containers, all within the same framework.  The only problem is that setting up a Linux container using the libvirt API is fairly difficult.


Dan Berrange created a new package called libvirt-sandbox in Fedora 17.  The libvirt-sandbox package provides an application development library (libvirt-sandbox) to facilitate the embedding of virtualization into applications.  One of the main advantages of this new tool set is that it greatly simplifies the API for creating virtual machines and containers.


Using containers by themselves does not give you good security separation.  The reason is that kernel file systems like /proc, /sys, cgroupfs and selinuxfs are not containerized.  A privileged process running within a container can affect processes running outside of the container or processes running in other containers.  In libvirt-sandbox and libvirt-lxc you can use SELinux labelling to further lock down privileged processes, for example preventing them from mounting random file systems or from disabling SELinux.


Dan Berrange and I have been working to enhance libvirt-sandbox.  We have added a command line tool called virt-sandbox-service, which allows a user to easily create an application sandbox.  virt-sandbox-service allows an administrator to run multiple services on the same machine, each service in its own secure Linux container.  Some major features of virt-sandbox-service containers:

  • Uses systemd within the container as the init process.
  • Uses standard unit files for starting and stopping containerized applications.
  • Shares the /usr partition, meaning if you are running hundreds of Apache containers, and update Apache code, each container will instantly use the new version of Apache.
  • Uses SELinux MCS Labelling to separate each container, preventing even root processes from interfering with the host or other containers.
The goal of this tool is not to allow general purpose applications to run within the container, although we will work to get most services to be able to run.  The tool is not aimed at running a full OS chroot, but rather at running particular applications.

I have done preliminary tests running httpd, mysql, postgresql and dovecot within these containers.  I am hoping people begin to play with the tool and help us expand which applications can run within the container.  You can also run multiple applications within a container at the same time.  For example, I have tested httpd and mysql running within the same container.

How to use:

# yum install libvirt-sandbox httpd
There is a bug in the tool right now where it will not work without an /selinux file.
# touch /selinux

Use the virt-sandbox-service command to create a container.

virt-sandbox-service create -C -l s0:c1,c2 -u httpd.service container1
Created sandbox container dir /var/lib/libvirt/filesystems/container1
Created sandbox config /etc/libvirt-sandbox/services/container1.sandbox
Created unit file /etc/systemd/system/container1_sandbox.service
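
If you want several containers, each one needs its own MCS label.  Here is a small dry-run sketch that only prints the create commands, one disjoint category pair per container, so you can review them before running anything.  The flags are copied unchanged from the command above; the function name print_create_cmds and the containerN naming scheme are just assumptions for the example.

```shell
# Print one create command per container; nothing is executed.
# $1: number of containers, $2: unit to run (defaults to httpd.service).
print_create_cmds() {
    count=$1
    unit=${2:-httpd.service}
    i=1
    while [ "$i" -le "$count" ]; do
        # Container i gets the disjoint category pair c(2i-1),c(2i).
        printf 'virt-sandbox-service create -C -l s0:c%d,c%d -u %s container%d\n' \
            "$((2 * i - 1))" "$((2 * i))" "$unit" "$i"
        i=$((i + 1))
    done
}

print_create_cmds 3
```

Once the printed commands look right, they could be piped to a root shell.  Because no two containers share a category, a process in one container should not be able to touch content labeled for another.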

Manipulate the data within the container while running outside of the container.

cd /var/lib/libvirt/filesystems/container1/var/www/html
touch content
ls -lZ content
# Make sure the content gets created with the correct MCS label.
# Content should be labeled with s0:c1,c2 : Not s0
Now create a file with a bad label for the container.
echo "Secret" > badcontent
chcon -l s0:c3,c4 badcontent
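
To see why a file labeled s0:c3,c4 should be out of reach for a container running at s0:c1,c2, here is a simplified model of the MCS check: access requires the process to hold every category on the file.  This is only a sketch for intuition.  It ignores type enforcement, category ranges such as c0.c1023, and any stricter rules the real policy applies; the function names are invented for the example.

```shell
# Print the categories of an MCS level, one per line ("s0:c1,c2" -> c1, c2).
# A bare sensitivity such as "s0" has no categories and prints nothing.
mcs_categories() {
    case $1 in
        *:*) printf '%s\n' "${1#*:}" | tr ',' '\n' ;;
    esac
}

# Succeed when every category of the file level ($2) is also held by the
# process level ($1).  Ranges such as c0.c1023 are not handled.
mcs_dominates() {
    for file_cat in $(mcs_categories "$2"); do
        mcs_categories "$1" | grep -qxF "$file_cat" || return 1
    done
    return 0
}

for level in s0:c1,c2 s0:c3,c4; do
    if mcs_dominates s0:c1,c2 "$level"; then
        echo "container at s0:c1,c2 -> file at $level: categories match"
    else
        echo "container at s0:c1,c2 -> file at $level: blocked by MCS"
    fi
done
```

In this model the first file passes and badcontent is blocked, which is the behaviour the test inside the container is meant to confirm.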

Start the container:

virt-sandbox-service start container1

In another window

Make sure the processes are running with the proper SELinux label.

ps -eZ | grep svirt_lxc

You should see processes like systemd, systemd-journal, dhclient and httpd running within the container with the MCS label of s0:c1,c2.
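
If you would rather not eyeball the list, a small filter can flag any svirt_lxc process whose MCS level differs from the one you expect.  This sketch assumes the usual `ps -eZ` layout, with the full SELinux context (user:role:type:level) in the first column; the function name check_labels is made up for the example.

```shell
# Read `ps -eZ` output on stdin and report svirt_lxc processes whose MCS
# level differs from the expected one.  Exit status is non-zero on a mismatch.
# $1: expected level, e.g. s0:c1,c2
check_labels() {
    awk -v want="$1" '
        $1 ~ /svirt_lxc/ {
            # Context is user:role:type:level, and the level itself may
            # contain a colon, so rejoin everything after the third field.
            n = split($1, part, ":")
            level = part[4]
            for (i = 5; i <= n; i++) level = level ":" part[i]
            if (level != want) { print "mismatch: " $0; bad = 1 }
        }
        END { exit bad + 0 }'
}
```

On the host this would be used as `ps -eZ | check_labels s0:c1,c2`, which prints nothing and succeeds when every container process carries the expected label.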

Connect to the container

virt-sandbox-service connect container1
getenforce   # Should tell you SELinux is disabled.
setenforce 1 # Should be denied
touch /file  # Should deny you creating this file
touch /var/www/html/content  # Should be allowed
cat /var/www/html/badcontent # Should be denied
Configure the Apache server any way you would like, and edit the HTML pages.
ifconfig eth0  # Grab the IP address for use in the next test
# Use the shell running within the container to attempt to break out of the container.

On your host, point Firefox at the container's IP address

firefox $IP # Using IP address from container, make sure you see the content.

Shut down the container
virt-sandbox-service stop container1

Now let's try to do the same, but starting and stopping the container using systemctl commands
systemctl start container1_sandbox.service
systemctl enable container1_sandbox.service # Check on reboot if the container is running

Make sure the container is running.

virt-sandbox-service connect container1
ps -eZ

I would like to hear what you think.  What enhancements would you like to see?  What
applications would you like to see run within the containers?

Since this is a first version, we think there could be some growing pains, so use at your own 
risk, but we would love to work with the community to improve this tool set.
