Secure Virtualization Using SELinux (sVirt)

Next week I will be at the Red Hat Summit talking about SELinux, specifically sVirt,  Secure Virtualization.

Virtualization seems to be the next big thing, providing great opportunities in resource allocation, system management, savings on power and cooling, and the ability to grow and shrink resources depending on demand.

But what about the security? 

What happens when a cracker breaks into a virtual machine and takes it over?  What happens if there is a bug in the hypervisor? 

Before virtualization, we had isolated servers.  A cracker taking over one server meant that he controlled just that server. The cracker would then have to launch network attacks against other servers in the environment.  System administrators had lots of tools to defend against network attacks on machines: firewalls, network traffic analysis tools, intrusion detection tools, etc. 

After virtualization, we have multiple services running on the same host.  If a virtual machine is broken into, the cracker just needs to break through the hypervisor.  If a hypervisor vulnerability exists, the cracker can take over all of the virtual machines on the host.  He can even write into any virtual machine images that are accessible from the host machine.

This is very scary stuff. The question is not "if", but "when".  Hacker/cracker conventions are already examining hypervisor vulnerabilities.  Crackers have already broken through the Xen hypervisor, as I documented in one of my previous blogs.

Now let's examine libvirtd/qemu/kvm in Fedora 11.

libvirtd starts all virtual machines.  All virtual machines run as separate processes.  Virtual images are stored as files or as devices such as logical volumes and iSCSI targets.


What is SELinux really good at?

It is great at labeling processes, files, and devices.  It is great at defining rules on how labeled processes interact with labeled processes, files, and devices.

Seems like a nice match.  SELinux can be used to mitigate the problems of a vulnerability in the hypervisor. 

But, you ask, "Didn't we do this in Red Hat Enterprise Linux 5?"  Yes, but we were still vulnerable to the Xen breakout.

If you read the Xen vulnerability document, it explains the mechanism used to thwart SELinux protection in RHEL5.  The cracker realized that the Xen process, labeled xend_t, was allowed to read/write all fixed disks labeled fixed_disk_device_t.  This allowed the cracker to break out of the SELinux confinement by writing to the physical disk.  When I was writing policy for Xen in RHEL5, I had initially required the administrator to label Xen image devices as xen_image_t.  The Xen developers thought this was too difficult for administrators to manage, and would cause too many failures.  We ran out of time to make the management tool do this automatically.  It was decided that usability was more important than security in this instance, and I had to allow this access.  I won't make that mistake again.

In Fedora 11, James Morris, Daniel Berrange, myself and others  have added SELinux support to libvirt, in the form of sVirt.  We added a security plug-in architecture to libvirt that defaults to SELinux protection.  Theoretically you can use other security architectures.  libvirt dynamically labels the image files and starts the virtual machines with the correct labels.  This allows us to avoid the problem of the administrator having to remember to set the correct label on the image files and devices.  By default all virtual machines in F11 get labeled with the svirt_t type and all image files get the svirt_image_t type.

SELinux policy has rules that allow the svirt_t processes to read/write svirt_image_t files and devices.

This protection allows us to protect the host machine from any of its virtual machines.  A virtual machine will only be able to interact with the files and devices with the correct labels.  A compromised virtual machine would not be allowed to read my home directory, for example, even if the virtual machine is running as root.

However, this "type" protection does not prevent one virtual machine from attacking another virtual machine.  We needed a way to label the domains and the image files with the same types but, at the same time, stop virtual machine 1, running as svirt_t, from attacking virtual machine 2, which would also be running as svirt_t.

Multi Category Security (MCS) to the rescue!

When we developed RHEL5 we added Multi Level Security (MLS) support.   This involved adding a fourth field to the SELinux context.

Originally in RHEL4 the SELinux context consisted of three fields ("USER:ROLE:TYPE").  In RHEL5 the SELinux context consists of four fields ("USER:ROLE:TYPE:MLS").  For example, files in the home directory could be labeled "system_u:object_r:user_home_t:TopSecretRecipe".  The MLS labels define a sensitivity level (s0-s15) and a category of the data (c0.c1023).  TopSecretRecipe in the example above is a human-readable translation of a field like s15:c0.c36.  The MLS label allows MLS machines to label a file not only by its use, user_home_t in this example, but also by the sensitivity and nature of its content, "TopSecretRecipe".

This field was only used in MLS policy.  We attempted to make use of it in our default policy ("targeted"), by only defining a single sensitivity level ("s0") and allowing administrators to define categories.  We called this Multi Category Security (MCS).  The goal was to allow administrators and users to label their files based on the nature of their contents.  For example, system_u:object_r:database_t:PatientRecord could be a database which contained patient records.  For multiple reasons, MCS has not been  widely used.  I believe you are still better off defining a new SELinux type patient_record_database_t -- MCS does not afford the richness of access control that you can express with standard SELinux types. 
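To make the four-field layout concrete, here is a minimal Python sketch that splits a context string into its fields. The `SELinuxContext` tuple and the `parse_context` helper are names made up for this illustration; they are not part of libselinux or libvirt.

```python
from typing import NamedTuple


class SELinuxContext(NamedTuple):
    """Hypothetical container for the fields of an SELinux context."""
    user: str
    role: str
    type: str
    level: str  # the fourth (MLS/MCS) field; empty for a three-field context


def parse_context(context: str) -> SELinuxContext:
    """Split "USER:ROLE:TYPE[:MLS]" into its fields.

    The MLS/MCS field may itself contain a colon (for example "s0:c0,c10"),
    so only the first three colons are treated as field separators.
    """
    parts = context.split(":", 3)
    if len(parts) < 3:
        raise ValueError(f"not an SELinux context: {context!r}")
    user, role, type_ = parts[:3]
    level = parts[3] if len(parts) == 4 else ""
    return SELinuxContext(user, role, type_, level)


ctx = parse_context("system_u:object_r:svirt_image_t:s0:c0,c10")
print(ctx.type, ctx.level)
```

Under MCS the sensitivity part of the last field is always s0, so only the category list after it varies from one label to the next.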

When we were developing sVirt, though, we realized that we could use MCS to provide us separation between two virtual machines running with the same SELinux type, svirt_t.  We designed libvirt to assign a different randomly-selected MCS label to each virtual machine and its associated virtual image.  libvirt guarantees that the MCS fields it selects are unique.  SELinux prevents different virtual machines  running with different MCS fields from interacting with each other or any of their content.
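The idea of handing out random but unique MCS fields can be sketched in a few lines of Python. This is an illustration only, not libvirt's actual code: the `McsAllocator` class is hypothetical, and it assumes that each label is a pair of two distinct categories out of c0..c1023 on sensitivity s0. That assumption gives 1024 * 1023 / 2 = 523,776 possible labels.

```python
import random

NUM_CATEGORIES = 1024  # categories c0 .. c1023


class McsAllocator:
    """Hypothetical allocator of unique two-category MCS labels."""

    def __init__(self, rng=None):
        self._rng = rng or random.Random()
        self._in_use = set()  # (low, high) category pairs currently handed out

    def allocate(self) -> str:
        """Return a label such as "s0:c101,c230" that is not in use yet."""
        total = NUM_CATEGORIES * (NUM_CATEGORIES - 1) // 2
        if len(self._in_use) >= total:
            raise RuntimeError("no free MCS labels left")
        while True:
            # Pick two distinct categories and store them in sorted order,
            # so that (c10, c0) and (c0, c10) count as the same label.
            low, high = sorted(self._rng.sample(range(NUM_CATEGORIES), 2))
            if (low, high) not in self._in_use:
                self._in_use.add((low, high))
                return f"s0:c{low},c{high}"

    def release(self, label: str) -> None:
        """Make a label available again, for example when a VM exits."""
        categories = label.split(":", 1)[1].split(",")
        self._in_use.discard(tuple(int(c[1:]) for c in categories))


allocator = McsAllocator()
vm1_level = allocator.allocate()
vm2_level = allocator.allocate()
print(vm1_level, vm2_level)  # two different labels, chosen at random
```

The same level string would be applied to both the virtual machine process and its image, which is what ties the two together.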

For example, libvirt creates two virtual machines with these labels:

Name | Virtual Machine Process Label | Virtual Machine Image Label
Virtual Machine 1 | system_u:system_r:svirt_t:s0:c0,c10 | system_u:object_r:svirt_image_t:s0:c0,c10
Virtual Machine 2 | system_u:system_r:svirt_t:s0:c101,c230 | system_u:object_r:svirt_image_t:s0:c101,c230

SELinux prevents virtual machine 1 (system_u:system_r:svirt_t:s0:c0,c10) from accessing virtual machine 2's image file (system_u:object_r:svirt_image_t:s0:c101,c230) -- the virtual machines can not attack each other.
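The check behind this can be approximated in Python as a set comparison. This is a simplified model, not the real policy constraint: it assumes that access is allowed only when the process's categories include every category on the file, and it ignores the type rules and the sensitivity part of the level. The helper names are made up for the example.

```python
def expand_categories(level: str) -> frozenset:
    """Turn a level such as "s0:c0,c10" or "s0:c0.c1023" into category numbers."""
    _, _, categories = level.partition(":")
    result = set()
    for item in filter(None, categories.split(",")):
        # "c0.c1023" is a range, "c10" is a single category.
        first, _, last = item.partition(".")
        low = int(first[1:])
        high = int(last[1:]) if last else low
        result.update(range(low, high + 1))
    return frozenset(result)


def mcs_allows(process_level: str, file_level: str) -> bool:
    """Simplified MCS check: the process must hold all of the file's categories."""
    return expand_categories(process_level) >= expand_categories(file_level)


vm1_process = "s0:c0,c10"
vm1_image = "s0:c0,c10"
vm2_image = "s0:c101,c230"

print(mcs_allows(vm1_process, vm1_image))  # True: same categories
print(mcs_allows(vm1_process, vm2_image))  # False: VM 1 lacks c101 and c230
print(mcs_allows(vm1_process, "s0"))       # True: no categories to satisfy
```

The last line shows why content labeled with plain s0 can be shared: a file with no categories is within reach of every virtual machine, whatever MCS field it was given.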

These are the labels libvirt assigns.

Name | SELinux Context | Description
Virtual Machine Processes | system_u:system_r:svirt_t:MCS1 | MCS1 is a randomly selected MCS field.  Currently we support ~500,000 labels.
Virtual Machine Image | system_u:object_r:svirt_image_t:MCS1 | Only svirt_t processes with the same MCS fields are able to read/write these image files and devices.
Virtual Machine Shared Read/Write Content | system_u:object_r:svirt_image_t:s0 | All svirt_t processes are allowed to write to the svirt_image_t:s0 files and devices.
Virtual Machine Shared Read Only Content | system_u:object_r:svirt_content_t:s0 | All svirt_t processes are able to read files/devices with this label.
Virtual Machine Images | system_u:object_r:virt_content_t:s0 | When a virtual machine exits, its image file is relabeled to the system default, which usually is virt_content_t:s0.  No svirt_t virtual processes are allowed to read files/devices with this label.

We also added the ability to do static labeling to sVirt.  Static labels allow the administrator to select a particular label, including the MCS/MLS field, for a virtual machine.  The virtual machine will always be started with that label.  Administrators who run statically-labelled virtual machines are responsible for setting the correct label on the image files.  libvirt will never modify the label of a statically-labelled virtual machine's content.  This allows the sVirt component to run in an MLS environment.  You can run multiple virtual machines on a libvirt system at different sensitivity levels.
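In libvirt's domain XML a static label is expressed with a seclabel element. The Python sketch below only builds such a fragment so you can see its shape; the `static_seclabel` helper and the label value are hypothetical, and the exact attributes accepted depend on your libvirt version.

```python
import xml.etree.ElementTree as ET


def static_seclabel(label: str) -> str:
    """Build a <seclabel> fragment requesting a fixed SELinux label."""
    seclabel = ET.Element("seclabel", {"type": "static", "model": "selinux"})
    ET.SubElement(seclabel, "label").text = label
    return ET.tostring(seclabel, encoding="unicode")


# Hypothetical label: the administrator picks the MCS/MLS field by hand.
print(static_seclabel("system_u:system_r:svirt_t:s0:c392,c662"))
```

With a static label the administrator still has to label the image files to match, since libvirt leaves them alone.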

The apache module recently put into Fedora does something similar with categories for vhosts. It can be similarly used to confine vhosts into their document root.

Question for you:

Is there a way to tell from inside a virtual machine that the host has sVirt enabled?

For example, if I'm paying for utility computing services and I want the protection of sVirt when shopping for a hosting solution, can I verify that sVirt protection is in place, or will I need to take my provider's word for it?


Being able to interrogate the host is what we are trying to stop. You would have to find a vulnerability to know for sure. :^(

From a utility computing (or is it "infrastructure as a service" now?) customer's point of view, it's a catch-22. It's protection that I can only be sure is enabled by proactively attempting to breach it. Somehow I doubt providers will look kindly on that sort of quality-of-service verification activity from their customers. I guess the fallback here is that customers can start demanding some sort of breach-of-contract penalty if hosting companies have misconfigured their sVirt-related policy. Then providers would have a monetary incentive to make sure sVirt protections are working.


Not enough customers are in the know to realize they should be demanding such features. I think enough people have to start offering it first, or it needs to be legislated (which I'm not a fan of), so that consumers are used to expecting it.

The understanding of the risks here is still evolving. I don't expect this sort of feature to be a bullet point for hosting providers soon, not until RHEL6/CentOS 6 come with this out of the box and providers gain some experience with setting this up on their production iron and feel confident enough to advertise it as a differentiator. I expect Red Hat to talk this issue up as part of the marketing for RHEL6. In the meantime, providers are going to need to dabble with the capabilities of the sVirt-enabled tools in Fedora 12.

My point is that for those customers who are aware, and do go shopping for this sort of protection, it's difficult to hold their vendors' feet to the fire in terms of quality assurance. There may be a business opportunity here for Red Hat to certify hosting vendors as sVirt managed, since I can't easily verify it myself as a customer.


The interface of pola-run sounds nicer.

For example, replacing:
qemu -cdrom cd.iso -hda myhdd.img
with:
pola-run -B --prog=qemu -a=-cdrom -fa=cd.iso -a=-hda -faw=hdd.img
means that qemu can only read from cd.iso, can read and write hdd.img, and can read /usr etc. I find the interface of pola-run very convenient when I want to safely run a command on some untrusted input. It is a pity that it hasn't been ported to recent versions of glibc.

Why can't standard Unix accounts provide the isolation of virtual machines from each other that you're going for? SELinux only adds restrictions to that, so it would seem sufficient to give each virtual machine its own account and make the disk images mode 600, owned by that uid. I guess SELinux could help out by preventing the virtual machine from calling chmod() on those files but I don't see what else it buys you here.

DAC protections would be helpful also

I am not sure that qemu can currently run without certain capabilities.

According to the SELinux policy, virtual domains require these capabilities:

allow virt_domain self:capability { kill dac_read_search dac_override };

I guess qemu could be reworked to not need these capabilities, and then you could run the virtual machines as a non-privileged user. You could give libvirt a group of UIDs to run each virtual instance under. I think an sVirt security plugin could be written to do this.

However, this would not give you the fine-grained control that SELinux can give you.
For example, if you had a database on your system that was world readable, the virtual instances would be able to read it, and the DAC permissions would not stop them. The same goes for world-writable files and, as you said, chmod. Executing setuid applications would be allowed as well, and so on.

I guess you could bring this up on the virt mailing list to discuss a potential security module solution.

Loved your blog, but I am not able to figure out the difference between MCS and MLS. I mean, is libvirt using MLS labels or MCS labels, and what difference does it make whether the VM is labeled with MCS or MLS? Thanks

Well if you are running without mls policy you will be using MCS labeling.

If you have not configured anything in libvirt for static labeling, then you are using MCS. If you want to use MLS, you need to install selinux-policy-mls, then modify /etc/selinux/config, relabel, and reboot, and you are almost there. But it will probably not be a pleasant experience.

Thank You for your reply

Sorry, I saw this comment after a long time, but I appreciate your reply. The difference is a lot clearer to me now.

Just wondering: currently the whole Fedora 20 file system has SELinux enabled with user_home_t:s0:c0,c1023, and when I boot a qemu virtual machine it has svirt_t:s0:c66,c350 and svirt_image_t:s0:c66,c350. If I then boot a normal qemu process without libvirt and by mistake use the same hda, which has svirt_image_t:s0:c66,c350, it is able to access that image file fully: I am able to read and write the hda, while the previous libvirt qemu process (s0:c66,c350) is able to read it but unable to write to it. Isn't the libvirt (sVirt) process supposed to protect the confined image file from being accessed by a normal user process? I hope the question is clear. This kind of makes sVirt useless. Even if a new libvirt qemu process is made to access the same file, it changes the category of that file to the one it generates for itself, which leaves the old qemu process without read or write access. But the thing is, the new qemu process should not be able to access the confined (secured) image file at all.

Edited at 2014-01-28 08:40 pm (UTC)

Re: Thank You for your reply

No, the normal user is running as unconfined_t and is allowed to do what he wants. So if you run qemu directly rather than out of libvirt, it will run as unconfined_t:s0-s0:c0.c1023, and it is able to read/write the image file.

If you run it out of libvirt, libvirt will relabel the process and the image files.

Hello, if a virtual machine breaks into the host and gets root permission, can it change the SELinux policy so as to raise its own permissions?
If it can, what can we do to prevent this? Thank you!

SELinux would prevent the modification of policy.

Thanks, teacher. My understanding is that even if the malicious VM breaks into the host, it would still carry its original label, and the SELinux policy (including the targeted policy) would prevent that label from accessing host resources. Is that right?

Yes, the process would have very little access to the host system.
