Container Domains (Types)

One of the things people have always had a hard time understanding about SELinux is its use of different types.  In this blog, I am going to discuss Container Domains.

Recently someone asked me about specifying types to run containers inside of Kubernetes.  Basically he wanted to run a locked down container that could read and write content inside of /var/log.  He saw that the content in /var/log was labeled var_log_t, and assumed that if he ran the container as var_log_t, it would be able to manage content with that label.

This is not a crazy assumption.  After all, in DAC, if a file is owned by the user dwalsh, processes owned by dwalsh are usually able to read and write it (if the permission flags allow it).  But SELinux type enforcement is different.  CRI-O failed to execute the container process for Kubernetes, and an AVC was generated that looked like:

type=AVC msg=audit(1558135492.958:247182): avc:  denied  { transition } for  pid=22423 comm="runc:[2:INIT]" path="/usr/bin/pod" dev="sda1" ino=570425443 scontext=system_u:system_r:container_runtime_t:s0 tcontext=system_u:object_r:var_log_t:s0 tclass=process permissive=0

SELinux differentiates process types, which it calls domains, from file types, which it calls file_types.  SELinux also controls how a process changes type, via something called transitions.  A process cannot just execute a new process in any process type; policy allows only certain process types to transition to other process types.  The AVC above shows that runc, which was executed via CRI-O running as container_runtime_t, is attempting to launch the container as var_log_t.  SELinux blocks the transition and generates the AVC.
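
If you are new to reading AVC records, the fields that matter in the denial above can be pulled out with a few lines of shell.  This is only a sketch: avc_field is a helper name made up for this example, and the record is pasted in as a string rather than read from the audit log.

```shell
#!/bin/sh
# The AVC record from this post, stored verbatim for parsing.
avc='type=AVC msg=audit(1558135492.958:247182): avc:  denied  { transition } for  pid=22423 comm="runc:[2:INIT]" path="/usr/bin/pod" dev="sda1" ino=570425443 scontext=system_u:system_r:container_runtime_t:s0 tcontext=system_u:object_r:var_log_t:s0 tclass=process permissive=0'

# avc_field NAME: print the value of the NAME=value field in $avc.
# (Hypothetical helper, not part of any SELinux tool.)
avc_field() {
    printf '%s\n' "$avc" | sed -n "s/.* $1=\([^ ]*\).*/\1/p"
}

# A context is user:role:type:level, so the type is the third field.
source_domain=$(avc_field scontext | cut -d: -f3)
target_type=$(avc_field tcontext | cut -d: -f3)
object_class=$(avc_field tclass)

# The denied permission sits between the braces.
permission=$(printf '%s\n' "$avc" | sed -n 's/.*{ \([^}]*\) }.*/\1/p')

echo "$source_domain was denied '$permission' on $target_type ($object_class)"
```

Read this way, the record says that the source domain container_runtime_t asked for the transition permission on the process class, with var_log_t as the target, which is a file type and not a domain.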

Container Domains

SELinux policy for containers defines the types that a container_runtime_t process can transition to and groups them with the policy attribute `container_domain`.

You can see these domains by executing `seinfo -acontainer_domain -x`.

seinfo -acontainer_domain -x
Type Attributes: 1
  attribute container_domain;
container_logreader_t
container_t
container_userns_t
spc_t
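
As a rough pre-flight check, you could compare the type you plan to run with against that list before handing it to the container runtime.  The sketch below hard-codes the seinfo output shown above; is_container_domain is a made-up helper name, and the real decision is always made by the loaded policy, not by a list like this.

```shell
#!/bin/sh
# Domains listed by `seinfo -acontainer_domain -x` in this post.
# On a live system you would capture the seinfo output instead.
container_domains='container_logreader_t
container_t
container_userns_t
spc_t'

# is_container_domain TYPE: succeed if TYPE matches a listed domain exactly.
# (Hypothetical helper for illustration only.)
is_container_domain() {
    printf '%s\n' "$container_domains" | grep -qxF -- "$1"
}

for t in var_log_t container_t; do
    if is_container_domain "$t"; then
        echo "$t: listed as a container domain"
    else
        echo "$t: not a container domain, expect a denied transition"
    fi
done
```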

Currently we only define these 4 types for containers to run with.  The default is container_t, which almost everyone in the world runs with.  The only type that container_t can write is container_file_t.  Since the user needed to read/write content in /var/log, someone suggested relabeling the content in /var/log as container_file_t.  This would be a very bad idea, because other process types that need to write to files under /var/log, like syslogd_t, would not be allowed to write to container_file_t.  We have written another type, container_logreader_t, but it would not work either, since it is only allowed to read content under /var/log, not write it.

Sadly, with the way container policy is currently written, the only option that allows the container to write to /var/log is running as spc_t.  spc_t is for "Super Privileged Containers", basically unconfined containers from an SELinux point of view, which can read/write any file type.

That is what I told the user to run with.

UDICA

But there is help coming.  Lukas Vrabec has written a new tool called udica.

Udica: A tool for generating SELinux security profiles for containers.

Lukas came up with a great idea: examine the volumes mounted into a container and generate policy from them.  You can create a container using a tool like Podman, with a command like

podman create -ti --name logwriter -v /var/log:/var/log fedora

Now you can use the generated container as input to udica, to generate an SELinux policy type that could read/write content under /var/log.

# podman inspect logwriter > container.json
# udica -j container.json  container_logwriter

This generates a new policy file, container_logwriter.cil.

Now you can load the policy into the kernel.

# semodule -i container_logwriter.cil /usr/share/udica/templates/{base_container.cil,net_container.cil,home_container.cil} 

Udica generated a new policy type, container_logwriter.process, which is defined as a container_domain, but also gets all of the allow rules to read/write all file types stored under the /var/log directory.  Now you can run the container with the policy type container_logwriter.process and it will continue to be confined in the same way as container_t, except that it will be allowed to read/write all labels under /var/log.

podman run -ti --security-opt label=type:container_logwriter.process -v /var/log:/var/log fedora

With this tool, you can distribute the policy files to all nodes that you want to run these containers on, and tell Kubernetes to launch containers with this type.
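
On the Kubernetes side, the type is requested through the seLinuxOptions field of a securityContext.  The following is a minimal sketch, not a tested deployment: the pod name, image reference, sleep command, and file name are all assumptions chosen for illustration, and only the type container_logwriter.process and the /var/log mount come from the example above.

```shell
#!/bin/sh
# Write a hypothetical pod manifest that requests the udica-generated type.
# Names, image, and command are illustrative assumptions.
cat > logwriter-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: logwriter
spec:
  containers:
  - name: logwriter
    image: registry.fedoraproject.org/fedora
    command: ["sleep", "infinity"]
    securityContext:
      seLinuxOptions:
        type: container_logwriter.process
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    hostPath:
      path: /var/log
EOF

echo "wrote logwriter-pod.yaml"
```

You would then submit it with kubectl apply -f logwriter-pod.yaml.  The container can only start on nodes where the container_logwriter policy has already been loaded with semodule; elsewhere the transition would be denied just like the var_log_t attempt.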

You can read a lot more about Udica on GitHub, or in Fedora Magazine.

Conclusion

SELinux is an awesome tool for keeping containers contained; it prevents containers from escaping and causing havoc on the file system.  It has proven itself many times when container breakouts have happened.  But sometimes you need to customize your container to allow it limited access to files on your host.  If those files need to be accessed by confined domains other than containers, then relabeling them is not an option.  You either need to turn off SELinux confinement or start using the new Udica tool to generate new policy types.

