Container Labeling
danwalsh

An issue was recently raised on libpod, the GitHub repo for Podman.

"container_t isn't allowed to access container_var_lib_t"

Container policy is defined in the container-selinux package. By default, containers run with the SELinux type "container_t", regardless of which container engine launches them: podman, cri-o, docker, buildah, moby. Most people who use SELinux with container runtimes like runc and systemd-nspawn use it as well.

By default container_t is allowed to read/execute content with the labels found under /usr, and to read generically labeled content in the host's /etc directory (etc_t).

The default label for content in /var/lib/docker and /var/lib/containers is container_var_lib_t. This is not accessible by containers (container_t), whether they are running under podman, cri-o, docker, buildah ... We specifically do not want containers to be able to read this content, because content that uses block devices like devicemapper and btrfs (I believe) is labeled container_var_lib_t when the containers are not running.

For overlay content we need to allow containers to read/execute the content, so we use the type container_share_t for it. container_t is allowed to read/execute container_share_t files, but not write/modify them.

Content under /var/lib/containers/overlay* and /var/lib/docker/overlay* is labeled container_share_t by default.
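As a rough illustration of the defaults described above, here is a toy shell lookup that maps a path to the type this post says it gets. The function name default_label is made up for this sketch, and it only knows the paths mentioned here; the real mapping lives in the file-context rules shipped by container-selinux, which are more detailed.

```shell
# Toy lookup of the default types described in this post.
# default_label is a hypothetical helper, not part of any SELinux tool.
default_label() {
    case $1 in
        # Overlay content must be readable/executable by containers.
        /var/lib/containers/overlay*|/var/lib/docker/overlay*)
            echo container_share_t ;;
        # Everything else under the storage directories is off limits.
        /var/lib/containers|/var/lib/containers/*|/var/lib/docker|/var/lib/docker/*)
            echo container_var_lib_t ;;
        *)
            echo unknown ;;
    esac
}

default_label /var/lib/docker/overlay2/abc
default_label /var/lib/containers/volumes
```

On a real system, ls -Zd on the path shows the label that is actually on disk.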


SELinux blocks podman container from talking to libvirt
danwalsh

I received this bug report this week.

"I see this when I try to use vagrant from a container using podman on Fedora 29 Beta.

Podman version: 0.8.4

Command to run container:

sudo podman run -it --rm -v /run/libvirt:/run/libvirt:Z -v $(pwd):/root:Z localhost/vagrant vagrant up

Logs:

...

Sep 30 21:17:25 Home audit[22760]: AVC avc:  denied  { connectto } for  pid=22760 comm="batch_action.r*" path="/run/libvirt/libvirt-sock" scontext=system_u:system_r:container_t:s0:c57,c527 tcontext=system_u:system_r:virtd_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=0

"

This is an interesting use case of SELinux and containers. SELinux is protecting the file system and the host from attack from inside of the container. People who have listened to me over the years understand that SELinux protects based on the labels of files; in the case of containers, it only allows a container_t process to read/write/execute files labeled container_file_t.
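To see which types are involved in a denial like the one above, it helps to pull the fields out of the AVC line. This is a minimal sketch, assuming the whole record sits on one line; avc_field and avc_type are helper names invented here, not part of any SELinux tool.

```shell
# Print the value of one key=value field from an AVC line read on stdin.
# avc_field and avc_type are hypothetical helpers for illustration.
avc_field() {
    sed -n "s/.* $1=\([^ ]*\).*/\1/p"
}

# The SELinux type is the third colon-separated part of a context.
avc_type() {
    avc_field "$1" | cut -d: -f3
}

line='AVC avc:  denied  { connectto } for  pid=22760 comm="batch_action.r*" path="/run/libvirt/libvirt-sock" scontext=system_u:system_r:container_t:s0:c57,c527 tcontext=system_u:system_r:virtd_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=0'

printf '%s\n' "$line" | avc_type scontext    # container_t
printf '%s\n' "$line" | avc_type tcontext    # virtd_t
printf '%s\n' "$line" | avc_field tclass     # unix_stream_socket
```

Note that the target here is virtd_t, a process label, and the class is unix_stream_socket, so the thing being denied is not access to a labeled file.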

But the reporter of the bug thinks he did the right thing: he told podman to relabel the volumes he was mounting into the container.

Let's look at his command to launch the container.

sudo podman run -it --rm -v /run/libvirt:/run/libvirt:Z -v $(pwd):/root:Z localhost/vagrant vagrant up


SELinux prevent users from executing programs, for security? Who cares.
danwalsh

I recently received the following email about using SELinux to prevent users from executing programs.
 

I just started to learn SELinux and this is nice utility if you want confine any user who interact with your system.

A lot of information on Net about how to confine programs, but can't find about confining man's :)

I found rbash (https://access.redhat.com/solutions/65822) which help me forbid execution any software inside and outside user home directory except few.

As I understand correctly to do this using SELinux I need a new user domain(customuser) which by default should deny all or I can start with predefined guest_t?

Next then for example I can enable netutils_exec_ping(customuser_t, customuser_r).

I responded that:

SELinux does not worry so much about executing individual programs, although it can do this. SELinux is basically about defining the access of a process type.
Just because a program can execute another program does not mean that this process type is going to be allowed the access that the program requires. For example:

A user running as guest_t can execute su and sudo, but even if the user discovers the correct password to become root, they cannot become root on the system; SELinux would block it. Similarly, guest_t is not allowed to connect out of the system, so being able to execute ssh or ping does not mean that the user would be able to ping another host or ssh to another system.


unlabeled_t type
danwalsh

I often see bug reports or people showing AVC messages about confined domains not able to deal with unlabeled_t files.

type=AVC msg=audit(1530786314.091:639): avc:  denied  { read } for  pid=4698 comm="modprobe" name="modules.alias.bin" dev="dm-0" ino=9115100 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file

I just saw this AVC, which shows the openvswitch domain attempting to read a file, modules.alias.bin, with modprobe.   The usual response to this is to run restorecon on the files and everything should be fine.

But the next question I get is how did this content get the label unlabeled_t, and my response is usually I don't know, you did something.

Well, let's look at how unlabeled_t files get created.

unlabeled_t really just means that the file on disk does not have an SELinux xattr indicating a file label. Here are a few ways these files can get created:

1. The file was created on a file system when the kernel was not running in SELinux mode. If you take a system that was installed without SELinux (God forbid), or someone booted the machine with SELinux disabled, then all files created will not have labels. This is why we force a relabel any time someone changes from SELinux disabled to SELinux enabled at boot time.


Fun with DAC_OVERRIDE and SELinux
danwalsh

Lately the SELinux team has been trying to remove DAC_OVERRIDE from as many SELinux domain types as possible.

man capabilities

...

       CAP_DAC_OVERRIDE

              Bypass file read, write, and execute permission checks.  (DAC is an abbreviation of "discretionary access control".)

This means a process with CAP_DAC_OVERRIDE can read any file on the system and can write any file on the system from a standard permissions point of view. With SELinux it means that they can read all file types that SELinux allows them to read, even if they are running with a process UID that is not allowed to read the file. Similarly, they are allowed to write all SELinux writable types even if they aren't allowed to write based on UID.

Obviously most confined domains never need to have this access, but somehow over the years lots of domains were given it.

I recently received an email asking about syslog generating lots of AVCs. The writer said that he understood SELinux and had set up the types for syslog to write to, and the content was even getting written properly. But the kernel was generating an AVC every time the service started.

Here is the AVC.

Jul 09 15:24:57 audit[9346]: HOSTNAME AVC avc:  denied  { dac_override }  for  pid=9346 comm=72733A6D61696E20513A526567 capability=1   scontext=system_u:system_r:syslogd_t:s0  tcontext=system_u:system_r:syslogd_t:s0 tclass=capability permissive=0
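The comm= value in this AVC is not quoted because the audit system hex-encodes strings that contain spaces or other special characters. Here is a small sketch of decoding it by hand in plain shell; decode_hex_comm is a name invented for this example (ausearch -i does this interpretation for you on real audit logs).

```shell
# Decode a hex-encoded audit field such as comm=... back into text.
# decode_hex_comm is a hypothetical helper written for this sketch.
decode_hex_comm() {
    hex=$1
    # Reject anything that is not an even-length string of hex digits.
    case $hex in
        ''|*[!0-9A-Fa-f]*) return 1 ;;
    esac
    [ $(( ${#hex} % 2 )) -eq 0 ] || return 1
    while [ -n "$hex" ]; do
        rest=${hex#??}          # everything after the first two digits
        byte=${hex%"$rest"}     # the first two digits
        hex=$rest
        # Convert the byte to octal and let printf %b emit the character.
        printf '%b' "\\0$(printf '%03o' "0x$byte")"
    done
    printf '\n'
}

decode_hex_comm 72733A6D61696E20513A526567    # prints: rs:main Q:Reg
```

So the thread that triggered the denial is named "rs:main Q:Reg".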


Cool SELinux hack provided by systemd
danwalsh

Sometimes content is created in /run during boot that ends up mislabeled. We sometimes hear, "Every time I boot, this file gets created with the wrong label."

This can happen if initramfs is creating content before systemd has loaded policy.  This means the content would get created with var_run_t as the label.

Well, I was looking at tmpfiles.d and it has a cool feature.

man tmpfiles.d

...

       Z

           Recursively set the access mode, group and user, and restore the SELinux security context of a file or directory if it exists, as well as of its subdirectories and the files contained therein (if applicable). Lines of this type accept shell-style globs in place of normal path names. Does not follow symlinks.

One hack you could try would be to add /run to the tmpfiles.d directory, and systemd will relabel all of the content in /run when the system reboots.

echo "Z /run - - - - -" > /etc/tmpfiles.d/relabelrun.conf

Of course, if the content gets created with the wrong label after systemd-tmpfiles runs, you are out of luck, unless you enable the old restorecond service...
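A tmpfiles.d line has seven whitespace-separated fields (Type, Path, Mode, User, Group, Age, Argument), and a plain "-" means "leave this one alone". The sketch below writes the rule into a scratch directory so it can be tried without root; the scratch directory is an assumption of this example, and on a real system the file belongs in /etc/tmpfiles.d.

```shell
# Scratch directory so this sketch runs unprivileged; the real location
# for the file is /etc/tmpfiles.d.
conf_dir=$(mktemp -d)
conf_file="$conf_dir/relabelrun.conf"

# Type=Z (recursive relabel), Path=/run, everything else left unset.
printf '%s\n' 'Z /run - - - - -' > "$conf_file"

cat "$conf_file"
awk '{ print NF " fields" }' "$conf_file"

# To apply the rule right away instead of waiting for a reboot (as root):
#   systemd-tmpfiles --create /etc/tmpfiles.d/relabelrun.conf
```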


SELinux team works to remove DAC_OVERRIDE Permissions.
danwalsh

DAC_OVERRIDE is one of the most powerful capabilities, and most app developers don't understand when they are taking advantage of it, or how easy it is to eliminate the need.

What is DAC_OVERRIDE?

man capabilities

...

       CAP_DAC_OVERRIDE

              Bypass file read, write, and execute permission checks.  (DAC is an abbreviation of "discretionary access control".)

Looking at /usr/include/linux/capability.h

#define CAP_DAC_OVERRIDE     1

/* Overrides all DAC restrictions regarding read and search on files and directories, including ACL restrictions if [_POSIX_ACL] is defined. Excluding DAC access covered by CAP_LINUX_IMMUTABLE. */

Giving a process this access means it can ignore file system permission checks. Admittedly everyone thinks root can do this by default anyways, but if you can eliminate this access from a system service, you really can tighten the security.  
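Since CAP_DAC_OVERRIDE is capability number 1, it shows up as bit 1 (value 2) in the hexadecimal capability masks the kernel reports, for example the CapEff line in /proc/self/status. Here is a small sketch for checking that bit; has_dac_override is a name invented here, and it only looks at the mask, it says nothing about what SELinux policy allows.

```shell
# Return success if the hex capability mask in $1 (no 0x prefix)
# has bit 1, CAP_DAC_OVERRIDE, set. Hypothetical helper for illustration.
has_dac_override() {
    [ $(( 0x$1 & (1 << 1) )) -ne 0 ]
}

# Effective capabilities of the current shell (Linux only).
mask=$(awk '/^CapEff:/ { print $2 }' /proc/self/status 2>/dev/null)
if [ -n "$mask" ] && has_dac_override "$mask"; then
    echo "this shell has CAP_DAC_OVERRIDE"
else
    echo "this shell does not have CAP_DAC_OVERRIDE (or CapEff was not readable)"
fi
```

Run as an ordinary user this normally reports that the capability is missing; run as root on the host it normally reports that it is present.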

SELinux

SELinux ignores DAC permissions; it does not care if a process is running as root or any other UID. The only part of SELinux that concerns itself with UID/GID permissions is in Linux capabilities like DAC_OVERRIDE.

With SELinux we often look at what process types require DAC_OVERRIDE and try to figure out if we can get rid of the access.


Customizing container types
danwalsh

In my previous blog, I talked about the container types container_t and svirt_lxc_net_t. Today I got an email asking about the new container_t type replacing svirt_lxc_net_t.

On 05/23/2018 11:50 PM, Dustin C. Hatch wrote:
I recently upgraded some of my Docker hosts to CentOS 7.5 and started getting "Permission Denied" errors inside of containers. I traced this down to any container that mounts and uses /etc/passwd from the host (so that UIDs inside the container map to the same username as on the host), because the SELinux policy in CentOS 7.5 does not allow the new container_t domain to read passwd_file_t.  
The old svirt_lxc_net_t domain had the nsswitch_domain attribute, while its replacement, container_t, does not. I cannot find any reference for this change, so I was wondering if it was deliberate or not. If it was deliberate, what would be the consequences if I were to make a local policy change to add that attribute back? If it was not deliberate, I would be happy to open a ticket in Bugzilla. 

First let's remove the misconception: container_t was not a new type replacing svirt_lxc_net_t, it was a rename (typealias) of the old type.


container_t versus svirt_lxc_net_t
danwalsh

For some reason recently I have been asked via email and twitter about what the difference is between the container_t type and the svirt_lxc_net_t type. Or similarly between container_file_t and svirt_sandbox_file_t.  Bottom line, NOTHING.  They are aliases of each other.

The SELinux policy language has a typealias command.

typealias container_t alias svirt_lxc_net_t;

typealias container_file_t alias svirt_sandbox_file_t;
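Because the names are interchangeable, old logs and notes that mention the svirt names can be rewritten to the container names without changing their meaning. A minimal sketch using sed; normalize_container_types is a helper name made up for this example.

```shell
# Rewrite the old alias names to the current type names in any text
# read on stdin. Hypothetical helper, for tidying up old AVC logs.
normalize_container_types() {
    sed -e 's/svirt_lxc_net_t/container_t/g' \
        -e 's/svirt_sandbox_file_t/container_file_t/g'
}

echo 'scontext=system_u:system_r:svirt_lxc_net_t:s0:c1,c2 tcontext=system_u:object_r:svirt_sandbox_file_t:s0' |
    normalize_container_types
```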

When I first started working on containers and SELinux prior to Docker, we were writing a tool called virt-sandbox that used libvirt to launch containers; specifically, it used libvirt-lxc. We had labeled all of the VMs launched by libvirt svirt_t. This stood for secure virt. When I decided to write policy for the libvirt_lxc containers, I created a type called svirt_lxc_t. This type was not allowed to do network access, so I added another type called svirt_lxc_net_t that had full network access. The type for content that the svirt_lxc types could manage was svirt_sandbox_file_t. (svirt_file_t was already used for virtual machine images.) Why I did not call it svirt_lxc_file_t, I don't know.


Share Certs Data into a container.
danwalsh

Last week, on the Fedora Users list, someone was asking a question about getting SELinux to work with a container. The mailer said that he was sharing certs into the container but SELinux was blocking access.

Here are the AVC's that were reported. 

Fri May 11 03:35:19 2018 type=AVC msg=audit(1526024119.640:1052): avc:  denied  { write } for   pid=13291 comm="touch" name="php-fpm.access" dev="dm-2" ino=20186094 scontext=system_u:system_r:container_t:s0:c581,c880 tcontext=system_u:object_r:user_home_t:s0 tclass=file permissive=0 

Looks like there is a container (container_t) that is attempting to write some content in your homedir (user_home_t).

I surmised that the mailer must have been volume mounting a directory from his homedir into the container.

I responded to him with:

Private to container:

If these certs are only going to be used within one container you should add a :Z to the volume mount. 

podman run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:Z fedora-app

Or if you are still using Docker.

docker run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:Z fedora-app

This causes the container runtime to relabel the volume with an SELinux label private to the container.

Shared with other Containers

If you want the container-Cert-Dir to be shared between multiple containers, and it can be shared read-only, I would add the :z,ro flags.

podman run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:z,ro fedora-app

Using Docker.

docker run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:z,ro fedora-app
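The difference between the two flags comes down to the MCS level on the relabeled files. Here is a sketch of the idea, assuming :Z gives the content the container's own level and :z gives it the shared level s0; the helper names are invented for illustration and this is not code from podman or Docker.

```shell
# Given a container process context, print the file label a private (:Z)
# relabel would be expected to use. Hypothetical helper.
private_volume_label() {
    level=${1#*:*:*:}    # strip user:role:type:, keep the MCS level
    printf 'system_u:object_r:container_file_t:%s\n' "$level"
}

# A shared (:z) relabel drops the categories so every container matches.
shared_volume_label() {
    printf 'system_u:object_r:container_file_t:s0\n'
}

private_volume_label 'system_u:system_r:container_t:s0:c581,c880'
shared_volume_label
```

With the private label only the container running at s0:c581,c880 can use the content, while the shared label is usable by any container_t process.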
