unlabeled_t type
danwalsh

I often see bug reports or people showing AVC messages about confined domains not able to deal with unlabeled_t files.

type=AVC msg=audit(1530786314.091:639): avc:  denied  { read } for  pid=4698 comm="modprobe" name="modules.alias.bin" dev="dm-0" ino=9115100 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file

I just saw this AVC, which shows the openvswitch domain attempting to read a file, modules.alias.bin, with modprobe.   The usual response to this is to run restorecon on the files and everything should be fine.

But the next question I get is how did this content get the label unlabeled_t, and my response is usually I don't know, you did something.

Well lets look at how unlabeled_t files get created.

unlabeled_t really just means that the file on disk does not have an SELinux xattr indicating a file label.  Here are a few ways these files can get created

1 File was created by on a file system when the kernel was not running in SELinux mode.  If you take a system that was installed without SELinux (God forbid) or someone booted the machine with SELinux disabled, then all files created will not have labels.  This is why we force a relabel, anytime someone changes from SELinux disabled to SElinux enabled at boot time.

Read more...Collapse )

Fun with DAC_OVERRIDE and SELinux
danwalsh

Lately the SELinux team has been trying to remove as many SELinux Domain Types that have DAC_OVERRIDE.

man capabilities

...

       CAP_DAC_OVERRIDE

              Bypass file read, write, and execute permission checks.  (DAC is an abbreviation of "discretionary access control".)

This means a process with CAP_DAC_OVERRIDE can read any file on the system and can write any file on the system from a standard permissions point of view.  With SELinux it means that they can read all file types that SELinux allows them to read, even if they are running with a process UID that is not allowed to read the file.  Similar they are allowed to write all SELinux writable types even if they aren't allowed to write based on UID.  

Obviously most confined domains never need to have this access, but some how over the years lots of domains got added this access.  

I recently received and email asking about syslog, generating lots of AVC's.  The writer said that he understood SELinux and has set up the types for syslog to write to, and even the content was getting written properly.  But the Kernel was generating an AVC every time the service started.

Here is the AVC.

Jul 09 15:24:57

 audit[9346]: HOSTNAME AVC avc:  denied  { dac_override }  for  pid=9346 comm=72733A6D61696E20513A526567 capability=1   scontext=system_u:system_r:syslogd_t:s0  tcontext=system_u:system_r:syslogd_t:s0 tclass=capability permissive=0

Read more...Collapse )

Cool SELinux hack provide by systemd
danwalsh

Sometimes content is created in /run during boot that ends up mislabeled.  We sometimes here, every time I boot, this file gets created with the wrong label.   

This can happen if initramfs is creating content before systemd has loaded policy.  This means the content would get created with var_run_t as the label.

Well I was looking at tmpfs.d and it has a cool feature.

man tmpfs.d

...

       Z

           Recursively set the access mode, group and user, and restore the SELinux security context of a file or directory

           if it exists, as well as of its subdirectories and the files contained therein (if applicable). Lines of this type

           accept shell-style globs in place of normal path names. Does not follow symlinks.

One hack you could try, would be to add /run to the tmpfiles.d directory and systemd will relabel all of the content in /run when the system reboots.

echo "Z /run — — — — —" > /etc/tmpfiles.d/relabelrun.conf

Of course if the content gets created after the tmpfs runs with the wrong label, you are out of luck, or enabled the old service restorecond...


SELinux team works to remove DAC_OVERRIDE Permissions.
danwalsh

DAC_OVERRIDE is one of the most powerful capabilities, and most app developers don't understand when they are taking advantage of it, or how easy it is to eliminate the need.

What is DAC_OVERRIDE?

man capabilities

...

       CAP_DAC_OVERRIDE

              Bypass file read, write, and execute permission checks.  (DAC is an abbreviation of "discretionary access control".)

Looking at /usr/include/linux/capability.h

#define CAP_DAC_OVERRIDE     1

/* Overrides all DAC restrictions regarding read and search on files and directories, including ACL restrictions if [_POSIX_ACL] is defined. Excluding DAC access covered by CAP_LINUX_IMMUTABLE. */

Giving a process this access means it can ignore file system permission checks. Admittedly everyone thinks root can do this by default anyways, but if you can eliminate this access from a system service, you really can tighten the security.  

SELinux

SELinux ignores DAC permissions, it does not care if a a processes is running as root or any other UID.  The only part of SELinux that concerns itself with UID/GID permissions is in linux capabilities like DAC_OVERRIDE.

With SELinux we often look at what process types require DAC_OVERRIDE and try to figure out if we can rid of the access.  

Read more...Collapse )
Tags:

Customizing container types
danwalsh

In my previous blog, I talked about about container types container_t and svirt_lxc_net_t. Today I get an email, asking about the new container_t type replacing svirt_lxc_net_t.

On 05/23/2018 11:50 PM, Dustin C. Hatch wrote:
I recently upgraded some of my Docker hosts to CentOS 7.5 and started getting "Permission Denied" errors inside of containers. I traced this down to any container that mounts and uses /etc/passwd from the host (so that UIDs inside the container map to the same username as on the host), because the SELinux policy in CentOS 7.5 does not allow the new container_t domain to read passwd_file_t.  
The old svirt_lxc_net_t domain had the nsswitch_domain attribute, while its replacement, container_t, does not. I cannot find any reference for this change, so I was wondering if it was deliberate or not. If it was deliberate, what would be the consequences if I were to make a local policy change to add that attribute back? If it was not deliberate, I would be happy to open a ticket in Bugzilla. 

First let's remove the misconception, container_t was not a new type replacing svirt_lxc_net_t, it was a rename (typealias) of the old type.  

Read more...Collapse )

container_t versus svirt_lxc_net_t
danwalsh

For some reason recently I have been asked via email and twitter about what the difference is between the container_t type and the svirt_lxc_net_t type. Or similarly between container_file_t and svirt_sandbox_file_t.  Bottom line, NOTHING.  They are aliases of each other.

In SELinux policy language they have a typealias  command.

typealias container_t alias svirt_lxc_net_t;

typealias container_file_t alias svirt_sandbox_file_t;

When I first started working on containers and SELinux prior to Docker, we were writing a tool called virt-sandbox that used libvirt to launch containers, specifically it used libvirt-lxc.  We had labeled all of the VMs launched by libvirt, svirt_t.  This stood for secure virt.  When I decided to write policy for the libvirt_lxc containers, I created a type called svirt_lxc_t.  This type was not allowed to do network access, so I added another type called svirt_lxc_net_t that had full network access.  The type for content that he svirt_lxc types could manage as svirt_sandbox_file_t.  (svirt_file_t was already used for virtual machine images.)  Why I did not call it svirt_lxc_file_t, I don't know. 

Read more...Collapse )

Share Certs Data into a container.
danwalsh

Last week, on the Fedora Users list someone was asking a question about getting SElinux to work with a container.  The mailer said that he was sharing certs into the container but SELinux as blocking access.

Here are the AVC's that were reported. 

Fri May 11 03:35:19 2018 type=AVC msg=audit(1526024119.640:1052): avc:  denied  { write } for   pid=13291 comm="touch" name="php-fpm.access" dev="dm-2" ino=20186094 scontext=system_u:system_r:container_t:s0:c581,c880 tcontext=system_u:object_r:user_home_t:s0 tclass=file permissive=0 

Looks like there is a container (container_t) that is attempting to write some content in you homedir (user_home_t). 

I surmised that the mailer must have been volume mounting a directory from his homedir into the container.

I responded to him with:

Private to container:

If these certs are only going to be used within one container you should add a :Z to the volume mount. 

podman run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:Z fedora-app

Or if you are still using Docker.

docker run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:Z fedora-app

This causes  the container runtime to relabel the volume with a SELinux label private to the container.

Shared with other Containers

If you want the container-Cert-Dir to be shared between multiple containers, and it can be shared read/only I would add the :z,ro flags

podman run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:z,ro fedora-app

Using Docker.

docker run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:z,ro fedora-app

Read more...Collapse )

SELinux and Containers
danwalsh

Next week at the Red Hat summit, I have a short session to talk about SELinux and Containers.  I am constantly reminded in bugzilla about how great the combination is.  

It truly is like Peanut Butter and Jelly.  

Sadly, people are still surprised when it blocks access.  For example I got a bugzilla recently that talked about containers not working on Fedora.  The avc was

type=AVC msg=audit(1524873307.948:1814): avc:  denied  { connectto } for  pid=28746 comm="boinc" path=002F746D702F2E5831312D756E69782F5831 scontext=system_u:system_r:container_t:s0:c420,c759 tcontext=unconfined_u:unconfined_r:xserver_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=0

This AVC shows SELinux blocking the container process from connecting to the Xserver. We definitely do not want to allow containers to connect to the Xserver.  SELinux is doing precisely what it is designed to do.

Allowing a process to connect to the XServer would allow it to screen scrape all of you data on the desktop, it would also allow it to fool humans into typing passwords.  It would also allow it to grab all data in the cut and paste buffer. Especially things like passwords.

I can imagine that this works fine on other platforms with SELinux disabled. 

Read more...Collapse )

SELinux should and does BLOCK access to Docker socket
danwalsh

I get lots of bugs from people complaining about SELinux blocking access to the Docker socket.  For example https://bugzilla.redhat.com/show_bug.cgi?id=1557893

The aggravating thing is, this is exactly what we want SELinux to prevent.  If a container process got to the point of talking to the /var/run/docker.sock, you know this is a serious security issue.  Giving a container access to the Docker socket, means you are giving it full root on  your system.  

A couple of years ago, I wrote why giving docker.sock to non privileged users is a bad idea.

Now I am getting bug reports about allowing containers access to this socket.

Access to the docker.sock is the equivalent of sudo with NOPASSWD, without any logging. You are giving the process that talks to the socket, the ability to launch a process on the system as full root.

Usually people are doing this because they want the container to do benign operations, like list which containers are on the system, or look a the container logs.  But Docker does not have a nice RBAC system, you basically get full access or no access.  I choose to default to NO ACCESS.

If you need to give container full access to the system then run it as a --privileged container or disable SELinux separation for the container.

podman run --privileged ...
or
docker run --privileged ...

podman run --security-opt label:disable ...
or
docker run --security-opt label:disable ...

Run it privileged

Read more...Collapse )

Teaching an old dog new tricks
danwalsh

I have been working on SELinux for over 15 years.  I switched my primary job to working on containers several years ago, but one of the first things I did with containers was to add SELinux support.  Now all of the container projects I work on including CRI-O, Podman, Buildah as well as Docker, Moby, Rocket, runc, systemd-nspawn, lxc ... all have SELinux support.  I also maintain the container-selinux policy package which all of these container runtimes rely on.

Any ways container runtimes started adding the no-new-privileges capabilities a couple of years ago. 

no_new_privs 

The no_new_privs kernel feature works as follows:

  • Processes set no_new_privs bit in kernel that persists across fork, clone, & exec.
  • no_new_privs bit ensures process/children processes do not gain any additional privileges.
  • Process aren't allowed to unset no_new_privs bit once set.
  • no_new_privs processes are not allowed to change uid/gid or gain any other capabilities, even if the process executes setuid binaries or executables with file capability bits set.
  • no_new_privs prevents Linux Security Modules (LSMs) like SELinux from transitioning to process labels that have access not allowed to the current process. This means an SELinux process is only allowed to transition to a process type with less privileges.

Oops that last flag is a problem for containers and SELinux.  If I am running a command like

# podman run -ti --security-opt no-new-privileges fedora sh

Read more...Collapse )

?

Log in

No account? Create an account