?

Log in

No account? Create an account

container_t versus svirt_lxc_net_t
danwalsh

For some reason recently I have been asked via email and twitter about what the difference is between the container_t type and the svirt_lxc_net_t type. Or similarly between container_file_t and svirt_sandbox_file_t.  Bottom line, NOTHING.  They are aliases of each other.

In SELinux policy language they have a typealias  command.

typealias container_t alias svirt_lxc_net_t;

typealias container_file_t alias svirt_sandbox_file_t;

When I first started working on containers and SELinux prior to Docker, we were writing a tool called virt-sandbox that used libvirt to launch containers, specifically it used libvirt-lxc.  We had labeled all of the VMs launched by libvirt, svirt_t.  This stood for secure virt.  When I decided to write policy for the libvirt_lxc containers, I created a type called svirt_lxc_t.  This type was not allowed to do network access, so I added another type called svirt_lxc_net_t that had full network access.  The type for content that he svirt_lxc types could manage as svirt_sandbox_file_t.  (svirt_file_t was already used for virtual machine images.)  Why I did not call it svirt_lxc_file_t, I don't know. 

Read more...Collapse )

Share Certs Data into a container.
danwalsh

Last week, on the Fedora Users list someone was asking a question about getting SElinux to work with a container.  The mailer said that he was sharing certs into the container but SELinux as blocking access.

Here are the AVC's that were reported. 

Fri May 11 03:35:19 2018 type=AVC msg=audit(1526024119.640:1052): avc:  denied  { write } for   pid=13291 comm="touch" name="php-fpm.access" dev="dm-2" ino=20186094 scontext=system_u:system_r:container_t:s0:c581,c880 tcontext=system_u:object_r:user_home_t:s0 tclass=file permissive=0 

Looks like there is a container (container_t) that is attempting to write some content in you homedir (user_home_t). 

I surmised that the mailer must have been volume mounting a directory from his homedir into the container.

I responded to him with:

Private to container:

If these certs are only going to be used within one container you should add a :Z to the volume mount. 

podman run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:Z fedora-app

Or if you are still using Docker.

docker run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:Z fedora-app

This causes  the container runtime to relabel the volume with a SELinux label private to the container.

Shared with other Containers

If you want the container-Cert-Dir to be shared between multiple containers, and it can be shared read/only I would add the :z,ro flags

podman run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:z,ro fedora-app

Using Docker.

docker run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:z,ro fedora-app

Read more...Collapse )

SELinux and Containers
danwalsh

Next week at the Red Hat summit, I have a short session to talk about SELinux and Containers.  I am constantly reminded in bugzilla about how great the combination is.  

It truly is like Peanut Butter and Jelly.  

Sadly, people are still surprised when it blocks access.  For example I got a bugzilla recently that talked about containers not working on Fedora.  The avc was

type=AVC msg=audit(1524873307.948:1814): avc:  denied  { connectto } for  pid=28746 comm="boinc" path=002F746D702F2E5831312D756E69782F5831 scontext=system_u:system_r:container_t:s0:c420,c759 tcontext=unconfined_u:unconfined_r:xserver_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=0

This AVC shows SELinux blocking the container process from connecting to the Xserver. We definitely do not want to allow containers to connect to the Xserver.  SELinux is doing precisely what it is designed to do.

Allowing a process to connect to the XServer would allow it to screen scrape all of you data on the desktop, it would also allow it to fool humans into typing passwords.  It would also allow it to grab all data in the cut and paste buffer. Especially things like passwords.

I can imagine that this works fine on other platforms with SELinux disabled. 

Read more...Collapse )

SELinux should and does BLOCK access to Docker socket
danwalsh

I get lots of bugs from people complaining about SELinux blocking access to the Docker socket.  For example https://bugzilla.redhat.com/show_bug.cgi?id=1557893

The aggravating thing is, this is exactly what we want SELinux to prevent.  If a container process got to the point of talking to the /var/run/docker.sock, you know this is a serious security issue.  Giving a container access to the Docker socket, means you are giving it full root on  your system.  

A couple of years ago, I wrote why giving docker.sock to non privileged users is a bad idea.

Now I am getting bug reports about allowing containers access to this socket.

Access to the docker.sock is the equivalent of sudo with NOPASSWD, without any logging. You are giving the process that talks to the socket, the ability to launch a process on the system as full root.

Usually people are doing this because they want the container to do benign operations, like list which containers are on the system, or look a the container logs.  But Docker does not have a nice RBAC system, you basically get full access or no access.  I choose to default to NO ACCESS.

If you need to give container full access to the system then run it as a --privileged container or disable SELinux separation for the container.

podman run --privileged ...
or
docker run --privileged ...

podman run --security-opt label:disable ...
or
docker run --security-opt label:disable ...

Run it privileged

Read more...Collapse )

Teaching an old dog new tricks
danwalsh

I have been working on SELinux for over 15 years.  I switched my primary job to working on containers several years ago, but one of the first things I did with containers was to add SELinux support.  Now all of the container projects I work on including CRI-O, Podman, Buildah as well as Docker, Moby, Rocket, runc, systemd-nspawn, lxc ... all have SELinux support.  I also maintain the container-selinux policy package which all of these container runtimes rely on.

Any ways container runtimes started adding the no-new-privileges capabilities a couple of years ago. 

no_new_privs 

The no_new_privs kernel feature works as follows:

  • Processes set no_new_privs bit in kernel that persists across fork, clone, & exec.
  • no_new_privs bit ensures process/children processes do not gain any additional privileges.
  • Process aren't allowed to unset no_new_privs bit once set.
  • no_new_privs processes are not allowed to change uid/gid or gain any other capabilities, even if the process executes setuid binaries or executables with file capability bits set.
  • no_new_privs prevents Linux Security Modules (LSMs) like SELinux from transitioning to process labels that have access not allowed to the current process. This means an SELinux process is only allowed to transition to a process type with less privileges.

Oops that last flag is a problem for containers and SELinux.  If I am running a command like

# podman run -ti --security-opt no-new-privileges fedora sh

Read more...Collapse )

Containers and MLS
danwalsh

I have just updated the container-selinux policy to support MLS (Multi Level Security).  

SELinux and Container technology have a long history together.  Some people imagine that containers started just a few years ago with the introduction of Docker, but the technology goes back a lot further then that. 

SELinux originally supported two type of Mandatory Access Control,  Type Enforcement and RBAC (Roles Based Access Control).  

Type Enforcement

Type Enforcement is SELinux main security measure.  Basically every process on the system gets a assigned a type (httpd_t, unconfined_t, container_t ...)  And every object (file, directory, socket, tcp port ...) gets assigned a type. (httpd_sys_content_t, user_home_t, httpd_port_t, container_file_t).  Then we write rules that define the access between process types and object types. Note: The type field is always the third field of the SELinus label.

allow container_t container_file_t:file { open read write }

Anything that is not allowed is denied.

RBAC (Roles Based Access Control)

Read more...Collapse )

Attributes make writing SELinux policy easier
danwalsh

Yesterday I received an email from someone who was attempting to write SELinux policy for a daemon process, "abcd", that he was being required to run on his systems.

"

...

 Here's the problem:  the ****** agent runs as root and has the ability to
make changes to system configuration files.  So deploying this agent
means handing over root-level control of all of my RHEL servers to
people I don't know and who have no accountability to my system owner or
authorizing official.  ..... really, really
uncomfortable about this and want to protect our system from
unauthorized changes inflicted by some panicking yabbo ....

This seems like a job for SELinux.  Our RHEL7 baseline has SELinux
enforcing (
AWESOME) already, so I figured we just need to write a custom policy
that defines a type for the ******* agent process (abcd_t) and confines
that so that it can only access the agent's own data directories.

We did that, and it's not working out.  Once we start the confined
agent, it spews such a massive volume of AVC denials that it threatens
to fill the log volume and the agent never actually starts.  There are
way too many errors for the usual 'read log/allow exception' workflow to
be viable.  My guess is the agent is trying to read a bunch of sensitive
system files and choking because SELinux won't let it.

Have you got any advice or ideas here?

...

Here is my response:

Read more...Collapse )

SELinux blocks loading kernel modules
danwalsh

The kernel has a feature where it will load certain kernel modules for a process, when certain syscalls are made.  For example, loading a kernel module when a process attempts to create a different network socket.  

I wrote a blog on https://medium.com/cri-o explaining how this is probably a bad idea from a containers perspective.  I don't want to allow container processes to trigger modifications of the kernel.  And potentially causing the kernel to load risky modules that could have vulnerabilities in them.  I say, let the Administrator or packagers decide what kernel modules need to be loaded and then make the containers live with what is provided for them.  Here is a link to the blog.

https://medium.com/cri-o/cri-o-has-builtin-selinux-support-6ff45b707cf0


Why all the DAC_READ_SEARCH AVC messages?
danwalsh

If you followed SELinux policy bugs being reported in bugzilla you might have noticed a spike in messages about random domains being denied DAC_READ_SEARCH.

Let's quickly look at what the DAC_READ_SEARCH capability is.  In Linux the power of "root" was broken down into 64 distinct capabilities.  Things like being able to load kernel modules or bind to ports less then 1024.  Well DAC_READ_SEARCH is one of these.

DAC stands for Discretionary Access Control, which is what most people understand as standard Linux permissions, Every process has owner/group.  All file system objects are  assigned owner, group and permission flags.  DAC_READ_SEARCH allows a privilege process to ignore parts of DAC for read and search.

man capabilities

...

       CAP_DAC_READ_SEARCH

              * Bypass file read permission checks and directory read and execute permission checks;

There is another CAPABILITY called DAC_OVERRIDE

       CAP_DAC_OVERRIDE

              Bypass file read, write, and execute permission checks.

As you can see DAC_OVERRIDE is more powerful then DAC_READ_SEARCH, in that it can write and execute content ignoring DAC rules, as opposed to just reading the content.

Well why did we suddenly see a spike in confined domains needing DAC_READ_SEARCH?

Read more...Collapse )

New version of buildah 0.2 released to Fedora.
danwalsh
New features and bugfixes in this release

Updated Commands
buildah run
     Add support for -- ending options parsing
     Add a way to disable PTY allocation
     Handle run without an explicit command correctly
Buildah build-using-dockerfile (bud)
    Ensure volume points get created, and with perms
buildah containers
     Add a -a/--all option - Lists containers not created by buildah.
buildah Add/Copy
     Support for glob syntax
buildah commit
     Add flag to remove containers on commit
buildah push
     Improve man page and help information
buildah export:
    Allows you to export a container image
buildah images:
    update commands
    Add JSON output option
buildah rmi
    update commands
buildah containers
     Add JSON output option

New Commands
buildah version
     Identify version information about the buildah command
buildah export
     Allows you to export a containers image

Updates
Buildah docs: clarify --runtime-flag of run command
Update to match newer storage and image-spec APIs
Update containers/storage and containers/image versions