What is this new unconfined_service_t type I see on Fedora 21 and RHEL7?
Everyone who has ever used SELinux knows that the unconfined_t domain is a process label that is not confined. But this is not the only unconfined domain on an SELinux system. It is actually the default domain of a user who logs onto a system. In a lot of ways we should have used the type unconfined_user_t rather than unconfined_t.

By default in an SELinux Targeted system there are lots of other unconfined domains.  We have these so that users can run programs/services without SELinux interfering if SELinux does not know about them. You can list the unconfined domains on your system using the following command.

seinfo -aunconfined_domain_type -x

In RHEL6 and older versions of Fedora, we used to run system services as initrc_t by default, unless someone had written a policy for them. initrc_t is an unconfined domain by default, unless you disabled the unconfined.pp module. Running unknown services as initrc_t allows administrators to run an application service, even if no policy has ever been written for it.

In RHEL6 we have these rules:

init_t @initrc_exec_t -> initrc_t
init_t @bin_t -> initrc_t

If an administrator added an executable service to /usr/sbin or /usr/bin, the init system would run the service as initrc_t.

We found this to be problematic, though. 

The problem was that we have lots of transition rules out of initrc_t.  If a program we did not know about was running as initrc_t and executed a program like rsync to copy data between servers, SELinux would transition the program to rsync_t and it would blow up.  SELinux mistakenly would think that rsync was set up in server mode, not client mode.  Other transition rules could also cause breakage. 

We decided we needed a new unconfined domain to run services with, that would have no transition rules.  We introduced the unconfined_service_t domain.  Now we have:

init_t @bin_t -> unconfined_service_t

A process running as unconfined_service_t is allowed to execute any confined program, but stays in the unconfined_service_t domain.  SELinux will not block any access. This means by default, if you install a service that does not have policy written for it, it should work without SELinux getting in the way.

Sometimes applications are installed in fairly random directories under /usr or /opt (or in Oracle's case /u01), which end up with the label usr_t. Therefore we added these transition rules to policy.

# sesearch -T -s init_t  | grep unconfined_service_t
type_transition init_t bin_t : process unconfined_service_t;
type_transition init_t usr_t : process unconfined_service_t;

You can see these rules in Fedora 21.
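To make the rule lookup concrete, here is a minimal sketch that reads type_transition rules in the format sesearch prints and reports which domain a service started by init_t would end up in. The helper name transition_target is hypothetical (it is not part of setools), and the sample rules are simply the two lines quoted above.

```shell
# Sketch: find the domain a type_transition rule selects.
# transition_target is a hypothetical helper, not part of setools.
transition_target() {
    # $1 = source domain, $2 = type of the executable; rules arrive on stdin
    awk -v src="$1" -v exe="$2" '
        $1 == "type_transition" && $2 == src && $3 == exe && $5 == "process" {
            sub(/;$/, "", $6)   # drop the trailing semicolon
            print $6
        }'
}

# Sample rules copied from the sesearch output quoted in this post.
rules='type_transition init_t bin_t : process unconfined_service_t;
type_transition init_t usr_t : process unconfined_service_t;'

# Which domain does init_t enter when it executes a usr_t file?
printf '%s\n' "$rules" | transition_target init_t usr_t
```

On a real system the rules could be piped in from sesearch -T -s init_t instead of the sample string.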

Bottom Line

Hopefully unconfined_service_t will make leaving SELinux enabled easier on systems that have to run third party services, and protect the other services that run on your system.

Thanks to Simon Sekidde and Miroslav Grepl for helping to write this blog.


Think before you just blindly audit2allow -M mydomain
Don't Allow Domains to write Base SELinux Types

A few years ago I wrote a blog and paper on the four causes of SELinux errors.

The two most common causes were labeling issues and "SELinux needs to know".

The easiest way to explain this is with an example: a daemon wants to write to a certain file and SELinux blocks it. In SELinux terms, the process domain (httpd_t) wants to write to the file type (var_lib_t) and is blocked. Users have potentially three ways of fixing this.

  1. Change the type of the file being written.

    • The object might be mislabeled and restorecon of the object fixes the issue

    • Change the label to httpd_var_lib_t using semanage and restorecon
        semanage fcontext -a -t httpd_var_lib_t '/var/lib/foobar(/.*)?'
        restorecon -R -v /var/lib/foobar

  2. There might be a boolean available to allow the Process Domain to write to the file type
      setsebool -P HTTP_BOOLEAN 1

  3. Modify policy using audit2allow
      grep httpd_t /var/log/audit/audit.log | audit2allow -M myhttp
      semodule -i myhttp.pp

Sadly the third option is the least recommended and the most often used. 

The problem is it requires no thought and gets SELinux to just shut up.

In RHEL7 and the latest Fedoras, the audit2allow tools will suggest a boolean when you run the AVCs through them. setroubleshoot has been doing this for years, and will even suggest potential types that you could change the destination object to use.

The thing we really want to stop is domains writing to BASE types.  If I allow a confined domain to write to a BASE type like etc_t or usr_t, then a hacked system can attack other domains, since almost all other domains need to read some etc_t or usr_t content.


One other feature we have added in RHEL7 and Fedora is a list of base types. SELinux has a mechanism for grouping types based on an attribute. We have two new attributes, base_ro_file_type and base_file_type. You can see the types associated with these attributes using the seinfo command.

seinfo -abase_ro_file_type -x

seinfo -abase_file_type -x

If you use audit2allow to add a rule to allow a domain to write to one of the base types:

Most likely you are WRONG

If you have a domain that is attempting to write to one of these base types, then you most likely need to change the type of the destination object using the semanage/restorecon commands mentioned above.
The difficult thing for users to figure out is: "What type should I change the object to?"

We have added new man pages that show you the types that your program is allowed to write.

man httpd_selinux

Look for the writable types.

If your domain httpd_t is attempting to write to var_lib_t then look for httpd_var_lib_t. "sepolicy gui" is a new gui tool to help you understand the types also.
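The httpd_t plus var_lib_t example follows a naming pattern that can be sketched in a few lines of shell. This is only a heuristic under the assumption that policy uses the domain prefix followed by the base type name; the helper suggest_type is hypothetical, and the candidate it prints still has to be checked against the writable types in the man page, since not every combination exists in policy.

```shell
# suggest_type is a hypothetical helper: guess a domain-specific type name
# from a domain and the base type it was denied on.
suggest_type() {
    # $1 = process domain (e.g. httpd_t), $2 = base file type (e.g. var_lib_t)
    domain_prefix=${1%_t}                 # strip the trailing _t from the domain
    printf '%s_%s\n' "$domain_prefix" "$2"
}

# The example from the text: httpd_t denied on var_lib_t.
suggest_type httpd_t var_lib_t
```

If the guessed type exists, it can then be applied with the semanage fcontext and restorecon commands shown earlier.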

Call to arms:
If an enterprising hacker wanted to write some code, it would be nice to build this knowledge into audit2allow. Master's thesis, anyone?

pam_mkhomedir versus SELinux -- Use pam_oddjob_mkhomedir
SELinux is all about separation of powers, minimal privileges or reasonable privileges.

If you can break a program into several separate applications, then you can use SELinux to control what each application is allowed to do. Then SELinux can prevent a hacked application from doing more than expected.

The pam stack was invented a long time ago to allow customizations of the login process.  One problem with the pam_stack is it allowed programmers to slowly hack it up to give the programs more and more access.  I have seen pam modules that do some crazy stuff.

Since we confine login applications with SELinux, we sometimes come in conflict with some of the more powerful pam modules.
We in the SELinux world want to control what login programs can do.  For example we want to stop login programs like sshd from reading/writing all content in your homedir.

Why is this important?

Over the years it has been shown that login programs have had bugs that led to information leakage without the user ever being able to log in to the system.

One use case of pam, was the requirement of creating a homedir, the first time a user logs into a system.  Usually colleges and universities use this for students logging into a shared service.  But many companies use it also.

man pam_mkhomedir
  The pam_mkhomedir PAM module will create a users home directory if it does not exist when the session begins. This allows    users to be present in central database (such as NIS, kerberos or LDAP) without using a distributed file system or pre-creating a large number of directories. The skeleton directory (usually /etc/skel/) is used to copy default files and also sets a umask for the creation.

This means with pam_mkhomedir, login programs have to be allowed to create/read/write all content in your homedir.  This means we would have to allow sshd or xdm to read the content even if the user was not able to login, meaning a bug in one of these apps could allow content to be read or modified without the attacker ever logging into the machine.

man pam_oddjob_mkhomedir
       The pam_oddjob_mkhomedir.so module checks if the user's home directory exists, and if it does not, it invokes the mkhomedirfor method of the com.redhat.oddjob_mkhomedir service for the PAM_USER if the module is running with superuser privileges. Otherwise, it invokes the mkmyhomedir method.
       The location of the skeleton directory and the default umask are determined by the configuration for the corresponding service in oddjobd-mkhomedir.conf, so they can not be specified as arguments to this module.
       If D-Bus has not been configured to allow the calling application to invoke these methods provided as part of the com.redhat.oddjob_mkhomedir interface of the / object provided by the com.redhat.oddjob_mkhomedir service, then oddjobd will not receive the request and an error will be returned by D-Bus.

Nalin Dahyabhai wrote pam_oddjob_mkhomedir many years ago to separate out the ability to create a home directory and all of the content from the login programs.  Basically the pam module sends a dbus signal to a dbus service oddjob, which launches a tool to create the homedir and its content.  SELinux policy is written to allow this application to succeed.   We end up with much less access required for the login programs.

If you want the home directory created at login time when it does not exist, use pam_oddjob_mkhomedir instead of pam_mkhomedir.

DAC_READ_SEARCH/DAC_OVERRIDE - common SELinux issue that people handle badly.
MYTH: ROOT is all powerful.

"Root is all powerful" is a common misconception among administrators and users of Unix/Linux systems. Many years ago the Linux kernel broke the power of root down into a series of capabilities. Originally the kernel had room for 32 capabilities, but recently that grew to 64. Capabilities allow programmers to code an application in such a way that the ping command can create raw IP sockets, or httpd can bind to a port less than 1024, and then drop all of the other capabilities of root.

SELinux also controls access to all of the capabilities for a process. A common bugzilla is for a process requiring the DAC_READ_SEARCH or DAC_OVERRIDE capability. DAC stands for Discretionary Access Control, which means the standard Linux ownership/permission flags. Let's look at the power of these capabilities.

more /usr/include/linux/capability.h
/* Override all DAC access, including ACL execute access if
   [_POSIX_ACL] is defined. Excluding DAC access covered by
   CAP_LINUX_IMMUTABLE. */

#define CAP_DAC_OVERRIDE     1

/* Overrides all DAC restrictions regarding read and search on files
   and directories, including ACL restrictions if [_POSIX_ACL] is
   defined. Excluding DAC access covered by CAP_LINUX_IMMUTABLE. */

#define CAP_DAC_READ_SEARCH  2

If you read the descriptions, these basically say that a process running as UID=0 with DAC_READ_SEARCH can read any file on the system, even if the permission flags would not allow a root process to read it. Similarly, DAC_OVERRIDE means the process can ignore the permissions/ownership of all files on the system. Usually when I see AVC messages that require this access, I take a look at the process UID, and almost always I see the process is running as uid=0.

What users often do when they see this access denial is to add the permissions, which is almost always wrong. These AVCs indicate to me that you have permission flags that are too tight on a file, usually a config file.

Imagine the httpd process needs to read /var/lib/myweb/content, which is owned by the apache user and has permissions 600 set on it.

 ls -l /var/lib/myweb/content
-rw-------. 1 apache apache 0 May 12 13:50 /var/lib/myweb/content

If for some reason the httpd process needs to read this file while it is running as UID=0, the system will deny access and generate a DAC_* AVC.  A simple fix would be to change the permission on the file to be 644.

# chmod 644 /var/lib/myweb/content
# ls -l /var/lib/myweb/content
-rw-r--r--. 1 apache apache 0 May 12 13:50 /var/lib/myweb/content

This would now allow a root process to read the file using the "other" permissions.

Another option would be to change the group to root and change the permissions to 640.

# chmod 640 /var/lib/myweb/content
# chgrp root /var/lib/myweb/content
# ls -l /var/lib/myweb/content
-rw-r-----. 1 apache root 0 May 12 13:50 /var/lib/myweb/content

Now root can read the file based on the group permissions, but others can not read it. You could also use ACLs to provide access. Bottom line: this is probably not an SELinux issue, and not something you want to loosen SELinux security around.
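The three permission settings above can be checked with a small model of the DAC read decision. This is a simplified sketch: it assumes a process with no DAC_* capability, and it ignores supplementary groups and ACLs. The helper dac_can_read is hypothetical, and uid/gid 48 for apache is an assumption made for the example.

```shell
# dac_can_read is a hypothetical helper modelling the owner/group/other
# read check for a process that has no DAC_OVERRIDE or DAC_READ_SEARCH.
dac_can_read() {
    # $1 = file owner uid, $2 = file group gid, $3 = octal mode (e.g. 640)
    # $4 = process uid,    $5 = process gid
    if [ "$4" -eq "$1" ]; then
        bits=$(( (0$3 >> 6) & 7 ))    # owner permission bits
    elif [ "$5" -eq "$2" ]; then
        bits=$(( (0$3 >> 3) & 7 ))    # group permission bits
    else
        bits=$(( 0$3 & 7 ))           # other permission bits
    fi
    [ $(( bits & 4 )) -ne 0 ]         # is the read bit set?
}

# Assumed ids: apache is uid/gid 48, root is uid/gid 0.
for case in "48 48 600" "48 48 644" "48 0 640"; do
    # $case is left unquoted on purpose so it splits into three arguments
    if dac_can_read $case 0 0; then
        echo "owner/group/mode $case: root can read without DAC_* capabilities"
    else
        echo "owner/group/mode $case: root needs DAC_READ_SEARCH or DAC_OVERRIDE"
    fi
done
```

Only the first case, mode 600 owned by apache, forces a root process to fall back on a DAC_* capability, which is the situation that produces the AVC.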

One problem with SELinux here is that, by default, the capabilities AVC message does not tell you which object on the file system blocked the access. The reason for this is performance, as I explained in a previous blog.

Why doesn't SELinux give me the full path in an error message?

If you turn on full auditing and regenerate the AVC, you will get the path of the object with the bad DAC Controls, as I explained in the blog.

Writing custom policy for an Apache Application.
Received the following email this week:

I've a PHP application that sends data to a USB tty device e.g. /dev/usbDataCollector

Unfortunately selinux is blocking this action. When set to permissive, the alert browser suggests the command:

setsebool -P daemons_use_tty 1

The documentation says Allow all daemons the ability to use unallocated ttys. This naturally doesn't sound like a good idea although admittedly it probably won't hurt in this particular installation. However, I thought it would be good to find the 'correct' solution to this.

But I am unable to find a more fine grain SELinux control for this, Fedora 20 has no documentation and the only vaguely relevant one I could find elsewhere is httpd_tty_com which appears unrelated as it is about allow httpd to communicate with terminal.

So the question is whether there is any way to do this or is allowing all daemons the only option?

My Answer

Simplest would be to just use

# grep usbDataCollector /var/log/audit/audit.log | audit2allow -M myhttp
# semodule -i myhttp.pp

This would allow all httpd_t processes to use usb_device_t. Of course, if you had other usb_device_t devices on your system, your apache processes would also gain access to them.

Tighter Controls

SELinux is a labeling system, so you can manipulate the labels on the system to get tighter controls.

If you really wanted to tighten it up, you could build a custom policy that put a different label on /dev/usbDataCollector and allow httpd_t processes access to this label.

Something like

# cat myhttpd.te
policy_module(myhttpd, 1.0)

gen_require(`
    type httpd_t;
')

type httpd_device_t;
dev_node(httpd_device_t)

allow httpd_t httpd_device_t:chr_file rw_chr_file_perms;

Note that I am creating a new type, httpd_device_t, and I define it as a device node, which gives it attributes to allow domains that manage devices to manage this new device. Then I allow the apache process type, httpd_t, to read/write chr_files with this label.

# cat myhttpd.fc
/dev/usbDataCollector -c gen_context(system_u:object_r:httpd_device_t,s0)

I also want to put the label on the device automatically so I add a file context file with the /dev/usbDataCollector labeled as httpd_device_t.

# make -f /usr/share/selinux/devel/Makefile
# semodule -i myhttpd.pp
# restorecon -v /dev/usbDataCollector

Finally I compile the myhttpd.te and myhttpd.fc files into a myhttpd.pp policy package, and install it on the system. Since the device is probably already created, I need to run restorecon on it to fix the label. udev will set the label automatically on the next reboot.

Now httpd_t processes would only be able to use the /dev/usbDataCollector chr_file and not other usb devices on the system.

DAC check before MAC check. SELinux will stop wine'ing.
When it comes to SELinux, one of the most aggravating bugs we see is when the kernel does a MAC check before a DAC check.

This means SELinux checks happen before normal ownership/permission checks.  I always prefer to have the DAC check happen first.  This is important because code that is attempting the denied access usually will handle the EPERM silently and go down a different code path.    But if a MAC Failure happens, SELinux writes an AVC to the audit log, and setroubleshoot reports it to the user.

One of the biggest offenders of this was the mmap_zero check.  Every time a process tries to map low kernel memory, the kernel denies it, in both DAC and MAC.  Wine applications are notorious for this.  We block mmap_zero because it can potentially trigger kernel bugs which can lead to privilege escalation.

Eric Paris explains the vulnerability here.

The wine applications tend to work correctly anyway. When the wine application attempts to mmap low memory, it gets denied, and then reattempts the mmap with a higher memory value. But since the MAC check was done before the DAC check, on an SELinux system the kernel generates an AVC. The user sees something like:

SELinux is preventing /usr/bin/wine-preloader from 'mmap_zero' accesses on the memprotect.

Reading about mmap_zero scares users, and they think their machine is vulnerable. The only thing SELinux policy writers can do is write a dontaudit rule or allow the access, which defeats the purpose of the check.

We still want to block this access if a privileged confined process attempts it, and report the SELinux violation. If a confined application running as root attempts a mmap_zero access, SELinux should block it and report the AVC. If a normal unprivileged process triggered the access check, we would prefer to let DAC handle it, and not print the message.

To give you an idea of how often people have seen this: Google "SELinux mmap_zero" and you will get more than 13,000 hits.

Today the upstream kernel has been fixed to perform the MAC check for mmap_zero AFTER the DAC check.

Thanks to Eric Paris and Paul Moore for fixing this issue.

SELinux Transitions do not happen on mountpoints mounted with nosuid.
Today one of our customers was trying to run openshift enterprise and it was blowing up because of SELinux.
Openshift sets up the Apache daemon to run /var/www/openshift/broker/script/broker_ruby.

When I looked at the log, it stated that Apache was not allowed to execute broker_ruby: permission denied.

ls -lZ /var/www/openshift/broker/script/broker_ruby
Shows that broker_ruby is labeled as httpd_sys_content_t

I went and looked at the policy, and I saw:

sesearch -A -s httpd_t -t httpd_sys_content_t -p execute -C
DT allow httpd_t httpdcontent : file { ioctl read write create getattr setattr lock append unlink link rename execute open } ; [ httpd_enable_cgi httpd_unified && httpd_builtin_scripting && ]

This shows that the httpd_t (Apache) process is allowed to execute the broker_ruby script if all of the following booleans are enabled.
httpd_enable_cgi, httpd_unified, httpd_builtin_scripting

Turns out they were. I then went back and looked at the AVC.

type=AVC msg=audit(28/02/14 13:56:52.702:24992) : avc:  denied  { execute_no_trans } for  pid=6031 comm=PassengerHelper path=/var/www/openshift/broker/script/broker_ruby dev=dm-3 ino=817 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=unconfined_u:object_r:httpd_sys_content_t:s0 tclass=file

This AVC means that the Apache daemon (httpd_t) is not allowed to execute the broker_ruby application (httpd_sys_content_t) without a transition, meaning in the current label (httpd_t).
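Reading an AVC like the one above comes down to pulling out the denied permission, the source type, and the target type. A minimal sketch of that, using only tr, sed, and cut, could look like the following. The helper names avc_field, avc_type, and avc_perms are hypothetical, and the parsing assumes the space-separated key=value layout shown in the message above.

```shell
# Hypothetical helpers for picking apart a single AVC line.
avc_field() {
    # $1 = key (scontext, tcontext, tclass, comm, ...), $2 = AVC line
    printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1=//p"
}
avc_type() {
    # A context is user:role:type:level; the type is the third field.
    printf '%s\n' "$1" | cut -d: -f3
}
avc_perms() {
    # The denied permissions sit between "{ " and " }".
    printf '%s\n' "$1" | sed -n 's/.*{ \(.*\) }.*/\1/p'
}

# The AVC quoted in this post.
avc='type=AVC msg=audit(28/02/14 13:56:52.702:24992) : avc:  denied  { execute_no_trans } for  pid=6031 comm=PassengerHelper path=/var/www/openshift/broker/script/broker_ruby dev=dm-3 ino=817 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=unconfined_u:object_r:httpd_sys_content_t:s0 tclass=file'

src=$(avc_type "$(avc_field scontext "$avc")")
tgt=$(avc_type "$(avc_field tcontext "$avc")")
echo "$src was denied { $(avc_perms "$avc") } on $tgt:$(avc_field tclass "$avc")"
```

The source and target types extracted this way are the values to feed into the sesearch -s and -t options.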

Which I understood, since when the above booleans are turned on httpd_t is supposed to transition to httpd_sys_script_t when executing httpd_sys_content_t.  This sesearch command shows the transition rule.

sesearch -T -s httpd_t -t httpd_sys_content_t -c process -C
DT type_transition httpd_t httpd_sys_content_t : process httpd_sys_script_t; [ httpd_enable_cgi httpd_unified && httpd_builtin_scripting && ]

Why wasn't the process transitioning?

Then I remembered that SELinux transitions do not happen on mounted partitions that are mounted with the nosuid flag.

man mount
       nosuid Do not allow set-user-identifier or set-group-identifier bits to take effect. (This seems safe, but is in fact rather  unsafe  if  you have suidperl(1) installed.)

SELinux designers feel that a transition can be a potential privilege escalation similar to a suid root application.  Therefore if an administrator has told the system that no suid apps should be allowed on a mount point, then it also means no SELinux transitions will happen.

Removing the nosuid flag from the mount point fixes the problem.
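A quick way to spot this situation is to look for nosuid in the mount options of the file system holding the script. The sketch below only does the string check. The helper has_nosuid is hypothetical, and the option strings are made up for illustration.

```shell
# has_nosuid is a hypothetical helper: succeed if a comma-separated
# mount option string contains the nosuid flag.
has_nosuid() {
    case ",$1," in
        *,nosuid,*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Example option strings (made up for illustration).
for opts in "rw,relatime,seclabel" "rw,nosuid,nodev,seclabel"; do
    if has_nosuid "$opts"; then
        echo "$opts: nosuid set, SELinux transitions will not happen"
    else
        echo "$opts: no nosuid flag, transitions are possible"
    fi
done
```

On a real system the option string could come from findmnt -n -o OPTIONS --target /var/www/openshift/broker/script, assuming findmnt from util-linux is installed.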

Containers your time is now. Lets look at Namespaces.
Lately I have been spending a lot of time working on Containers.  Containers are a mechanism for controlling what a process does on a system.

Resource constraints can be considered a form of containment.

In Fedora and RHEL we use cgroups for this, and with the new systemd controls in Fedora and RHEL7, managing cgroups has gotten a lot easier. Out of the box all of your processes are put into a cgroup based on whether they belong to a user, a system service or a machine (VM). These processes are grouped at the unit level, meaning two users logged into a system will each get a "fair share" of the system, even if one user forks off thousands of processes. Similarly, if you run an httpd service and a mariadb service, they each get an equal share of the system. If httpd forks 1000 processes while mariadb only runs three, the 1000 httpd processes can not dominate the machine, leaving no memory or cpu for mariadb. Of course you can go into the unit files for httpd or mariadb and add a couple of simple resource constraints to further limit them.


For example, adding

MemoryLimit=500M

to the [Service] section of the httpd.service unit file will limit the httpd processes to 500 megabytes of memory.

Security Containment

Some could say I have been working on containers for years since SELinux is a container technology for controlling what a process does on the system.  I will talk about SELinux and advanced containers in my next blog.

Process Separation Containment

The last component of containers is namespaces. The Linux kernel implements a few namespaces for process separation. There are currently 6 namespaces.

Namespaces can be used to isolate processes. They can create a new environment where changes to the process are not reflected in other namespaces.
Once set up, namespaces are transparent for processes.

Red Hat Enterprise Linux and Fedora currently support 5 namespaces:

  • ipc

    • The ipc namespace allows you to share memory and semaphores only with processes within the namespace.

  • pid

    • The pid namespace eliminates the view of other processes on the system and restarts pids at pid 1.

  • mnt

    • The mnt namespace allows processes within the container to mount file systems over existing files/directories without affecting file systems outside the namespace.

  • net

    • The net namespace creates network devices that can have IP addresses assigned to them, and can even have their own iptables rules and routing tables.

  • uts

    • The uts namespace allows you to assign a different hostname to processes within the container. It is often useful with the network namespace.

Rawhide also supports the user namespace. We hope to add user namespace support to a future Red Hat Enterprise Linux 7 release.

User namespace allows you to map real user ids on the host to container uids.  For example you can map UID 5000-5100 to 0-100 within the container.  This means you could have uid=0 with rights to manipulate other namespaces within the container.  You could for example set the IP Address on the network namespaced ethernet device.  Outside of the container your process would be treated as a non privileged process.  User namespace is fairly young and people are just starting to use it.
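The UID 5000-5100 to 0-100 example is plain offset arithmetic, which the sketch below works through. It assumes the three-number mapping the kernel exposes in /proc/PID/uid_map (first id inside the namespace, first id outside, and the length of the range). The helper host_uid is hypothetical. Note that mapping 0-100 inclusive takes a range length of 101.

```shell
# host_uid is a hypothetical helper: translate a uid inside a user
# namespace to the uid the host sees, given one uid_map style range.
host_uid() {
    # $1 = first uid inside, $2 = first uid outside, $3 = range length
    # $4 = uid inside the container to translate
    if [ "$4" -ge "$1" ] && [ "$4" -lt $(( $1 + $3 )) ]; then
        echo $(( $2 + $4 - $1 ))
    else
        return 1    # uid is not covered by this mapping
    fi
}

# The example from the text: container uids 0-100 map to host uids 5000-5100.
echo "container uid 0   -> host uid $(host_uid 0 5000 101 0)"
echo "container uid 100 -> host uid $(host_uid 0 5000 101 100)"
host_uid 0 5000 101 101 || echo "container uid 101 is not mapped"
```

This is why uid=0 inside the container is treated as an unprivileged user (here uid 5000) on the host.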

I have put together a video showing namespaces in Red Hat Enterprise Linux 7.

file_t we hardly knew you...
file_t disappeared as a file type in Rawhide today.  It is one of the oldest types in SELinux policy.  It has been aliased to unlabeled_t.

Why did we remove it?

Let's look at the comments written in the policy source to describe file_t.

# file_t is the default type of a file that has not yet been
# assigned an extended attribute (EA) value (when using a filesystem
# that supports EAs).

Now let's look at the description of unlabeled_t.

# unlabeled_t is the type of unlabeled objects.
# Objects that have no known labeling information or that
# have labels that are no longer valid are treated as having this type.

Notice the conflict.

If a file object does not have a label assigned to it, then it would be labeled unlabeled_t. Unless it is on a file system that supports extended attributes, in which case it would be file_t?

I always hated explaining this, and we have finally removed the conflict for future Fedoras. Sadly this change has not been made in RHEL7 or any older RHELs or Fedoras.

We also added a type alias so that existing references to file_t now resolve to unlabeled_t.

Note: SEAndroid made this change when its policy was first being written.

One other conflict I would like to fix is that a file with a label that the kernel does not understand is labeled unlabeled_t (i.e., it has a label, but the label is invalid). I have argued for having the kernel differentiate the two situations.

  • No label -> unlabeled_t

  • Invalid Label -> invalid_t.

Upstream has pointed out that from a practical/security point of view you really need to treat them both as the same thing. Confined domains are not allowed to use unlabeled_t objects, and if it is a file system object you should run restorecon on it to put a legitimate label on the object. I will probably not get this change, but I can always hope.

