Dan Walsh's Blog

Got SELinux?

SELinux & PaaS: Deep Dive on Multi-tenancy, Containers & Security with Dan Walsh, Red Hat

SELinux & PaaS: Deep Dive on Multi-tenancy, Containers & Security with Dan Walsh, Red Hat

Last week I went to Portland, OR for the OpenShift Origin Day.
I gave a talk about SELinux and OpenShift.

The talk covered  the importance of MAC and container/namespaces when using a multi-tenant environment like OpenShift.

The talk also covers enhancements I want to make to OpenShift gears (containers) and additional features that we will be adding.

The video has been posted to youtube.

What is the differences between user_home_dir_t and user_home_t
I just saw this email on the SELinux Fedora Mailing list and figured I would try to answer it here.

Lets look at the labels of content in my  home directory.

> ls -lZd / /home /home/dwalsh /home/dwalsh/.ssh /home/dwalsh/.emacs /home/dwalsh/public_html
dr-xr-xr-x. root   root   system_u:object_r:root_t:s0      /
drwxr-xr-x. root   root   system_u:object_r:home_root_t:s0 /home
drwx------. dwalsh dwalsh unconfined_u:object_r:user_home_dir_t:s0 /home/dwalsh
-rw-r--r--. dwalsh dwalsh unconfined_u:object_r:user_home_t:s0 /home/dwalsh/.emacs
drwxr-xr-x. dwalsh dwalsh unconfined_u:object_r:httpd_user_content_t:s0 /home/dwalsh/public_html
drwx------. dwalsh dwalsh unconfined_u:object_r:ssh_home_t:s0 /home/dwalsh/.ssh

As we go through the different files and directories that make up my home directory we see multiple SELinux labels.  Notice the files that are owned by me have an SELinux user of staff_u.  On most systems these would be labeled unconfined_u.  In SELinux every file that is created by a process running as unconfined_u will get the unconfined_u SELinux user placed on the file label.  Similarly staff_u will get staff_u, ...

This component is ignored on most SELinux systems, system_u on /home here is the default label set by restorecon or at install time.

object_r is just a place holder.  For all SELinux systems other then some experimental systems, every object on the file system gets labeled object_r.

The last field is all s0.  This is the MCS or MLS label depending on your policy.  On most systems this will be s0 (SystemLow)

SELinux is a type enforcement system, and the interesting parts is the types.  Lets look at each directory/file above and the types associated with them.

  • / is labeled with the root_t type.  This should be the only object on the system with the root_t type, if you have other objects on the computer with this label, then you were probably running in permissive mode and a confined process created it, you need to fix the labels.  The main purpose of root_t is for policy writers to define transitions. for system applications that are going to create content in /.  For example boot flags will get created with etc_runtime_t.  Random directories created in / will get labeled default_t.

#  touch /.autorelabel; ls -lZ /.autorelabel
-rw-r--r--. root root unconfined_u:object_r:etc_runtime_t:s0 /.autorelabel
# mkdir /foobar; ls -ldZ /foobar
drwxr-xr-x. root root unconfined_u:object_r:default_t:s0 /foobar

         One thing to note about default_t is that NO confined apps are allowed to read content labeled default_t, because we have no idea what kind of content would be in this directory.  Invariably when some one creates a new directory in / without setting the labels, SELinux will have issues.  If foobar above contained apache content you would need to execute the following in order to allow apache to read the content.

# semanage fcontext -a -t httpd_sys_content_t '/foobar(/.*)?'
# restorecon -R -v /foobar

  • /home is labeled with the home_root_t type.  home_root_t is basically used for the toplevel or root directory for homedirs.   It's main purpose is to be used by policy for applications that need to create home directories or file system quota files.

# sesearch -T -t home_root_t
Found 10 semantic te rules:
   type_transition quota_t home_root_t : file quota_db_t;
   type_transition sysadm_t home_root_t : dir user_home_dir_t;
   type_transition firstboot_t home_root_t : dir user_home_dir_t;
   type_transition useradd_t home_root_t : dir user_home_dir_t;
   type_transition lsassd_t home_root_t : dir user_home_dir_t; 
   type_transition oddjob_mkhomedir_t home_root_t : dir user_home_dir_t;
   type_transition smbd_t home_root_t : dir user_home_dir_t;
   type_transition automount_t home_root_t : dir automount_tmp_t;

  • /home/dwalsh is labeled with the user_home_dir_t type.    Policy writers use this type to write transition rules to get content labeled correctly within the homedir.  Confined applications that create content in a users home directory need transition rules to create the content with the proper label. For example if you use ssh-copy-id on the remote machine, this triggers sshd to create the /home/dwalsh/.ssh directory, which we want labeled ssh_home_t, to protect the content.

# sesearch -T -s sshd_t -t user_home_dir_t | grep "\.ssh"
type_transition sshd_t user_home_dir_t : dir ssh_home_t ".ssh";

sesearch -T -s staff_t -t user_home_dir_t |wc -l

Note there are over 118 transition rules for a confined user creating content in his home directory.  With the advent of file name transition rules, we have exploded the number of these transitions in order to get proper labeling in the users home dir.

  • /home/dwalsh/.emacs is labeled user_home_t, which is the default label for all content in a users home directory.  This is the label that users are allowed to read/write and manage.  We try to prevent confined applications from being able to read and write this content, since this is where users store private content like credit card data, passwords shared secrets etc.  When we have a confined application that needs to access this content we usually wrap it in a boolean like ftp_home_dir.  Or we create a new type for this content, like mozilla_home_t or ssh_home_t.

  • /home/dwalsh/.ssh is labeled ssh_home_t, and contains content that only user types and sshd is allowed to read. Most other confined domains are not allowed to view content in this directory since it contains content like your secret keys.

  • /home/dwalsh/public_html is labeled httpd_user_content_t , which by default is the only place apache process is allowed to read in the home directory.  Allowing apache to read .ssh or say your .mozilla directory is just asking for trouble.

Note:  In order for apache or sshd to read their content they need to be allowed to search and getattr on every directory in the path.  The httpd would need to search root_t, home_root_t, user_home_dir_t, httpd_user_content_t.  It is not allowed to do this by default and needs the httpd_enable_homedirs boolean turned on.

# sesearch -A -b httpd_enable_homedirs -c dir -C | grep -v nfs | grep -v samba
Found 20 semantic av rules:
DT allow httpd_sys_script_t home_root_t : dir { getattr search open } ; [ httpd_enable_homedirs ]
DT allow httpd_sys_script_t user_home_dir_t : dir { getattr search open } ; [ httpd_enable_homedirs ]
DT allow httpd_user_script_t user_home_type : dir { getattr search open } ; [ httpd_enable_homedirs ]
DT allow httpd_t user_home_type : dir { getattr search open } ; [ httpd_enable_homedirs ]
DT allow httpd_user_script_t home_root_t : dir { ioctl read getattr lock search open } ; [ httpd_enable_homedirs ]
DT allow httpd_t home_root_t : dir { ioctl read getattr lock search open } ; [ httpd_enable_homedirs ]
DT allow httpd_user_script_t user_home_dir_t : dir { getattr search open } ; [ httpd_enable_homedirs ]
DT allow httpd_t user_home_dir_t : dir { getattr search open } ; [ httpd_enable_homedirs ]
DT allow httpd_suexec_t user_home_type : dir { getattr search open } ; [ httpd_enable_homedirs ]
DT allow httpd_suexec_t home_root_t : dir { ioctl read getattr lock search open } ; [ httpd_enable_homedirs ]
DT allow httpd_suexec_t user_home_dir_t : dir { getattr search open } ; [ httpd_enable_homedirs ]

Getting back to the question that drove me to create this blog.

What is the differences between user_home_dir_t  and user_home_t?

user_home_dir_t should only be the label of the users home directory but not of any content in the users home directory.
user_home_t is the label for generic content in a users homedir.

If you want to see all the labels that effect the users homedir, you can look at /etc/selinux/targeted/contexts/files/file_contexts.homedirs.

grep /home /etc/selinux/targeted/contexts/files/file_contexts.homedirs |wc -l

Understanding MCS Separation.
We designed a new way of using MCS, Multi Category Security, when we developed sVirt (Secure Virtualization) a few years ago.

To understand MCS it is helpful to understand something about MLS.

When Red Hat Enterprise Linux 5 came out, we had a done a lot of work adding MLS, Multi Level Security.  MLS requires the Belle & La Padula model of security.  This model takes into account the "Level and Category" of the data, and adds rules like a higher level process (TopSecret) is not allowed to write down (Secret) and a lower level (Secret) process is not allowed to read a higher level data (TopSecret).  The simple description of categories would be in order to read or write your process must have all the categories of the target.    Levels can go from s0 (SystemLow) to s15 (SystemHigh).  Then you can have any combination of 1024 categories.

In MCS we stole the idea of categories and dropped the concept of levels.

As I mentioned above the process category most include ALL of the categories of the target, otherwise MCS separation would block the access.   It is best to see an example of this.

Say I have a process labeled system_u:system_r:svirt_t:s0:c1,c2 from MCS Separation point of view.  This process would be allowed to access any file with any of these 4 MCS labels.

  • s0

  • s0:c1

  • s0:c2

  • s0:c1,c2

Since most files on a targeted system have the MCS label s0.  MCS Labeling does not block much mcs constrained applications.

By convention MCS confinement always uses two categories, and the categories can not be the same.  s0:c1,c1 == s0:c1.  Secondarily tools that launch MCS separate apps like libvirt and sandbox, make sure they never launch two processes with the same MCS label.  Which means we guarantee that two svirt_t always have two MCS labels that no other svirt_t or svirt_image_t would have.  Therefore a process running svirt_t:s0:c1,c2 would not be prevented by  MCS from access to any file with a MCS label of s0 or s0:c1,c2.  It would be denied from reading any file labeled s0:c1,c3 or s0:c2,c4 or s0:c4,c6.

But looking at MCS separation as the only control, misses out on type enforcement controls.

svirt_t access

We define our virtual machines as svirt_t type, and then we define rules about which types an svirt_t type is allowed access to.  If I wanted to see which types svirt_t is allowed to write, I could execute the following command.

# sesearch -A -s svirt_t -c file -p write -C | grep open | grep -v ^D
   allow virt_domain svirt_tmp_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;
   allow svirt_t xen_image_t : file { ioctl read write getattr lock append open } ;
   allow virt_domain svirt_image_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;
   allow virt_domain svirt_tmpfs_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;
   allow virt_domain virt_cache_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;
   allow virt_domain qemu_var_run_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;
   allow virt_domain anon_inodefs_t : file { ioctl read write getattr lock append open } ;
   allow virt_domain svirt_home_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ;
   allow svirt_t svirt_t : file { ioctl read write getattr lock append open } ;
ET allow virt_domain dosfs_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; [ virt_use_usb ]
ET allow virt_domain usbfs_t : file { ioctl read write getattr lock append open } ; [ virt_use_usb ]

This means a confined virtual machine running as svirt_t:s0:c1,c2 would ONLY allowed to write to the following file types IFF they are MCS labeled s0 or s0:c1,c2

svirt_tmp_t, svirt_image_t, svirt_tmpfs_t, virt_cache_t, qemu_var_run_t, svirt_home_t,

dosfs_t and usbfs_t iff the virt_use_usb boolean is turned on, perhaps we should turn this off by default.

anon_inodefs_t and svirt_t are not file types, svirt_t is a process types which would be in /proc.

libvirt and the kernel need to ensure that non of these files get labeled with s0.

Openshift domains are allowed to write/open

Kernel file systems anon_inodefs_t, openshift_t,hugetlbfs_t,  security_t

openshift_tmp_t, openshift_rw_file_t, openshift_tmpfs_t

MCS Constrained Types

Thee following types are MCS Constrained.

seinfo  -amcs_constrained_type -x

Audit2allow should be your third option not the first.
Whenever I talk about SELinux lately, I make everyone stand up and say.

SELinux is a labeling system.
Every process has a label every object on the system has a label.  Files, Directories, network ports ...
The SELinux policy controls how process labels interact with other labels on the system.
The kernel enforces the policy rules.

The follow up to this is, if you get a denial the First thing you should think is, perhaps one of the labels is WRONG.

I wrote a paper  a while ago called the SELinux Four Things. In which I point out the four causes of SELinux denials.

  1. You have a labeling problem.

  2. You have changed a process configuration, and you forgot to tell SELinux about it.

  3. You have a bug in either an application or SELinux policy.  Either SELinux policy did not know an application could do something, or the application is broken.

  4. You have been hacked.

The solution to #3 is to report a bug and use audit2allow to generate a local policy module and install it until either the application is fixed or SELinux policy adds rules to allow it.

The problem i see is that administrators seem to go to this option when ever they see a denial.   Lets look at a recent example from email.

In a recent email on selinux@lists.fedoraproject.org, Richard reported:

I have a CGI application named "mapserv" that needs to write to a specific location: "/rwg/mapserver/tmp"

He figured out that Apache content was usually labeled httpd_sys_content_t, which was a step in the right direction, so I guess he labeled this rectory tree as

Then he asked about writing policy that looked like.

module test 1.0;

require {
        type httpd_sys_content_t;
        type httpd_sys_script_t;
        class dir add_name;
        class file { write create };

#============= httpd_sys_script_t ==============
allow httpd_sys_script_t httpd_sys_content_t:dir add_name;
allow httpd_sys_script_t httpd_sys_content_t:file { write create };

This policy obviously came from audit2allow, and would work, except for a couple of problems, the biggest being that all CGI Scripts would be able to read write all Apache content.  The policy is also fragile since you might get an AVC like httpd_sys_script_t is not allowed to append to httpd_sys_content_t.

manage_files_pattern(httpd_sys_script_t, httpd_sys_content_t, httpd_sys_content_t) Would have been better.

Is there a better label I could have used?

But the main point of this blog would be that Richard should not have used a audit2allow module.  He could have used httpd_sys_rw_content_t for the tmp directory.

# semanage fcontext -a -t httpd_sys_rw_content_t "/rwg/mapserver/tmp(/.*?)"
# restorecon -R -v /rwg/mapserver/tmp

This would have been choice "1" above, one of the labels is wrong.

Could I change the process label?

Dominic Grift pointed out, in a mail reply, another solution.

cat > mywebapp.te << EOF
policy_module(mywebappp, 1.0.0)

make -f /usr/share/selinux/devel/Makefile mywebapp.pp
sudo semodule -i mywebapp

Now you can use the following new types:

httpd_mywebapp_script_t (mywebapp process type)
httpd_mywebapp_script_exec_t (mywebapp cgi executable file type)
httpd_mywebapp_content_t (mywebapp readonly file type)
httpd_mywebapp_content_rw_t (mywebapp read/write file type)
httpd_mywebapp_content_ra_t (mywebapp read/append file type)
httpd_mywebapp_htaccess_t (mywebapp htaccess file types)

Basically you can just label the cgi script with the mywebapp script executable file type and then the mywebapp process will run with the mywebapp process type creating files with the mywebapp content file types.

This solution satisfies "1", changing both the process label and the target label.  This is probably the best solution, since if Richard used other CGI scripts on his machine, only the mapserver cgi script would be able to write the mapserver cgi content, and he would have better separation.

Note:  I would have used sepolicy generate --cgi PATHTO/mapserver.cgi to generate the policy.

Could I modify the SELinux configuration to allow this access?

A third option would be to turn on the httpd_unified boolean.  (This would an option of type "2").  Although not the best solution for the problem.

httpd_unified is described as "Unify HTTPD handling of all content files."

# setsebool -P httpd_unified 1

This means it will allow the apache process and default apache scripts to read/write/execute all default labeled apache content.

Bottom line, when you see an SELinux AVC, the way your decision tree should go something like the following.

  1. Is the process that is being denied running with the correct label?  Could I make it run with a better label?

  2. Does the target object have the correct label or could I assign it a label that would allow the access from the process label?

  3. Is their an SELinux Boolean that would allow the access?

  4. If this is a network port being denied, did I modify the default settings and could I modify the labels on network packets using semanage port.

  5. Do I believe this is a bug in an application?  If yes, I need to report a bug?

  6. Is this a bug in SELinux policy and the access should be allowed by default?  I should install a local modification and report this as a bug.

  7. Am I being hacked?

setroubleshoot attempts to help you walk through this process, and newer versions of audit2allow show you booleans and potential labels you can use.

New Security Feature in Fedora 19 Part 2: Shared System Certificates
One of the cool things about writing this series of blogs for each Fedora Release is finding out about the changes is different parts of the OS outside of SELinux.  I love getting suggestions of security topics to blog on.  If you know of security topics I should cover send them to me at dwalsh@redhat.com or tweet @rhatdan

Shared System Certificates

Currently Tools like NSS (Mozilla products like Firefox/Thunderbird), GnuTLS, OpenSSL and Java on a Fedora box ship their own public key certificates and their own trust relationships. This means if an administrator wants to add/modify/delete trust to certain Certificates, he might have to modify several different stores in order to get the correct security.

A new feature in Fedora 19 is a system wide trust store of static data to be used by crypto toolkits as input for certificate trust decisions.

This feature the tools listed above a default source for retrieving system certificate anchors and black list information.  Fedora 19 will be the first step toward development of a comprehensive solution.

Look at the feature for more information on the changes, but the following two sections explain the key benefits to this feature and how users will use it.

Benefit to Fedora

The goal is to empower administrators to configure additional trusted CAs, or to override the trust settings of CAs, on a system wide level, as required by local system environments or corporate deployments. Although this is theoretically possible today, it's extremely hard to get right.

Fedora will immediately gain a unified approach to system anchor certificates and black lists. This is then built on in the future to be a comprehensive solution.

User Experience

Administrators will be able to use a tool to add a certificate authority file to the system trust store and have it recognized by all relevant applications.

Users will stop being surprised by incoherent and unpredictable trust decisions when using different applications with websites/services which require custom trust policy.

New Security Feature in Fedora 19 Part 1: New Confined/Permissive Process Domains
Each Fedora we release a bunch of new domains that will run in permissive mode for the release.  When the next release is released, the permissive domains are made enforcing.

In my blog,10 things you probably did not know about SELinux.. #4, I describe how you can interact with permissive domains.

In Fedora 18, we added 8 new permissive domains, all of  which are now enforcing in Fedora 19.

Fedora 18 Permissive Domains/ Now Confined in Fedora 19
  pkcsslotd_t (daemon manages PKCS#11 objects between PKCS#11-enabled applications)
   slpd_t  (Server Location Protocol Daemon)
   sensord_t (Sensor information logging daemon)
   mandb_t  (Cron job used to create /var/cache/man content)
   glusterd_t (policy for glusterd service)
   stapserver_t (Instrumentation System Server) Note: This was back ported to Fedora 17.
   realmd_t (dbus system service which manages discovery and enrollment in realms and domains like Active Directory or IPA)
   phpfpm_t (FastCGI Process Manager)

Fedora 19 Permissive Domains

       systemd-localed is a system service that may be used as mechanism to
       change the system locale settings, as well as the console key mapping
       and default X11 key mapping.  systemd-localed is automatically
       activated on request and terminates itself when it is unused.

       systemd-hostnamed is a system service that may be used as mechanism to
       change the system hostname.  systemd-hostnamed is automatically
       activated on request and terminates itself when it is unused.

       systemd-sysctl.service is an early-boot service that configures
       sysctl(8) kernel parameters.

       mythtv cgi scripts used for managing mythtv scheduling and content.

        OpenShift System Cron jobs run as openshift_cron_t, not gear cron jobs.

        OpenStack Object Storage (swift) aggregates commodity servers to work together
         in clusters for reliable, redundant, and large-scale storage of static objects.

Fedora 19 Modules Removed
shutdown.pp (Command no longer supported, functions suplanted by systemd)
consoletype.pp(Command should no longer be used, suplanted by systemd-logind)

Fedora 19 Modules Renamed or Consolodated
amavis.pp clamav.pp - These have been consolodated into a unified view of antivirus.pp, All aliased o antivirus_t.
    typealias antivirus_t alias { amavis_t clamd_t clamscan_t freshclam_t } ;

ctdbd.pp changed name upstream to ctdb.pp
isnsd.pp changed name upstream to isnd.pp
pacemaker.pp and corosync.pp rgmanager.pp aisexec.pp - These have been consolodated into a unified view of rhcs.pp, all aliased to the new type cluster_t.
     typealias cluster_t alias { aisexec_t corosync_t pacemaker_t rgmanager_t }

Security Vs Usability
One of the interesting things about working in the security field, is walking the balance between security and usability.  

On one of the many mailing lists I read, an admin as complaining about his Apache server being hacked.  Some application had been hacked in a way that it got apache to write to a particular directory and then executed the code it has written.

I decided to look at how SELinux Policy controlled httpd_t.  I wanted to know what file types was httpd_t allowed to write and execute.

In Fedora 19, I executed the following command.

> sesearch -A -C -s httpd_t -p execute -c file | grep write
DT allow httpd_t httpdcontent : file { ioctl read write create getattr setattr lock append unlink link rename execute open } ; [ httpd_enable_cgi httpd_unified && httpd_builtin_scripting && ]

This indicates that the apache deamon (httpd_t) is allowed to write and execute files that have a label that has the attribute httpdcontent.  But they would have to have the httpd_enable_cgi, httpd_unified, and httpd_builtin_scripting booleans turned on.

#semanage boolean -l | grep httpd_enable_cgi
httpd_enable_cgi               (on   ,   on)  Allow httpd cgi support
# semanage boolean -l | grep httpd_unified
httpd_unified                  (off  ,  off)  Unify HTTPD handling of all content files.
# semanage boolean -l | grep httpd_builtin_scripting
httpd_builtin_scripting        (on   ,   on)  Allow httpd to use built in scripting (usually php)

Out of the box we enable httpd_enable_cgi, so cgi scripts will run with your apache server.  We also enable httpd_builtin_scripting, which allows you to run php scripts within the same processes as apache, this also enabled other builtin scripting tools like mod_python and mod_perl.

We disable httpd_unified, which basically says httpd_t has full access to all httpdcontent files.

# seinfo -ahttpdcontent -x

So rather then treating each type differently we combine all access.  We used to have this turned on by default for people who did not understand SELinux, probably still is in RHEL5 and maybe RHEL6.  But in latest Fedora and RHEL7 we will turn it off by default.

If you are running a web site that does not do any scripting, it would probably be advisable to turn off the other two booleans.

SELinux Reveals Bugs in Code
I have noticed over the years random confined process that get avc denials for the SYS_RESOURCE and SYS_ADMIN capabilities.  Most of the time, these are not easily repeated.  The combination of these two usually indicate a confined processes is attempting to use a system resources beyond the limits for the owner UID.  For example in RHEL6 a user, dwalsh, is only allowed to run 1025 processes, and an individual process running as dwalsh, is only allowed to open 1024 files.

/usr/include/linux/capability.h documents SYS_RESOURCE as the following

/* Override resource limits. Set resource limits. */
/* Override quota limits. */
/* Override reserved space on ext2 filesystem */
/* Modify data journaling mode on ext3 filesystem (uses journaling
   resources) */
/* NOTE: ext2 honors fsuid when checking for resource overrides, so
   you can override using fsuid too */
/* Override size restrictions on IPC message queues */
/* Allow more than 64hz interrupts from the real-time clock */
/* Override max number of consoles on console allocation */
/* Override max number of keymaps */

The goal of these limits is to prevent an individual user from doing a fork bomb or opening so many files the system gets a Denial of Service.

Even root processes are governed by these limits,  however root processes almost always have SYS_ADMIN and SYS_RESOURCE capabilities, unless they have dropped them using something like libcap/libcap-ng or you are using an Mandatory Access System like SELinux.

Lon Hohberger, who is working on the OpenStack team at Red Hat and currently working on making SELinux work well with OpenStack, discovered some problems in RHEL6, that could and probably are triggering these AVC's.

Bug #1

Prior to RHEL6.4 ANY login to the system, including root,  would get a process limit of 1024.

# ulimit -u

Meaning if you started any processes on a system that already had 1025 processes running, the kernel would be checking SYS_RESOURCE.  If you executed a command like

# service httpd restart

Then httpd would fall under the same limits, since httpd_t is not allowed these Capabilities, httpd_t would generate the AVC's and probably fail to start.

Worse then this, if you executed:

# yum -y update

There is a decent chance that during the update some packages post install would do a

# service foobar restart

If foobar was confined by SELinux, then the AVC could be generated.

Luckily we have had a fix for this in RHEL6.4,  although this fix has not gone into Fedora yet...

The following line as added to /etc/security/limits.d/90-nproc.conf

root soft nproc unlimited

BUG #2

First if you login as root to a RHEL6.4 system, and check the max processes limit, you will get something like:

# ulimit -u

If you login as a normal user you would get:

> ulimit -u

If you run su from your normal user account and check the ulimit you get:

# ulimit -u

But if you run sudo as a normal user and run ulimit you get:
# ulimit -u

su and sudo should work the same way.

This is probably a bug in sudo or in the sudo pam stack.  Have not determined which yet.

BUG #3

This one might be controversial, since these limits are supposed to count the resources used by a particular UID, we should be looking at ALL of the processes kicked off by the user, not only those running under his UID.    Since sudo in the above example was not modifying the maximum running processes when user dwalsh became root, the process started counted against the total number of root processes rather then the total number of processes started by dwalsh.  I believe this is wrong.  The kernel should be counting the number of processes started by user dwalsh and should look at the LoginUID rather then the actual UID.  Now if dwalsh logged onto a system and was able to get some processes running as root, they could continue to count against dwalshs resource constraints and not against the systems.    I realize this might be difficult since you would probably want processes that have the SYS_RESOURCE capability to not count against the total.

systemd to the rescue

systemd has helped fix a lot of these issues.

In RHEL6 if an admin starts/restarts a system daemon, that daemon ends up being a subprocess of the user, meaning inherits any of the constraints on the user processes.  In the latest Fedora's, systemd starts most system daemons.  The user process sends a message to systemd and systemd starts/restarts the service, which means the service gets the system constraints not the user constraints.

Where did audit2allow go in Fedora 18?
One of the goals of Fedora 18 is to shrink the size of the Minimal install as much as possible.   We are concentrating on keeping this minimal.

Since a minimal install machine does not need to do SELinux policy development, it was decided to remove the selinux-policy-devel package from the minimal install, which is fairly large.  But audit2allow was in policycoreutils-python, and required the selinux-policy-devel package.  audit2allow needs the interface files in selinux-policy-devel package in order for  audit2allow -R to work. 

I decided to move the audit2allow script to policycoreutils-devel package, since its main job is to help develop selinux-policy.
If you do not find audit2allow you your system, just

yum install policycoreutils-devel
yum install /usr/bin/audit2allow

Note: The latest setroubleshoot-server package requires policycoreutils-devel, so if you have this installed you will get audit2allow for free...

rsync and SELinux
I have been pinged by a couple of users having problems with SELinux and rsync.  We began confining the rsync service back in RHEL5.

The biggest problem SELinux has with rsync is there is no way to distinguish between the client and the server from an SELinux point of view.

rsync as a daemon

If someone sets up an rsync service to listen for connections they use the /usr/bin/rsync executable.  In order to confine this application we label /usr/bin/rsync as rsync_exec_t.  The init daemons (init_t, initrc_t) will transition to rsync_t when they execute /usr/bin/rsync. SELinux policy allows share parts of the host, mainly readonly, and allows admins to setup directories labeled rsync_data_t where content could be uploaded to the rsync domain.

There are lots of booleans defined for rsync_t to share data.  On Fedora 18, I see.

getsebool -a | grep rsync
postgresql_can_rsync --> off
rsync_anon_write --> off
rsync_client --> off
rsync_export_all_ro --> off
rsync_use_cifs --> off
rsync_use_nfs --> of

man rsync_selinux

to see more info.

rsync as a client.

If you execute rsync as a client from a user script, everything works fine, since we do not transition from unconfined_t or other user domains to rsync_t, rsync runs within the user domain and is able to read/write anything the user process label is allowed to read/write.

If a service that SELinux does not have policy runs runs within the init system and attempts to use rsync as a client it can have problems.  You see the service running within the init system that has no policy will run as either init_t or initrc_t.  When a process running as init_t or initrc_t executes /usr/bin/rsync (rsync_exec_t), the rsync process will transition to rsync_t and SELinux will treat it as the rsync daemon not as a client. 

There are many possible solutions for this problem. 
  • Best would be to write policy for the init service that is currently running without confinement.   I realize  that most users will not do this, but you could contact us for help.
  • In RHEL5 you could turn on the rsync_disable_trans boolean.  Which will stop the transition from initrc_t to rsync_t, and rsync_t would just tun in  initrc_t domain, which by default is an unconfined domain.
  • You could use audit2allow to add all of the rules to rsync_t to allow it to run as a client.
  • You could change the label of /usr/bin/rsync to bin_t using semanage fcontext -m -t bin_t /usr/bin/rsync which would also stop the transition.
  • You could make rsync_t an unconfined domain.
There are many ways of fixing this problem.  And perhaps I need to talk to the rsync packagers to see if we could figure a better way of handling this in the future. 

You are viewing danwalsh