Over the years I have noticed random confined processes that get AVC denials for the SYS_RESOURCE and SYS_ADMIN capabilities. Most of the time, these are not easily reproduced. The combination of the two usually indicates that a confined process is attempting to use system resources beyond the limits for the owner UID. For example, in RHEL6 a user, dwalsh, is only allowed to run 1025 processes, and an individual process running as dwalsh is only allowed to open 1024 files.
/usr/include/linux/capability.h documents SYS_RESOURCE as follows:
/* Override resource limits. Set resource limits. */
/* Override quota limits. */
/* Override reserved space on ext2 filesystem */
/* Modify data journaling mode on ext3 filesystem (uses journaling
resources) */
/* NOTE: ext2 honors fsuid when checking for resource overrides, so
you can override using fsuid too */
/* Override size restrictions on IPC message queues */
/* Allow more than 64hz interrupts from the real-time clock */
/* Override max number of consoles on console allocation */
/* Override max number of keymaps */
The goal of these limits is to prevent an individual user from launching a fork bomb or opening so many files that the system suffers a denial of service.
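To see which limits apply to a given process, you can ask for them from inside that process. The following is a minimal Python sketch using the standard `resource` module. The helper name `describe_limit` is my own, and the numbers it prints depend entirely on the system and account you run it from.

```python
import resource

def describe_limit(name, rlimit):
    """Return a one-line description of the soft and hard values of a limit."""
    soft, hard = resource.getrlimit(rlimit)

    def fmt(value):
        # RLIM_INFINITY is how the kernel reports "unlimited".
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"

if __name__ == "__main__":
    # RLIMIT_NPROC is the value `ulimit -u` reports; RLIMIT_NOFILE is `ulimit -n`.
    print(describe_limit("max user processes", resource.RLIMIT_NPROC))
    print(describe_limit("max open files", resource.RLIMIT_NOFILE))
```

The soft value is the one that is enforced. The hard value is the ceiling up to which an unprivileged process may raise its own soft limit.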
Even root processes are governed by these limits. However, root processes almost always have the SYS_ADMIN and SYS_RESOURCE capabilities, unless they have dropped them using something like libcap/libcap-ng or you are using a Mandatory Access Control system like SELinux.
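One way to check whether a process still holds these capabilities at the Linux capability level is to decode the `CapEff` bitmask in `/proc/<pid>/status`. The sketch below assumes the bit numbers from capability.h (21 for SYS_ADMIN, 24 for SYS_RESOURCE). The function names are hypothetical. Note that this only shows the kernel capability set: SELinux policy can still deny a capability that appears here.

```python
CAP_SYS_ADMIN = 21      # bit numbers as defined in linux/capability.h
CAP_SYS_RESOURCE = 24

def effective_caps(status_text):
    """Extract the effective capability bitmask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")

def has_cap(mask, cap):
    """True if the capability's bit is set in the mask."""
    return bool(mask & (1 << cap))

if __name__ == "__main__":
    with open("/proc/self/status") as f:
        mask = effective_caps(f.read())
    print("CAP_SYS_ADMIN:", has_cap(mask, CAP_SYS_ADMIN))
    print("CAP_SYS_RESOURCE:", has_cap(mask, CAP_SYS_RESOURCE))
```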
Lon Hohberger, who is on the OpenStack team at Red Hat and is currently working on making SELinux work well with OpenStack, discovered some problems in RHEL6 that could be, and probably are, triggering these AVCs.
Prior to RHEL6.4 ANY login to the system, including root, would get a process limit of 1024.
# ulimit -u
This means that if you started any process on a system that already had 1025 processes running, the kernel would check SYS_RESOURCE. If you executed a command like
# service httpd restart
then httpd would fall under the same limits. Since httpd_t is not allowed these capabilities, httpd_t would generate the AVCs and httpd would probably fail to start.
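The sequence just described can be modelled roughly as follows. This is a deliberately simplified sketch and not the kernel's code: the function name, the argument names, and the order in which the two capabilities are queried are my assumptions, and the real check has special cases that vary by kernel version and are ignored here.

```python
def fork_allowed(running, nproc_limit, held_caps):
    """Simplified model of the process-count check made when forking.

    Returns (allowed, checked): whether the fork proceeds, and which
    capabilities were queried along the way. A query against a confined
    domain that is not allowed the capability is what shows up as an AVC.
    """
    checked = []
    if nproc_limit is None or running < nproc_limit:
        return True, checked            # under the limit: nothing is queried
    for cap in ("sys_admin", "sys_resource"):   # assumed query order
        checked.append(cap)
        if cap in held_caps:
            return True, checked        # the capability overrides the limit
    return False, checked

if __name__ == "__main__":
    print(fork_allowed(10, 1024, set()))                # well under the limit
    print(fork_allowed(1025, 1024, set()))              # confined, over the limit
    print(fork_allowed(1025, 1024, {"sys_resource"}))   # capability overrides
```

The point of the model is that the capability is only consulted once the limit has been reached, which would explain why these AVCs are hard to reproduce on a lightly loaded system.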
Worse than this, if you executed:
# yum -y update
There is a decent chance that during the update some package's post-install script would run
# service foobar restart
If foobar was confined by SELinux, then the AVC could be generated.
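If you want to find out which confined domains have hit this, you can filter the audit log for capability denials. The sketch below is one possible approach. The regular expression reflects the usual layout of AVC records, the helper name `capability_denials` is hypothetical, and the log path in the usage example is the common default, which may differ on your system.

```python
import re

# Pulls the permission list, command name, source context and class
# out of a single AVC record.
AVC_RE = re.compile(
    r'avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?'
    r'comm="(?P<comm>[^"]*)".*?'
    r'scontext=(?P<scontext>\S+).*?'
    r'tclass=(?P<tclass>\S+)'
)

WATCHED = {"sys_resource", "sys_admin"}

def capability_denials(lines):
    """Yield (comm, source domain, permission) for sys_resource/sys_admin denials."""
    for line in lines:
        m = AVC_RE.search(line)
        if not m or not m.group("tclass").startswith("capability"):
            continue
        for perm in m.group("perms").split():
            if perm in WATCHED:
                # scontext is user:role:type:level; the type is the confined domain.
                domain = m.group("scontext").split(":")[2]
                yield m.group("comm"), domain, perm

if __name__ == "__main__":
    # Reading the audit log normally requires root.
    with open("/var/log/audit/audit.log") as log:
        for comm, domain, perm in capability_denials(log):
            print(f"{comm} ({domain}) was denied {perm}")
```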
Luckily we have had a fix for this in RHEL6.4, although this fix has not gone into Fedora yet...
The following line was added to /etc/security/limits.d/90-nproc.conf:
root soft nproc unlimited
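Each rule in a limits file has four whitespace-separated fields: domain, type, item, and value. As a small illustration, the sketch below splits one such line into named fields. `LimitRule` and `parse_limit_line` are names I made up for this example, and the parser does no validation beyond counting fields.

```python
from typing import NamedTuple, Optional

class LimitRule(NamedTuple):
    domain: str   # user name, @group, or a wildcard
    kind: str     # "soft", "hard", or "-" for both
    item: str     # e.g. "nproc" or "nofile"
    value: str    # a number, or a keyword such as "unlimited"

def parse_limit_line(line: str) -> Optional[LimitRule]:
    """Parse one limits.conf-style line; return None for blanks and comments."""
    text = line.split("#", 1)[0].strip()
    if not text:
        return None
    fields = text.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    return LimitRule(*fields)

if __name__ == "__main__":
    rule = parse_limit_line("root soft nproc unlimited")
    print(rule.domain, rule.kind, rule.item, rule.value)
```

Read this way, the line above gives root an unlimited soft limit on the number of processes (nproc).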
First, if you log in as root to a RHEL6.4 system and check the max processes limit, you will get something like:
# ulimit -u
If you log in as a normal user you would get:
> ulimit -u
If you run su from your normal user account and check the ulimit you get:
# ulimit -u
But if you run sudo as a normal user and run ulimit you get:
# ulimit -u
su and sudo should work the same way.
This is probably a bug in sudo or in the sudo PAM stack. I have not determined which yet.
This one might be controversial. Since these limits are supposed to count the resources used by a particular UID, we should be looking at ALL of the processes kicked off by the user, not only those running under his UID. Since sudo in the above example was not modifying the maximum running processes when user dwalsh became root, the processes started counted against the total number of root processes rather than the total number of processes started by dwalsh. I believe this is wrong. The kernel should be counting the number of processes started by user dwalsh, and should look at the LoginUID rather than the actual UID. Then, if dwalsh logged onto a system and was able to get some processes running as root, they would continue to count against dwalsh's resource constraints and not against the system's. I realize this might be difficult, since you would probably want processes that have the SYS_RESOURCE capability to not count against the total.
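To get a feel for how different the two accounting schemes would be on a running system, you can tally processes both ways from /proc. This is only a sketch of the comparison and not a proposal for how the kernel should implement it. The function names are hypothetical, and it assumes the kernel exposes `/proc/<pid>/loginuid`, which requires audit support.

```python
import os
from collections import Counter

def read_ids(pid, proc="/proc"):
    """Return (real uid, loginuid) for a pid, or None if it vanished or is unreadable."""
    try:
        with open(os.path.join(proc, pid, "status")) as f:
            uid_line = next(line for line in f if line.startswith("Uid:"))
        with open(os.path.join(proc, pid, "loginuid")) as f:
            loginuid = int(f.read().strip())
    except (OSError, StopIteration, ValueError):
        return None
    # The Uid: line lists real, effective, saved and filesystem UIDs in that order.
    return int(uid_line.split()[1]), loginuid

def count_processes(proc="/proc"):
    """Count processes per real UID and per loginuid."""
    by_uid, by_loginuid = Counter(), Counter()
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue                     # skip non-process entries
        ids = read_ids(entry, proc)
        if ids is None:
            continue
        by_uid[ids[0]] += 1
        by_loginuid[ids[1]] += 1
    return by_uid, by_loginuid

if __name__ == "__main__":
    by_uid, by_loginuid = count_processes()
    print("per real UID:", dict(by_uid))
    print("per loginuid:", dict(by_loginuid))
```

Processes that were never part of a login session report an unset loginuid (4294967295), so system daemons started at boot would fall outside any user's count under the scheme I am suggesting.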
systemd to the rescue
systemd has helped fix a lot of these issues.
In RHEL6, if an admin starts or restarts a system daemon, that daemon ends up being a subprocess of the user, meaning it inherits any of the constraints on the user's processes. In the latest Fedora releases, systemd starts most system daemons. The user process sends a message to systemd, and systemd starts or restarts the service, which means the service gets the system constraints, not the user constraints.
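You can confirm which constraints a running daemon actually received by reading `/proc/<pid>/limits` for its main process. The sketch below parses that file into a dictionary. The function names are my own, and the parser assumes the columns are separated by at least two spaces, which matches the layout I have seen but is not a guaranteed format.

```python
import re

def parse_proc_limits(text):
    """Map limit name -> (soft, hard) from the text of /proc/<pid>/limits."""
    limits = {}
    for line in text.splitlines()[1:]:           # skip the header row
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits

def limits_for_pid(pid):
    """Read and parse the limits of a running process."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_proc_limits(f.read())

if __name__ == "__main__":
    import os
    mine = limits_for_pid(os.getpid())
    print("Max processes:", mine.get("Max processes"))
    print("Max open files:", mine.get("Max open files"))
```

On a systemd machine, something like `systemctl show -p MainPID httpd.service` should give you the PID to pass to `limits_for_pid`. Comparing the result before and after restarting the service from a user shell shows whether the daemon picked up the user's limits.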
Dan Walsh's Blog
- SELinux Reveals Bugs in Code