I get a lot of bug reports from people complaining about SELinux blocking access to the Docker socket. For example: https://bugzilla.redhat.com/show_bug.cgi?id=1557893
The aggravating thing is that this is exactly what we want SELinux to prevent. If a container process gets to the point of talking to /var/run/docker.sock, you have a serious security issue. Giving a container access to the Docker socket means you are giving it full root on your system.
Now I am getting bug reports about allowing containers access to this socket.
Access to docker.sock is the equivalent of sudo with NOPASSWD, without any logging. You are giving the process that talks to the socket the ability to launch a process on the system as full root.
Usually people do this because they want the container to perform benign operations, like listing which containers are on the system or looking at the container logs. But Docker does not have a nice RBAC system; you basically get full access or no access. I choose to default to NO ACCESS.
If you need to give a container full access to the system, then run it as a --privileged container or disable SELinux separation for the container.
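To make the "full access or no access" point concrete, here is a minimal sketch of two requests that go through the same socket. It assumes curl and the docker client are installed and that a daemon is listening on /var/run/docker.sock. The function names and the fedora image are only illustrative, and the functions are defined here but not called.

```shell
# Sketch only: both functions talk to the same socket, and the daemon
# does not distinguish between the harmless request and the dangerous one.
SOCK=/var/run/docker.sock

# The benign request people usually want: list the running containers
# through the Docker Engine API.
list_containers() {
    curl --silent --unix-socket "$SOCK" http://localhost/containers/json
}

# The same socket also accepts this: a privileged container with the
# host's / mounted, which amounts to a root shell on the host.
# The fedora image is an arbitrary choice for illustration.
root_on_host() {
    docker -H "unix://$SOCK" run --rm -it --privileged -v /:/host fedora chroot /host
}
```

Anything that can call list_containers can also call root_on_host, which is why access to the socket is blocked by default.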
podman run --privileged ...
docker run --privileged ...
podman run --security-opt label=disable ...
docker run --security-opt label=disable ...
Run it privileged
There is a discussion going on in the Moby GitHub project about breaking out more security options, specifically adding a flag to container runtimes that lets users specify whether they want kernel file systems in the container (/proc, /sys, ...) to be read-only.
I am fine with doing this, but my concern is that people want to make little changes to the security of their containers, and at a certain point those changes allow full breakout, as in the case above where you allow a container to talk to docker.sock.
Security tools are being developed to search for things like containers running as --privileged, but they might not understand that --security-opt label=disable -v /run:/run is the SAME THING from a security point of view. If it is simple to break out of container confinement, then we should just be honest and run the container with full privilege (--privileged).
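As a rough illustration of what such a tool would need to notice, here is a minimal sketch of a check over container run arguments. The function name breakout_risk and the short list of mounts it looks for are my own assumptions for this example. It is nowhere near exhaustive, and a real scanner would have to understand many more options.

```shell
# Sketch only: classify a container command line by how easily the
# container could break out. Hypothetical helper, not part of any tool.
breakout_risk() {
    privileged=no; label_off=no; host_mount=no; prev=
    for arg in "$@"; do
        # Look at the value that follows a two-word option.
        case "$prev" in
            --security-opt)
                case "$arg" in
                    label=disable|label:disable) label_off=yes ;;
                esac ;;
            -v|--volume)
                # Mounts that expose the host or the Docker socket.
                case "$arg" in
                    /:*|/run:*|/var/run:*|*/docker.sock:*) host_mount=yes ;;
                esac ;;
        esac
        case "$arg" in
            --privileged) privileged=yes ;;
        esac
        prev=$arg
    done

    if [ "$privileged" = yes ]; then
        echo "privileged"
    elif [ "$label_off" = yes ] && [ "$host_mount" = yes ]; then
        echo "equivalent to privileged"
    else
        echo "no obvious breakout"
    fi
}

breakout_risk run --privileged fedora
# prints "privileged"
breakout_risk run --security-opt label=disable -v /run:/run fedora
# prints "equivalent to privileged"
```

A tool that only greps for --privileged would miss the second case.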