Linux Capabilities in Docker and Podman
When running Docker (or Podman) containers, you may sometimes encounter Operation not permitted errors even when you are running as the root user or using sudo. This is because root in the container does not have full root privileges. This restriction is implemented through Linux capabilities. This article first introduces the concept of Linux capabilities, then uses Docker as an example to show how to adjust the capabilities of a container, and finally describes how the default capabilities differ between Docker and Podman, as a reference for container developers and users.
Linux Capabilities
The classic Linux permission control model divides users into ordinary users and privileged users (such as the root user and users with sudo permissions). Privileged users have all permissions on the system, which can easily lead to security issues. For example, a web server needs privileges to listen on ports 443 or 80, but should not access other users' files or modify the system kernel; if the web server is compromised, the attacker will gain all permissions on the system.
Linux capabilities split the privileges of root into multiple distinct capabilities. Granting a process only the capabilities it needs, instead of full root permissions, reduces the damage it can do if it is compromised. For example, the CAP_NET_BIND_SERVICE capability allows a process to bind to ports below 1024 without requiring full root permissions. The full list of Linux capabilities can be viewed with man 7 capabilities.
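To make this concrete, here is a minimal Python sketch of how capability sets can be inspected. The kernel exposes each process's effective capabilities as a hexadecimal bitmask on the CapEff line of /proc/<pid>/status, and each bit corresponds to one capability. The function names decode_caps and effective_caps are my own for illustration; the bit numbering follows the kernel's capability header as far as I know, so check man 7 capabilities on your system before relying on it.

```python
# Capability names indexed by bit number (bit 0 is CAP_CHOWN).
# Assumption: this table matches the kernel's numbering on your system.
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID",
    "CAP_SETPCAP", "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE",
    "CAP_NET_BROADCAST", "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK",
    "CAP_IPC_OWNER", "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT",
    "CAP_SYS_PTRACE", "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT",
    "CAP_SYS_NICE", "CAP_SYS_RESOURCE", "CAP_SYS_TIME",
    "CAP_SYS_TTY_CONFIG", "CAP_MKNOD", "CAP_LEASE", "CAP_AUDIT_WRITE",
    "CAP_AUDIT_CONTROL", "CAP_SETFCAP", "CAP_MAC_OVERRIDE", "CAP_MAC_ADMIN",
    "CAP_SYSLOG", "CAP_WAKE_ALARM", "CAP_BLOCK_SUSPEND", "CAP_AUDIT_READ",
    "CAP_PERFMON", "CAP_BPF", "CAP_CHECKPOINT_RESTORE",
]


def decode_caps(mask):
    """Return the names of the capabilities whose bits are set in mask."""
    names = []
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            # Fall back to the bit number for capabilities newer than the table.
            names.append(CAP_NAMES[bit] if bit < len(CAP_NAMES) else "bit %d" % bit)
        bit += 1
    return names


def effective_caps(status_text):
    """Decode the CapEff line from the contents of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return decode_caps(int(line.split()[1], 16))
    raise ValueError("no CapEff line found")
```

On a Linux machine, passing the contents of /proc/self/status to effective_caps lists what the current process may actually do, which is often the quickest way to see why a call returned Operation not permitted.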
Linux Capabilities in Docker
Unlike servers, containers do not need full root permissions because the purpose of containers is to run one or more specific applications, not the entire system. For example:
- Containers usually do not need to manage networks and logs because the network and logs of the container are managed by the Docker Engine.
- Containers usually do not need to set the time because the time of the container is provided by the host machine.
- Containers usually do not need to run the reboot command because the lifecycle of the container is managed by the Docker Engine.
Therefore, Docker restricts the capabilities of containers by default through a whitelist, that is, containers only have specific capabilities by default. The capabilities used by Docker can be viewed here.
If you want to further restrict the capabilities of the container to increase security, you can remove capabilities with the --cap-drop option. If the program in the container does need certain capabilities, you can add them with the --cap-add option. Container developers are advised to state clearly in the README when additional capabilities are needed.
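As a sketch of how the two options combine, the following Python helper assembles a docker run command line that drops every capability and then adds back only what the application needs. The helper name docker_run_args and the image name example/web-server:latest are hypothetical placeholders, not part of Docker; only the --cap-drop and --cap-add flags come from the Docker CLI.

```python
def docker_run_args(image, cap_drop=(), cap_add=(), command=()):
    """Build an argument list for `docker run` with capability adjustments."""
    args = ["docker", "run"]
    # Drops are listed first for readability; Docker accepts either order.
    args += ["--cap-drop=%s" % cap for cap in cap_drop]
    args += ["--cap-add=%s" % cap for cap in cap_add]
    args.append(image)
    args += list(command)
    return args


# Hypothetical image: a web server that only needs to bind to port 80 or 443.
args = docker_run_args(
    "example/web-server:latest",
    cap_drop=["ALL"],
    cap_add=["NET_BIND_SERVICE"],
)
print(" ".join(args))
```

The resulting list can be passed to subprocess.run on a machine with Docker installed. Starting from --cap-drop=ALL and adding capabilities back one at a time is also a practical way to discover which capabilities a program really needs before documenting them in the README.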
Differences in Capabilities between Podman and Docker
Podman achieves higher security than Docker by further restricting the capabilities of containers. The default values of Podman's capabilities can be viewed here.
Podman's default capabilities are stricter than Docker's, so containers running in Podman may encounter more Operation not permitted errors; for example, sudo will not be able to use the CAP_AUDIT_WRITE capability. If a Podman user encounters Operation not permitted errors when running a container and others cannot reproduce them, the cause is likely the additional restrictions of Podman's default capabilities.
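When debugging such a report, it can help to compare the two default sets directly. The Python sketch below does this with a set difference. The two lists are assumptions based on the defaults I understand recent Docker and Podman releases to ship; they vary between versions and can be overridden by configuration, so verify them against your own installation rather than treating them as authoritative.

```python
# Assumed Docker default capabilities (version-dependent, verify locally).
DOCKER_DEFAULT_CAPS = {
    "CAP_AUDIT_WRITE", "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_MKNOD", "CAP_NET_BIND_SERVICE",
    "CAP_NET_RAW", "CAP_SETFCAP", "CAP_SETGID", "CAP_SETPCAP",
    "CAP_SETUID", "CAP_SYS_CHROOT",
}

# Assumed Podman default capabilities (version-dependent, verify locally).
PODMAN_DEFAULT_CAPS = {
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_FOWNER", "CAP_FSETID",
    "CAP_KILL", "CAP_NET_BIND_SERVICE", "CAP_SETFCAP", "CAP_SETGID",
    "CAP_SETPCAP", "CAP_SETUID", "CAP_SYS_CHROOT",
}


def missing_caps(required, granted):
    """Return the capabilities in required that are absent from granted."""
    return sorted(set(required) - set(granted))


# Capabilities a container may rely on under Docker but lack under Podman.
print(missing_caps(DOCKER_DEFAULT_CAPS, PODMAN_DEFAULT_CAPS))
```

Under these assumptions the difference includes CAP_AUDIT_WRITE, which is consistent with the sudo example above. If a container fails only under Podman, adding the missing capability with --cap-add (Podman accepts the same option) is a reasonable first experiment.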
References
Linux Capabilities in Docker and Podman
https://blog.caomingjun.com/linux-capabilities-in-docker-and-podman/en/