Rootless Docker: Avoiding Common Caveats

A comprehensive guide to steer clear of potential pitfalls when setting up rootless Docker.

Joey Miller • Posted July 17, 2023

At a glance:

Setting up rootless Docker
Networking fixes
- Ensure Docker propagates source IP addresses
- Exposing privileged ports
Storage fixes

To increase security you should be using rootless Docker where you can.

Docker containers, and containers in general, are really just regular programs wrapped in some extra protections provided by the kernel (namely namespaces and cgroups) that create isolation, among other useful features.

Unlike VMs, containers run close to the host operating system, so close that they share its kernel, which makes protecting the host even more important.

Running Docker as a non-root user limits the container runtime's access to the underlying host system. This minimizes the risk of privilege escalation attacks that could potentially compromise the entire system. Rootless Docker also ensures better separation between containers, reducing the risk of one container affecting others on the same system.

Docker has wide support and availability, but Podman also offers a similarly mature and integrated solution. Podman is architected from the ground up to be daemonless and rootless. Podman also aims to be a drop-in replacement for Docker - consider using Podman for your needs instead.

Setting up rootless Docker

First, let's install Docker:

sudo apt install ca-certificates curl gnupg lsb-release
sudo mkdir -m 0755 -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install docker-ce

Then, let's set up the Docker daemon to run as a non-root user.

Note: The XDG_RUNTIME_DIR and DBUS_SESSION_BUS_ADDRESS environment variables are sometimes not set properly, for example when using su to set up rootless Docker for a different user. If that happens, set them as follows:

export XDG_RUNTIME_DIR=/run/user/$UID
export DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/$UID/bus

sudo apt install uidmap dbus-user-session

# Disable the system-wide Docker daemon (optional)
sudo systemctl disable --now docker.service docker.socket

dockerd-rootless-setuptool.sh install

# Enable the rootless docker daemon on startup (optional)
systemctl --user enable docker
sudo loginctl enable-linger $(whoami)

As reported by the dockerd-rootless-setuptool.sh script, add the following to the bottom of your .bashrc:

export PATH=/usr/bin:$PATH
export DOCKER_HOST=unix:///run/user/$UID/docker.sock
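A quick sanity check for the DOCKER_HOST export above might look like the following sketch. The function names rootless_docker_host and check_docker_host are made up for illustration, and the socket path is simply the one from the export line.

```shell
# Print the rootless Docker socket URL expected for a given UID
# (same path as the DOCKER_HOST export above).
rootless_docker_host() {
  printf 'unix:///run/user/%s/docker.sock\n' "$1"
}

# Report whether DOCKER_HOST matches the rootless socket of the current user.
# Returns non-zero on a mismatch so it can be used in scripts.
check_docker_host() {
  expected_host=$(rootless_docker_host "$(id -u)")
  if [ "${DOCKER_HOST:-}" = "$expected_host" ]; then
    echo "ok: DOCKER_HOST points at the rootless socket"
  else
    echo "mismatch: expected $expected_host, got ${DOCKER_HOST:-<unset>}"
    return 1
  fi
}

rootless_docker_host 1000   # prints unix:///run/user/1000/docker.sock
```

Running check_docker_host in a fresh shell after editing .bashrc shows whether the export was picked up.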

Networking fixes

Ensure Docker propagates source IP addresses

When running rootless Docker, we need some additional configuration to ensure source IP addresses are propagated to containers/services. This matters most for networking-related services such as Pi-hole, where otherwise you may find that all clients report the same IP.

Follow these instructions:

The source IP addresses can be propagated by creating ~/.config/systemd/user/docker.service.d/override.conf with the following content:

[Service]
Environment="DOCKERD_ROOTLESS_ROOTLESSKIT_PORT_DRIVER=slirp4netns"

And then restart the daemon:

systemctl --user daemon-reload
systemctl --user restart docker
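If you would rather script the override file than create it by hand, a minimal sketch could look like this. The helper name write_port_driver_override is hypothetical, and the optional directory argument exists only so the helper can be tried against a scratch directory first.

```shell
# write_port_driver_override [DIR]
# Writes override.conf into DIR, which defaults to the user-level
# drop-in directory for docker.service mentioned above.
write_port_driver_override() {
  dropin_dir=${1:-"$HOME/.config/systemd/user/docker.service.d"}
  mkdir -p "$dropin_dir" || return 1
  cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
Environment="DOCKERD_ROOTLESS_ROOTLESSKIT_PORT_DRIVER=slirp4netns"
EOF
}
```

Call write_port_driver_override with no arguments to write to the real location, then reload and restart the user daemon as described above.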

Exposing privileged ports

When running some services, such as DNS or Nginx over HTTP/HTTPS, you may want to expose privileged ports (< 1024) on the host.

If running rootless Docker with the source IP propagation changes above:

Running setcap on the rootlesskit binary will not work. Follow these instructions:

To expose privileged ports (< 1024), add net.ipv4.ip_unprivileged_port_start=0 to /etc/sysctl.conf (or /etc/sysctl.d) and run sudo sysctl --system.

Otherwise:

You can allow rootless Docker to bind privileged ports by running setcap on the rootlesskit binary as follows.

sudo setcap cap_net_bind_service=ep $(which rootlesskit)
systemctl --user restart docker
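For the sysctl route, one way to confirm the setting took effect is to compare a port against the kernel's current threshold. This is only a sketch: is_unprivileged_port is a made-up helper name, and it assumes the value is readable at /proc/sys/net/ipv4/ip_unprivileged_port_start, which is where Linux exposes net.ipv4.ip_unprivileged_port_start.

```shell
# is_unprivileged_port PORT [START]
# Succeeds if PORT can be bound without extra privileges, i.e. PORT >= START.
# START defaults to the kernel's current net.ipv4.ip_unprivileged_port_start.
is_unprivileged_port() {
  port_start=${2:-$(cat /proc/sys/net/ipv4/ip_unprivileged_port_start)}
  [ "$1" -ge "$port_start" ]
}

# With the kernel default of 1024, port 80 is still privileged:
is_unprivileged_port 80 1024 || echo "port 80 needs privileges"
# After setting the threshold to 0, it no longer is:
is_unprivileged_port 80 0 && echo "port 80 is unprivileged"
```

Calling is_unprivileged_port 80 without a second argument checks against the live value on the host.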

Storage fixes

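Because we are running the Docker daemon as a non-root user, UIDs/GIDs are not mapped 1:1 into the container. Instead, the container's root user is mapped to the host user running the daemon, and every other container UID is offset into that user's subordinate ID range from /etc/subuid (with /etc/subgid playing the same role for GIDs). This means a process running as uid=1000 inside the container is mapped to uid=100999 on the host if /etc/subuid contains an entry such as ubuntu:100000:65536.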

This can cause issues when mounting a file/directory to a Docker container. In the above example the container will write any data as uid=100999. These files will now be inaccessible to the host user (that has uid=1000).
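The arithmetic behind that mapping can be sketched as a small shell function. It assumes the usual rootless layout: container UID 0 maps to the unprivileged host user, and container UIDs from 1 upwards map onto the subordinate range starting at the value in /etc/subuid. The name host_uid and the example values are illustrative only.

```shell
# host_uid CONTAINER_UID HOST_UID SUBUID_START
# Container UID 0 maps to the host user itself; container UID n >= 1
# maps to SUBUID_START + n - 1.
host_uid() {
  if [ "$1" -eq 0 ]; then
    echo "$2"
  else
    echo $(( $3 + $1 - 1 ))
  fi
}

host_uid 0 1000 100000      # container root      -> prints 1000
host_uid 1000 1000 100000   # container uid 1000  -> prints 100999
```

This is why files written by a non-root container user show up on the host with an unfamiliar six-digit owner.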

Inheritance ACL

[[..] there is no way to set the UID and GID as a mount option in docker.](https://stackoverflow.com/questions/30140911/can-i-control-the-owner-of-a-bind-mounted-volume-in-a-docker-image)

I prefer to put all persistent data in a specific directory (e.g. ./data) and then set an inheritance ACL so that even if folders are added or removed I always have access to them. Setting an ACL (Access Control List) allows us to define permissions and access rights on our files/directories, in addition to the regular chmod file modes.

Start by installing the acl package if you do not already have it installed:

sudo apt install acl

Then let's always allow rwX permissions for current and future files in ./data for the user ubuntu:

sudo setfacl -Rm d:u:ubuntu:rwX,u:ubuntu:rwX ./data

Note: rwX means read-write plus execute only if the file is a directory or already has execute permission.

Unable to write

Sometimes a container will complain about the permissions of a mounted volume.

Since we have set up an inheritance ACL in the previous step, we can simply have the default container user own the host directory. This will change the file mode on the host filesystem.

Let's say we have a folder ./data/conf mounted as /conf in the container my_service. Let's enter the container as root and chown the directory to the container's default user, container_user.

docker exec -u 0 -it my_service bash
chown -R container_user /conf

Avoid mounting the entire volume

Some Docker images, such as the LinuxServer.io ones, allow setting the UID/GID used in the container (through the PUID and PGID environment variables). With rootless Docker it sometimes works to set both to 0 (forcing the container to use the root user), but this isn't recommended or supported.

If possible, consider mounting a more precise group of files as read-only.

With some Docker images, such as linuxserver/ddclient, we can mount our own config file read-only at a different mount point (/defaults/ddclient.conf), meaning we won't need to worry about file permissions.

Tags

self-hosting, docker, cyber-security
