From 29638291532d0579db8d9186e50211c7835d9d40 Mon Sep 17 00:00:00 2001
From: Phill Kelley <34226495+Paraphraser@users.noreply.github.com>
Date: Tue, 21 Jan 2025 12:19:29 +1100
Subject: [PATCH] chronograf can crash when using Docker bind mounts
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Assume a Docker "bind mount" is used to map Chronograf's persistent store. Examples:

* a `docker run` command:

  ```
  $ docker run -v ./chronograf:/var/lib/chronograf chronograf
  ```

* these lines in a `docker compose` service definition:

  ```
  volumes:
    - ./chronograf:/var/lib/chronograf
  ```

Prior to starting the container, Docker tries to ensure that the *external* path to the persistent store exists via the equivalent of:

```
$ sudo mkdir -p ./chronograf
```

The practical result is that any path component that didn't exist beforehand is created and owned by root.

Make two assumptions (typical "first launch" conditions):

1. That `./chronograf` did not exist so Docker has just created the `chronograf` folder with root ownership; and
2. That Docker launches the container as root (the default).

In the absence of passing `CHRONOGRAF_AS_ROOT`, the first-time user is then in the situation where:

1. the persistent store is owned by root;
2. the container launches as root but downgrades its privileges to user `chronograf` (userID 999);
3. the executable is then unable to write into its persistent store. It crashes with the error message:

   ```
   time="«timestamp»" level=error msg="Unable to create bolt clientUnable to open boltdb; is there a chronograf already running? open /var/lib/chronograf/chronograf-v1.db: permission denied"
   ```

4. Depending on how the container was launched, it then either halts or goes into a restart loop (eg if `restart: unless-stopped`).

Currently, there are two solutions to this:

1. The user passes the `CHRONOGRAF_AS_ROOT` environment variable with the value `true`; or
2. The user manually adjusts ownership on the persistent store:

   ```
   $ sudo chown -R 999:999 ./chronograf
   ```

Option 1 defeats the purpose of running with reduced privileges.

Option 2 isn't documented so it is an example of "hidden knowledge". The user has to:

* recognise that the service is not running (which is not always immediately obvious to inexperienced users);
* know to consult `docker logs -f chronograf` (the `-f` being particularly important if the container is in a restart loop);
* be able to interpret the error message correctly (ie that "permission denied" is the critical element);
* realise that changing ownership on the persistent store is the correct response; and
* know to use userID 999 in the `chown`.

It would be preferable if the container handled these situations correctly for itself, which is the main goal of this Pull Request.

This problem does not occur if a *named volume mount* is used rather than a *bind mount*. That is because of the "copy" step whereby Docker recursively copies the internal path to the external path before the Unix-bind-mount association is formed. The last path component of the volume mount (ie the `_data` folder) is then owned by userID 999. Even if `CHRONOGRAF_AS_ROOT` is `true`, root can still write into that folder.

If the container is launched *without* an explicit volume mapping, a new *anonymous volume mount* is created each time the container is recreated, but otherwise behaves the same as a *named volume mount*. This is a side-effect of the Dockerfile declaration:

```
VOLUME /var/lib/chronograf
```

> Removing the `VOLUME` statement would avoid this side-effect. In that case, `/var/lib/chronograf` would only exist inside the container while it was running and would not persist. Neither would there be a steady accumulation of unused anonymous volume mounts.
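
Returning to the bind-mount case: the manual checks behind Option 2 could be sketched as a small pre-flight helper run on the host before first launch. This is an illustration only. `store_hint` is a hypothetical name, not part of the image or of this Pull Request, and it compares ownership rather than testing actual write access. The default of 999 is the `chronograf` userID described above.

```shell
#!/bin/sh
# Sketch only: store_hint is a hypothetical helper, not part of the image.
# It prints the chown command a user would need if the bind-mount directory
# is not owned by the expected userID (default 999, the chronograf user).
store_hint() {
    store="$1"            # external path of the bind mount, e.g. ./chronograf
    want_uid="${2:-999}"  # userID expected to own the store

    # A missing directory is the case where Docker creates it as root.
    if [ ! -d "$store" ]; then
        echo "missing: $store (Docker would create it owned by root)"
        return 0
    fi

    # ls -n prints numeric owner IDs; field 3 is the owning userID.
    have_uid="$(ls -nd "$store" | awk '{print $3}')"
    if [ "$have_uid" != "$want_uid" ]; then
        echo "sudo chown -R $want_uid:$want_uid $store"
    fi
}
```

Calling `store_hint ./chronograf` prints nothing when the folder is already owned by userID 999; otherwise it prints the command to run, or notes that the folder does not yet exist.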

Although the default for Docker is to launch the container as root, it is also possible to use either the `-u` option (`docker run`) or `user:` clause (`docker compose`) to have Docker launch the container as some other user. In this situation, with the exception of userID 999, the container will lack the privileges to write to `/var/lib/chronograf` so it will abort with the permission error mentioned above, and the user will also have to know which userID to employ to set up the persistent store.

This Pull Request tries to deal with that possibility by writing a hint into the log. For example, if the container is launched as userID 1000 but doesn't have write permission for `/var/lib/chronograf`, the user would see:

```
You need to change ownership on chronograf's persistent store. Run:
 sudo chown -R 1000:1000 /path/to/persistent/store
```

Signed-off-by: Phill Kelley <34226495+Paraphraser@users.noreply.github.com>
---
 chronograf/1.10/alpine/entrypoint.sh | 15 ++++++++++++---
 chronograf/1.10/entrypoint.sh        | 15 ++++++++++++---
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/chronograf/1.10/alpine/entrypoint.sh b/chronograf/1.10/alpine/entrypoint.sh
index bb8d32b50..3c4869166 100755
--- a/chronograf/1.10/alpine/entrypoint.sh
+++ b/chronograf/1.10/alpine/entrypoint.sh
@@ -9,8 +9,17 @@ if [ "$1" = 'chronograf' ]; then
     export BOLT_PATH=${BOLT_PATH:-/var/lib/chronograf/chronograf-v1.db}
 fi
 
-if [ "$(id -u)" -ne 0 ] || [ "${CHRONOGRAF_AS_ROOT}" = "true" ]; then
-    exec "$@"
-else
+if [ $(id -u) -eq 0 ] ; then
+    if [ "${CHRONOGRAF_AS_ROOT}" != "true" ] ; then
+        chown -Rc chronograf:chronograf /var/lib/chronograf
     exec su-exec chronograf "$@"
+    fi
+    chown -Rc root:root /var/lib/chronograf
+else
+    if [ ! -w /var/lib/chronograf ] ; then
+        echo "You need to change ownership on chronograf's persistent store. Run:"
+        echo " sudo chown -R $(id -u):$(id -u) /path/to/persistent/store"
+    fi
 fi
+
+exec "$@"
diff --git a/chronograf/1.10/entrypoint.sh b/chronograf/1.10/entrypoint.sh
index 8a68b024f..a77de93dc 100755
--- a/chronograf/1.10/entrypoint.sh
+++ b/chronograf/1.10/entrypoint.sh
@@ -9,8 +9,17 @@ if [ "$1" = 'chronograf' ]; then
     export BOLT_PATH=${BOLT_PATH:-/var/lib/chronograf/chronograf-v1.db}
 fi
 
-if [ "$(id -u)" -ne 0 ] || [ "${CHRONOGRAF_AS_ROOT}" = "true" ]; then
-    exec "$@"
-else
+if [ $(id -u) -eq 0 ] ; then
+    if [ "${CHRONOGRAF_AS_ROOT}" != "true" ] ; then
+        chown -Rc chronograf:chronograf /var/lib/chronograf
     exec setpriv --reuid chronograf --regid chronograf --init-groups "$@"
+    fi
+    chown -Rc root:root /var/lib/chronograf
+else
+    if [ ! -w /var/lib/chronograf ] ; then
+        echo "You need to change ownership on chronograf's persistent store. Run:"
+        echo " sudo chown -R $(id -u):$(id -u) /path/to/persistent/store"
+    fi
 fi
+
+exec "$@"