The fundamental building block of our analysis platform is an analyzer.
Since the static analysis world works in many different languages and can
require many different libraries, each analyzer is its own Docker image, and the
Dockerfile is provided by the analysis author. We provide the analyzer's inputs
in the /analysis/inputs folder (where the list of inputs is determined by a
manifest file), and once the image has finished running, we look for its output
in /analysis/output. Usually, we do this by bind-mounting a directory on the
host to the /analysis folder; when we run in our CI environment on Circle, we
have to fall back to docker cp, since Circle's docker-in-docker solution uses a
remote docker daemon, meaning that the image isn't necessarily running on the
same machine as the code that launched it.
This seems like it'd work, and for a while, it did. But when we started running our client on Linux hosts, we ran into weird issues related to filesystem permissions.
A digression into POSIX filesystem permissions
Before getting into detail, let's explore the typical POSIX filesystem access control model. This model is shared by macOS, BSD, Linux, and other similar operating systems (notably, not Windows). If you're familiar with how they work, including what write and execute permissions mean on a directory, you can skip to the next section.
Each file has an owner, which is stored as a number known as a user ID, and a
group, which is similarly stored as a group ID. The permissions entry for a file
controls who can read, write, or execute the file, and this can be controlled
separately for the owner, for users in the file's group, and for all other users
(AKA 'other'). This is typically represented in a form like rwxr-x---, where
the first triple corresponds to the user, the second corresponds to the group,
and the last triple corresponds to all other users. So for this example file,
the user can read, write, and execute it (rwx); users in the file's group can
read and execute, but not write (r-x); and other users can't read, write, or
execute it (---).
This all makes sense for files, but what about directories? It turns out that the answer there is a bit subtle. For directories:
- Read permissions allow a user to list the names of all the files in a directory.
- Write permissions allow the user to add, delete, and rename files in the directory, but only if the execute bit is set.
- Execute permissions allow cd-ing into the directory.
The exact semantics of the various combinations are complicated, but the important thing is that in order to delete a file in a directory, you either need to own the directory or have write permissions to it.
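As a small illustration of the notation described above, Python's standard stat module can render a numeric mode as a permission string. This is only a sketch: the octal mode 0o750 and the helper name permission_triples are chosen here for illustration and do not come from our codebase.

```python
import stat


def permission_triples(mode: int) -> dict:
    """Split a file mode into its user/group/other permission triples."""
    text = stat.filemode(mode)  # first character is the file type, e.g. '-'
    perms = text[1:]
    return {"user": perms[0:3], "group": perms[3:6], "other": perms[6:9]}


# Illustrative mode: a regular file that its owner can read, write, and
# execute, its group can read and execute, and other users cannot access.
example_mode = stat.S_IFREG | 0o750

print(stat.filemode(example_mode))
print(permission_triples(example_mode))
```

The same three triples apply to directories, but with the directory-specific meanings listed above.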
Our Docker images all use some flavor of Linux as a base, and when running a Linux image on a Linux host with a bind mount, files have the same owner, group, and permissions within the image and on the host machine. And since we don't control what analyzer images do, an analyzer could very well be running as root inside its image. Which means that on the host, the output files will be owned by root. And so, if the output files contain a nonempty directory, we won't be able to delete them afterwards.
To see why, suppose that the mounted path on the host is /tmp/data, which is
owned by the user running our CLI (whom we'll call Alice). Then, suppose that
when Alice runs the analyzer, it outputs a file named bar in a directory named
foo under the mounted folder inside the image, and that this file and its
containing directory are both owned by root. Then on the host, we'll have a
directory /tmp/data/foo that's owned by root, and a file inside it named bar
that's also owned by root.
Then on the host, in order to delete /tmp/data, we'll first have to delete
/tmp/data/foo, and in turn that requires deleting /tmp/data/foo/bar, since
you can't delete a nonempty directory. But both /tmp/data/foo/bar and its
containing folder are owned by the root user, not Alice, and we can't rely on
Alice having write permissions to that folder, so we can't delete it!
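The Alice example can be checked against a deliberately simplified model of the POSIX rule that removing a directory entry needs write and execute permission on the containing directory. Everything here is an assumption made for illustration: the UID/GID values, the 0o755 modes, and the names Directory and can_remove_entries. The model ignores root's ability to bypass checks, the sticky bit, and ACLs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Directory:
    owner_uid: int
    group_gid: int
    mode: int  # permission bits only, e.g. 0o755


def can_remove_entries(d: Directory, uid: int, gids: frozenset) -> bool:
    """Simplified check: may this (non-root) user delete entries inside d?"""
    if uid == d.owner_uid:
        bits = (d.mode >> 6) & 0o7   # owner triple
    elif d.group_gid in gids:
        bits = (d.mode >> 3) & 0o7   # group triple
    else:
        bits = d.mode & 0o7          # other triple
    # Removing an entry needs both write (2) and execute/search (1).
    return (bits & 0o3) == 0o3


ALICE_UID, ALICE_GID = 1000, 1000  # hypothetical IDs
alice_groups = frozenset({ALICE_GID})

tmp_data = Directory(owner_uid=ALICE_UID, group_gid=ALICE_GID, mode=0o755)
foo = Directory(owner_uid=0, group_gid=0, mode=0o755)  # created by root in the container

print(can_remove_entries(tmp_data, ALICE_UID, alice_groups))  # entries of /tmp/data
print(can_remove_entries(foo, ALICE_UID, alice_groups))       # entries of /tmp/data/foo
```

Under these assumptions Alice may remove entries from /tmp/data but not from /tmp/data/foo, so bar stays, foo stays nonempty, and the cleanup fails.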
This didn't show up as a problem earlier for us for two reasons:
- Most of our developers use macOS on their daily workstations. Docker on macOS doesn't map filesystem permissions/ownership in the same way; instead, the way it all shakes out is that everything will be automatically owned by the user that ran our CLI, so we don't have any problems to begin with.
- Even in cases where we were running on Linux, most of our analyzers at the time would only output a single output.json file, which gets mapped onto /tmp/data/output.json on the host. And that doesn't cause problems, since the CLI user owns /tmp/data and you can delete files in directories that you own.
What doesn't work
Running as the CLI user
Docker has options for running the entrypoint/command as a given user, so one
might think that we could just get the user running the r2c CLI and run the
entrypoint as that user. But an analysis image might install various software as
a user; for example, many of our own analyzers create an analysis user and
install software from NPM as that user. And we don't want to require that
analysis authors make sure everything they set up internally is world-readable.
This also has another problem: while the UID and GID are shared between the
Docker container and the host machine, user names aren't. This is because the
mapping between UID and username is stored in the /etc/passwd file, which
isn't shared between the host and the Docker container. So if software tries to
look up the name of the current user, it'll fail, which can have surprising
and/or amusing effects:
    $ me=$(id -u) # get the ID of the host's current user
    $ echo $me
    501
    $ docker run --user $me debian:latest whoami
    whoami: cannot find name for user ID 501
    $ docker run -it --user $me debian:latest /bin/bash
    I have no name!@4c4fb54624c4:/$
Since this situation of not having a valid username accounts for a very small percentage of use cases in most software, we expected it might trigger interesting edge cases in external software. Combined with the burden it'd place on analysis authors, that led us to reject this approach.
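To make the failure mode concrete, here is a minimal sketch of the kind of lookup that breaks inside such a container, using Python's standard pwd module (Unix only). The helper name username_for is made up for this example; software that doesn't guard against the missing entry the way this helper does would crash instead.

```python
import os
import pwd
from typing import Optional


def username_for(uid: int) -> Optional[str]:
    """Return the user name the passwd database maps uid to, or None."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # No passwd entry for this UID: the situation whoami reports above.
        return None


name = username_for(os.getuid())
print(name if name is not None else "no name for this UID")
```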
Pass the UID/GID into the docker image
Docker has support for passing environment variables into an image at build time. So we could pass the UID, GID, and username of the user running the CLI into the image and require that analysis authors use that UID, GID, and username when setting up their image. However, this would mean that the end user of the analysis image would have to rebuild it the first time they run it, and that's a bad user experience. It also puts additional burden on analysis authors, and bugs related to users failing to do this would be hard to track down.
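For reference, a sketch of what this rejected approach would have looked like on the CLI side, using docker build's --build-arg flag. The argument names HOST_UID, HOST_GID, and HOST_USER are hypothetical, and every analyzer's Dockerfile would have to declare matching ARG lines and use them when creating its user.

```python
from typing import List


def build_command(image_tag: str, context_dir: str,
                  uid: int, gid: int, username: str) -> List[str]:
    """Build the argv for `docker build`, passing the host user's identity."""
    return [
        "docker", "build",
        "--build-arg", f"HOST_UID={uid}",        # hypothetical ARG names
        "--build-arg", f"HOST_GID={gid}",
        "--build-arg", f"HOST_USER={username}",
        "--tag", image_tag,
        context_dir,
    ]


# Illustrative values only.
print(build_command("example/analyzer:latest", ".", 501, 20, "alice"))
```

Because the values differ per machine, the image can't be built once and shared, which is exactly the rebuild cost described above.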
Instead of using bind mounts, we could just use volumes and then use docker cp
to copy data out of the filesystem. We're already doing this in the event of a
remote Docker daemon, such as if we're running in CI; in that case, you can't
use a bind mount since the Docker daemon won't even be on the same physical
machine! But volume mounts are less performant than bind mounts, and some of our
analyzers can output hundreds of megabytes, or even gigabytes, of data.
What does work
Eventually, we realized something: you can use Docker with bind mounts to change permissions! Specifically, a command like
    docker run --mount type=bind,source=/host/path,target=/vol alpine:3.9 chmod ugo+rwx /vol/rand
lets us change the permissions of a file on the host by mounting it inside a
Docker image and then chmodding it. (Here, ugo+rwx means 'add
read/write/execute permissions to the user, group, and other'.) So we can just
run that on the analyzer image's input before it starts and on the image's
output after it finishes.
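A minimal sketch of how a CLI might assemble that helper command. Some assumptions to flag: the function names are invented for this example, the mount is spelled with an explicit type=bind, and chmod is run recursively (-R) on the whole mount point rather than on a single path.

```python
import shlex
import subprocess
from typing import List


def chmod_via_bind_mount(host_path: str,
                         helper_image: str = "alpine:3.9") -> List[str]:
    """Build a `docker run` argv that opens up permissions under host_path."""
    return [
        "docker", "run", "--rm",
        "--mount", f"type=bind,source={host_path},target=/vol",
        helper_image,
        "chmod", "-R", "ugo+rwx", "/vol",  # -R is an assumption of this sketch
    ]


def fix_permissions(host_path: str) -> None:
    """Run the helper; raises CalledProcessError if Docker reports a failure."""
    subprocess.run(chmod_via_bind_mount(host_path), check=True)


# Only print the command here; calling fix_permissions needs a Docker daemon.
print(shlex.join(chmod_via_bind_mount("/host/path")))
```

The helper container runs as root by default, which is what lets it change files that the CLI user doesn't own.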
... except in the case where we're using a remote docker. In that case, bind
mounts won't work, so this command doesn't work. Not only that, we have another
problem: docker cp makes the files inside the image owned by root
by default and preserves the permissions, and since in many cases the output of
the previous analyzer won't be world-writable, the analysis author won't be able
to write to their input. And some analyses want to be able to do things like
npm install in their input, which requires the ability to write to it.
Fortunately, in this case, we can do something slightly different:
- Create a volume and copy the data from the host into the volume.
- Run the chmod ugo+rwx command, but mounting the volume we just created instead of trying to bind-mount.
- Run the analyzer with that volume mounted in the usual place.
- Fix up the permissions again as usual.
- Copy the files out, and delete the temporary volume we used for all of this.
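The steps above can be sketched as a sequence of Docker CLI invocations. This is one possible spelling, not our actual implementation: the volume and helper container names are invented, the recursive chmod is an assumption, and the never-started helper container exists only because docker cp copies to and from containers rather than volumes.

```python
from typing import List

HELPER_IMAGE = "alpine:3.9"


def remote_workflow(analyzer_image: str, host_input_dir: str,
                    host_output_dir: str, volume: str = "analysis-tmp",
                    helper: str = "analysis-helper") -> List[List[str]]:
    """Return the docker commands for one analyzer run against a remote daemon."""
    mount = f"type=volume,source={volume},target=/analysis"
    chmod = ["docker", "run", "--rm", "--mount", mount,
             HELPER_IMAGE, "chmod", "-R", "ugo+rwx", "/analysis"]
    return [
        # 1. Create a volume and copy the data from the host into it.
        ["docker", "volume", "create", volume],
        ["docker", "container", "create", "--name", helper,
         "--mount", mount, HELPER_IMAGE],
        ["docker", "cp", host_input_dir, f"{helper}:/analysis/inputs"],
        # 2. Open up permissions on the inputs.
        chmod,
        # 3. Run the analyzer with the volume mounted in the usual place.
        ["docker", "run", "--rm", "--mount", mount, analyzer_image],
        # 4. Fix up the permissions again.
        chmod,
        # 5. Copy the files out, then delete the helper and the volume.
        ["docker", "cp", f"{helper}:/analysis/output", host_output_dir],
        ["docker", "rm", helper],
        ["docker", "volume", "rm", volume],
    ]


for command in remote_workflow("example/analyzer:latest", "/tmp/in", "/tmp/out"):
    print(" ".join(command))
```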
And since both the local Docker and remote Docker cases follow the same pattern,
we can abstract all of this in a DockerFileManager interface, instantiate one
based on whether we're running local Docker or remote Docker, and just call
various lifecycle methods on it. There's no need for branching control flow.
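Here is a rough sketch of what such an interface could look like. Only the name DockerFileManager comes from the text above; the method names, the two subclasses, and the factory are hypothetical, and the implementations record the steps they would take instead of invoking Docker so that the shape of the control flow is easy to see.

```python
from abc import ABC, abstractmethod
from typing import List


class DockerFileManager(ABC):
    """Moves analyzer files in and out; this sketch only records its steps."""

    def __init__(self) -> None:
        self.steps: List[str] = []

    @abstractmethod
    def copy_input(self) -> None: ...

    @abstractmethod
    def mount_spec(self) -> str: ...

    @abstractmethod
    def copy_output(self) -> None: ...

    @abstractmethod
    def cleanup(self) -> None: ...

    def fix_permissions(self) -> None:
        # Same chmod helper in both cases; only the mount differs.
        self.steps.append(f"chmod ugo+rwx via --mount {self.mount_spec()}")


class LocalDockerFileManager(DockerFileManager):
    def __init__(self, host_dir: str) -> None:
        super().__init__()
        self.host_dir = host_dir

    def copy_input(self) -> None:
        self.steps.append("inputs already in the bind-mounted directory")

    def mount_spec(self) -> str:
        return f"type=bind,source={self.host_dir},target=/analysis"

    def copy_output(self) -> None:
        self.steps.append("outputs already in the bind-mounted directory")

    def cleanup(self) -> None:
        self.steps.append("nothing to clean up")


class RemoteDockerFileManager(DockerFileManager):
    def __init__(self, host_dir: str, volume: str = "analysis-tmp") -> None:
        super().__init__()
        self.host_dir = host_dir
        self.volume = volume

    def copy_input(self) -> None:
        self.steps.append(f"create volume {self.volume}, docker cp inputs in")

    def mount_spec(self) -> str:
        return f"type=volume,source={self.volume},target=/analysis"

    def copy_output(self) -> None:
        self.steps.append("docker cp outputs out")

    def cleanup(self) -> None:
        self.steps.append(f"remove volume {self.volume}")


def make_manager(remote: bool, host_dir: str) -> DockerFileManager:
    # The only place that distinguishes local from remote Docker.
    if remote:
        return RemoteDockerFileManager(host_dir)
    return LocalDockerFileManager(host_dir)


def run_analysis(manager: DockerFileManager, image: str) -> List[str]:
    manager.copy_input()
    manager.fix_permissions()
    manager.steps.append(f"run {image} with --mount {manager.mount_spec()}")
    manager.fix_permissions()
    manager.copy_output()
    manager.cleanup()
    return manager.steps


for remote in (False, True):
    print(run_analysis(make_manager(remote, "/tmp/data"), "example/analyzer:latest"))
```

The caller in run_analysis never asks which kind of Docker it is talking to; the choice is made once, when the manager is constructed.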