Mar 4, 2022
A general and less tooling-intensive approach to automating routine project tasks.
It’s fairly common for the authors and maintainers of a software project to accumulate routine tasks as the project proceeds: building the project, running the tests and other checks, publishing releases, and so on.
Many projects also sprout more esoteric, project-specific tasks - as I write this, I’m working on one where “delete an AMI from AWS” is one of the things I’ve had to automate.
Many software projects also end up relying on a build tool - make, classically, but there are myriad others. These are tools that are designed to solve specific classes of routine problems (in fact, generally the ones described above).
I’ve often seen these two factors converge, in the form of a build tool script that contains extensive task-specific support, or, in some cases, a build tool such as make pressed into service as a generic task runner that does none of the work it was designed to do in the first place.
This has a couple of consequences.
First, the configuration language for these tools is invariably unique to the tool. make is the least prone to this, but Makefiles require careful understanding of both shell and make variables, Make’s unique implicit rule system, and the huge number of variables and rules implied by the use of make in the first place.
Adding utility tasks to projects using scons, Gradle, Cargo, Maven, or any other more modern build tool involves similar assumptions, and often a very specific programming language and execution model. Even tools like Gradle, which use a general-purpose language, impose a unique dialect of that language, intended to facilitate the kinds of build tasks the tool is designed around.
So, writing and understanding these utility tasks requires specific skills, orthogonal to the skills needed to understand or implement the project itself.
Second, it creates a dependency on that tool for those tasks. They can only be executed automatically when the tool is available, and can only be executed by hand (with all that entails), or reimplemented, when the tool is absent. Build tools are often not expected in the end-user environment, so projects end up needing different approaches for tasks that might run in both development and end-user environments versus tasks that run in only one of the two.
To address those consequences, I’ve started putting routine tasks into my projects as shell scripts, in a tools directory.
The shell is a widely-deployed, general-purpose automation tool that handles many routine administration tasks well. For software which will be delivered to a unix-like end environment, developers can be confident that a shell will be present, and the skills to write basic shell scripts are generally already part of a unix programmer’s repertoire.
These scripts follow a few principles to ensure that they remain manageable:
Always use set -e, and use set -o pipefail if appropriate. A tool that needs to run multiple steps should abort, naturally and automatically, if any of those steps fail.
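For instance, a preamble along these lines aborts the script the moment any step fails, including a step buried inside a pipeline (the cargo invocation and log file here are hypothetical):

#!/bin/bash
set -e           # abort if any command exits non-zero
set -o pipefail  # ...including a command that fails mid-pipeline

# Without pipefail, a failing `cargo test` would be masked by `tee`,
# which exits zero; with it, the script aborts as intended.
cargo test 2>&1 | tee test.log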
Usually chdir to the project’s root directory (generally with cd "$(dirname "$0")/.."). Normally, manipulating the cwd makes shell scripts brittle and hard to deploy, but in this use case, it ensures that the script can reliably manipulate the project’s code. This step can be skipped if the tool doesn’t do anything with the project, but it’s usually worth keeping it anyways just for consistency.
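The effect is that relative paths in the script always resolve against the project root, no matter where the script is invoked from. A hypothetical tools/clean, for example:

#!/bin/bash -e
# cd to the checkout root, wherever this script was invoked from.
cd "$(dirname "$0")/.."

# Relative paths now resolve against the project root, so this works
# whether the script was run as tools/clean, ../tools/clean, or via PATH.
rm -rf site/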
Include a block comment near the top with a usage summary:
## tools/example
## tools/example FILENAME
##
## Runs the examples. If FILENAME is provided, then only the
## example in FILENAME will actually be executed.
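Because these summaries follow a fixed convention, a script can extract them mechanically. This tools/help sketch is hypothetical, not part of the convention itself:

#!/bin/bash -e
cd "$(dirname "$0")/.."
## tools/help TOOL
##
## Prints the usage summary for the named tool script.
grep '^##' "tools/$1" | cut -c4-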
Minimize the use of arguments. Ideally, take either no arguments, exactly one argument, or an arbitrary list of arguments. Under no circumstances use options or flags.
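A script that takes an arbitrary list can usually just forward "$@" to the underlying command. This sketch assumes a hypothetical examples/run-all runner:

#!/bin/bash -e
cd "$(dirname "$0")/.."
## tools/example [FILENAME]...
##
## Runs the examples. If FILENAMEs are provided, only those examples
## will be executed.
exec examples/run-all "$@"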
Minimize the use of shell constructs other than running commands. This means minimizing the use of if, for, while, and set constructs, shell functions, $() and backticks, complex pipelines, eval, reading from the environment, and so on. This is not a hard and fast rule; the underlying intent is to keep each tool script as straightforward and as easily understood as possible.
Where a tool needs to run multiple steps, generally break those steps out into independent tool scripts. Processes are cheap; clarity is valuable.
More generally, the intent is for tool scripts to encapsulate the commands needed to perform routine tasks, not to be complex software in their own right.
If you’re using a project-specific shell environment (with direnv or similar), add tools to your PATH so that you can run tool scripts from anywhere in the project. Being able to type build and have the project build, regardless of where you’re looking at that moment, is very helpful. For direnv’s .envrc, this can be done using PATH_add tools.
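In the simplest case, the whole .envrc is that one line:

# .envrc, at the project root
PATH_add tools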
No approach is perfect, and this approach has its own consequences:
Tasks can’t participate in dependency-based ordering on their own. They’re just shell scripts. This can be troubling if a tool does something that needs to take place during ordinary builds.
There’s always the temptation to Add Features to tools scripts, and it takes steady effort and careful judgment as to how to do so while hewing to the goals of the approach. For example, with Docker, I’ve had situations where I end up with two tools with nearly-identical Docker command lines (and thus code duplication to maintain) because I preferred to avoid adding an optional debug mode to an existing tools script.
If you’re not using direnv, the tools directory is only easily available to your shell if your pwd is the project root. This is awkward, and sh derivatives don’t provide any convenient way to “search upwards” for commands.
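One workaround, sketched here rather than part of the approach itself: a bash function in your own shell configuration that searches upwards for a tools directory:

# In ~/.bashrc or similar. `t build` runs tools/build from the nearest
# enclosing project, wherever you are inside it.
t() {
    local dir="$PWD"
    while [ "$dir" != / ]; do
        if [ -x "$dir/tools/$1" ]; then
            "$dir/tools/$1" "${@:2}"
            return
        fi
        dir="$(dirname "$dir")"
    done
    echo "t: no tools/$1 found above $PWD" >&2
    return 1
}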
Tool scripts need to take care to set their own pwd appropriately, which can take some thinking. What directory is appropriate depends on what the tool script needs to accomplish, so the right answer isn’t always obvious. A convention of “always cd to the project root” covers most cases, but there are exceptions where the user’s pwd or some other directory may be more appropriate.
This site is built using two tools. tools/build:
#!/bin/bash -e
cd "$(dirname "$0")/.."
## tools/build
##
## Converts the content in docs/ into a deployable website in site/
exec mkdocs build
And tools/publish:
#!/bin/bash -e
cd "$(dirname "$0")/.."
## tools/publish
##
## Publishes site/ to the S3 bucket hosting grimoire.ca
exec aws s3 sync --delete site/ s3://grimoire.ca/
Another project I’m working on has a tool to run all CI checks - tools/check:
#!/bin/bash -ex
# tools/check
#
# Runs all code checks. If you're automating testing, call this rather than
# invoking a test command directly; if you're adding a test command, add it here
# or to one of the tools called from this script.
tools/check-tests
tools/check-lints
tools/check-dependencies
This, in turn, runs the following:
tools/check-tests:
#!/bin/bash -ex
# tools/check-tests
#
# Checks that the code in this project passes correctness checks.
cargo build --locked --all-targets
cargo test
tools/check-lints:
#!/bin/bash -ex
# tools/check-lints
#
# Checks that the code in this project passes style checks.
cargo fmt -- --check
cargo clippy -- --deny warnings
tools/check-dependencies:
#!/bin/bash -ex
# tools/check-dependencies
#
# Checks that the dependencies in this project are all in use.
cargo udeps --locked --all-targets
Yet another project uses a tool to run tests against a container:
#!/bin/bash -e
cd "$(dirname "$0")/.."
## tools/check
##
## Runs tests on the docker image.
VERSION="$(tools/detect-version)"
py.test \
    --image-version "${VERSION}"
As an alternative to ad-hoc scripts, Casey Rodarmor’s just stores shell snippets in a project-wide Justfile. This allows for reliable generation of project-specific help and usage information from the Justfile, and allows just foo to work from any directory in the project without project-specific shell configuration.
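As a sketch, this site’s two tools might look like this as a Justfile (assuming the same mkdocs and aws commands as above):

build:
    mkdocs build

publish: build
    aws s3 sync --delete site/ s3://grimoire.ca/

Note that publish can declare a dependency on build - the dependency-based ordering that bare tool scripts lack.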
On the other hand, it’s another tool developers would need to install to get started on a project, and Justfile entries can’t be run without just (a problem make also has).