Make-fu

Setting up a Makefile for a Python Project

There’s a common saying among computer scientists that “A good programmer is a lazy programmer”. Coincidentally, there’s also a common saying among mathematicians that “A good mathematician is a lazy mathematician”. Being both a programmer and a mathematician I must be doubly lazy, and thus I often find myself spending lots of time coming up with ways to remove boring repetitive tasks. Because why spend x amount of time on boring tasks when you can spend 10x amount of time trying to avoid spending any time on boring tasks? I’m sure you agree. In the spirit of that line of thought, let’s get into the science of building makefiles.

A makefile is a file, read by the make tool, in which you define small scripts that make your life as a programmer a lot easier. Or, at least, ensure that you spend 10 seconds less on boring tasks. I’ve used them for around a year at this point, and the standard makefiles that I use in my projects have evolved along the way.

In this blog post I’ll be showing parts of my current standard makefile, along with various make-fu lessons that I’ve learnt along the way. At the very least it could give you some makefile code you could copy-paste and use in your own projects, but it could also be that you find some of the lessons I learnt along the way useful when you’re building your own makefiles.

What’s a makefile anyway?

Makefiles have a long history. They date all the way back to 1976 at Bell Labs, and were designed to be used solely for building executable programs. This is still what they’re mostly used for today (I think), but as Python developers we can still make use of a small part of their functionality: creating small handy scripts for our projects.

A makefile is usually just named exactly that, makefile or Makefile, and generally has the following structure:

[make code]

recipe_name:
    [shell code]

another_recipe_name:
    [shell code]

The key bits of a makefile are the recipes, being the shell code blocks with names attached to them. Note that make requires the shell lines of a recipe to be indented with a tab character, not spaces. With such a makefile in our root directory we can then call make recipe_name to run the associated shell script. If we simply call make it will automatically run the first recipe, in this case recipe_name.

Here’s an example of a working makefile:

hello:
    echo "Hello world!"

This contains no make code and contains just a single recipe, hello. If we thus call make hello, or even just make, from the terminal, we get the following output:

$ make
echo "Hello world!"
Hello world!

Making use of @'s

We see that running the recipe prints both the command being run and the output of the command itself. To only print the output, we can prefix each shell script line with an “@”, resulting in the following makefile:

hello:
    @echo "Hello world!"

Running this will now simply output Hello world!.

Loading in environment variables

We can also use variables within makefiles. A common use case, which I’ll also use here in this post, is using environment variables. Usually, these would be stored in a separate .env file, which looks like the following:

VARIABLE_NAME=Some value
ANOTHER_VARIABLE_NAME=Another value

To be able to use these values within a makefile, we add the line include .env to the make code at the top of our makefile. This will load in all the variables stored in the .env file and make them accessible in your recipes. We can then refer to these variables using dollar signs, as follows:

include .env

hello:
    @echo "Hello world!"
    @echo "The first variable value is ${VARIABLE_NAME}"
    @echo "The second variable value is ${ANOTHER_VARIABLE_NAME}"

This would then print out:

Hello world!
The first variable value is Some value
The second variable value is Another value

Recipes I use in most of my projects

The recipes that I’ve implemented and use regularly in my own projects are the following:

  • install, which installs the dependencies of a project and generally sets up the programmer’s environment, so that they are ready to start working on the project straight away.
  • test, which runs all unit tests within the project, checks the test coverage and adds this as a badge in the readme.
  • docs and view-docs, which build the documentation for the project and view the documentation in a browser, respectively.
  • publish-major, publish-minor and publish-patch, which both bump the version of the Python project and publish the project as a Python package on PyPI, the Python Package Index. This assumes the project is following the standard X.Y.Z version scheme, where the three recipes bump X, Y and Z, respectively.

These recipes are of course specific to how I usually set up my Python projects. First and foremost, I use Poetry to manage my dependencies and make the packaging and publishing of the project easier. A combination of pip and setuptools can achieve the same, but I find it a bit more clunky, and I find that pip’s dependency resolver has got quite a lot worse as of late.

For my unit testing I use pytest, which is probably the de facto standard unit testing package in Python these days, so nothing special about that. As for my project documentation needs, I’ve recently switched to using the pdoc package for automatically creating documentation for my projects, which I find a much better experience than Sphinx.

Lastly, I’ve started to always sign my Git commits, which ensures that others can trust that my commits are indeed created by me. This may sound a bit strange, but Git allows anyone to change the email and username associated with their commits to anything at all, without verification. Setting up automatic signing of all my commits thus lets others verify that my changes are indeed made by me. To sign your commits you need to have GPG (GNU Privacy Guard) installed, which can be done simply by running apt install gnupg on Ubuntu and brew install gnupg on macOS (Windows is a bit more complicated). With GPG installed, you can create a new GPG key by running the command

$ gpg --full-generate-key

make install

This recipe is really the main recipe of my makefile, which I’ve changed and tuned quite a lot during the last handful of projects I’ve been working on. I’m quite pleased with how it turned out, but of course I might decide to change it again in the future.

This recipe does several things:

  1. It checks if poetry is installed already, and if not then it downloads it and installs it.
  2. It checks if GPG is installed, and warns the user if it’s not
  3. It installs all dependencies
  4. It ensures that all environment variables are set and stored in a .env file
  5. It sets up my Git configuration, including setting up commit signing

I’ve split these up into separate recipes, to make it a bit easier to read. The main recipe, install, thus looks like the following:

install:
    @echo "Installing..."
    @if [ "$(shell which poetry)" = "" ]; then \
        $(MAKE) install-poetry; \
    fi
    @if [ "$(shell which gpg)" = "" ]; then \
        echo 'GPG not installed, so an error will occur. Install GPG on macOS with' \
             '`brew install gnupg` or on Ubuntu with `apt install gnupg` and run' \
             '`make install` again.'; \
    fi
    @$(MAKE) setup-poetry
    @$(MAKE) setup-environment-variables
    @$(MAKE) setup-git
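The two tool checks at the start of this recipe can also be expressed in Python, which may be handy if you would rather keep such logic in a script than in shell conditionals. The following is only a sketch: the function name missing_tools is hypothetical and not part of my makefile, and shutil.which plays the role of the which command.

```python
import shutil


def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on the PATH."""
    # shutil.which returns None when no executable with that name is found
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Report the tools that the install recipe relies on but cannot find
    for tool in missing_tools(["poetry", "gpg"]):
        print(f"{tool} is not installed")
```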

make install-poetry

As I mentioned, the first thing we do is check if poetry is already installed, and if not, we install it by calling the install-poetry recipe, which looks like this:

install-poetry:
    @echo "Installing poetry..."
    @curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python3 -
    @$(eval include ${HOME}/.poetry/env)

This downloads the poetry installer, installs it, and updates the PATH environment variable which makes it possible to use the poetry command immediately.

Poetry nugget: Note that poetry has recently updated their URL to the simpler https://install.python-poetry.org, but unfortunately that doesn’t seem to work on M1 MacBooks at this point in time, so I’m sticking with the old one until that gets fixed.

Make nugget: The last line of this recipe is actually make code! You might recognise it from above where we loaded in environment variables using include .env at the top of the makefile. We can run make code within a recipe by running $(eval [some make code]). In this case, Poetry creates a file, ~/.poetry/env, which only contains one line: export PATH="$HOME/.poetry/bin:$PATH". Since this only updates the environment variable PATH, just like our .env file, a simple include does the trick.

make setup-poetry

The next step is to install the project dependencies with poetry. This is relegated to another make recipe, setup-poetry, which looks like this:

setup-poetry:
    @poetry env use python3 && poetry install

This tells poetry which Python interpreter to use, sets up a local virtual environment, and installs all the project dependencies within it. The virtual environment is built as a .venv folder within your project directory, just like the one you would get from the regular python -m venv .venv command.

Poetry nugget: Poetry doesn’t create virtual environments within your projects like that by default, but since that’s the way it is usually done, I found that things run more smoothly when the virtual environments are compatible with the way they are normally set up using pip. To make this work properly, one needs a poetry.toml file in the root, with the following setup:

[virtualenvs]
create = true
in-project = true

make setup-environment-variables

To set up the .env file with environment variables, I created a separate fix_dot_env_file.py Python script, which you can see in full here. It ensures that the environment variables GIT_NAME, GIT_EMAIL, GPG_KEY_ID and PYPI_API_TOKEN are set, and prompts the user for them if they aren’t. In the case of GPG_KEY_ID, it starts by trying to guess your GPG key ID by running the command

$ gpg --list-secret-keys --keyid-format=long | grep sec | sed -E "s/.*\/([^ ]+).*/\1/"

This simply reads out the currently stored GPG key ID, assuming that it has been set up. If this doesn’t return anything then the user is prompted for one. If the user doesn’t supply any, then commit signing will simply not be enabled for this project. The recipe calling this script then looks like this:

setup-environment-variables:
    @poetry run python3 -m src.scripts.fix_dot_env_file

Make nugget: It’s actually important that this Python script is run in a separate make invocation (which is what $(MAKE) starts), as make only reads the variables from the .env file (via the aforementioned include .env line) when it parses the makefile, before any recipe is run. If a recipe changes the contents of the .env file, the rest of that make invocation won’t see the new or changed variables. That took me a while to figure out!
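To give an idea of what such a script involves, here is a minimal sketch. It is not the actual fix_dot_env_file.py: the function names parse_dot_env, guess_gpg_key_id and ensure_variables are hypothetical, it assumes the unquoted KEY=value format shown earlier, and it only returns the first key ID it finds rather than everything the grep/sed pipeline would print.

```python
import re
from pathlib import Path

REQUIRED_VARIABLES = ["GIT_NAME", "GIT_EMAIL", "GPG_KEY_ID", "PYPI_API_TOKEN"]


def parse_dot_env(text):
    """Parse KEY=value lines into a dict, skipping blanks and comments."""
    variables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        variables[name.strip()] = value.strip()
    return variables


def guess_gpg_key_id(gpg_output):
    """Extract a key ID from `gpg --list-secret-keys --keyid-format=long`.

    Mirrors the grep/sed pipeline: on the first line containing "sec",
    take the text after the last slash up to the next space.
    """
    for line in gpg_output.splitlines():
        if "sec" in line:
            match = re.search(r".*/([^ ]+)", line)
            if match:
                return match.group(1)
    return ""


def ensure_variables(path, ask):
    """Make sure every required variable is present in the .env file.

    `ask` is a callable that takes a variable name and returns its value,
    for instance a thin wrapper around the built-in `input`.
    """
    path = Path(path)
    variables = parse_dot_env(path.read_text()) if path.exists() else {}
    for name in REQUIRED_VARIABLES:
        if name not in variables:
            variables[name] = ask(name)
    # Rewrites the whole file, so comments and blank lines are not kept
    path.write_text("".join(f"{k}={v}\n" for k, v in variables.items()))
    return variables
```

Passing the prompt in as the ask argument keeps the file handling separate from the user interaction, which also makes the function easy to try out without typing anything.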

make setup-git

The last part of the install recipe consists of setting up the Git configuration for the project, which is done in the setup-git recipe. It initialises the Git repository and sets up the Git configuration with username, email and whether GPG signing of commits should be enabled or not:

setup-git:
    @git init
    @git config --local user.name ${GIT_NAME}
    @git config --local user.email ${GIT_EMAIL}
    @if [ "${GPG_KEY_ID}" = "" ]; then \
        echo "No GPG key ID specified. Skipping GPG signing."; \
        git config --local commit.gpgsign false; \
    else \
        echo "Signing with GPG key ID ${GPG_KEY_ID}..."; \
        echo 'If you get the "failed to sign the data" error when committing, try running `export GPG_TTY=$$(tty)`.'; \
        git config --local commit.gpgsign true; \
        git config --local user.signingkey ${GPG_KEY_ID}; \
    fi
    @poetry run pre-commit install

A little warning is being shown if GPG signing is enabled, stating how to fix a “failed to sign the data” error. Basically, the GPG_TTY environment variable sometimes needs to be set up before GPG works.

make test

This is one of the simplest recipes in the makefile. It simply looks like this:

test:
    @poetry run pytest && poetry run readme-cov

Aside from calling poetry run pytest, which runs pytest no matter if the virtual environment of the project is activated or not, it also runs the readme-cov command from the readme-coverage-badger package, which generates a coverage badge for your readme file, without having to use any third-party coverage solutions such as Travis CI.

make docs and make view-docs

These two recipes concern the handling of the project documentation, using the pdoc package that I mentioned earlier. The first docs recipe simply creates the documentation, which will be stored in the docs folder, and it will assume that all docstrings in the project will follow the Google format (change this to your needs):

docs:
    @poetry run pdoc --docformat google src/PACKAGE_NAME -o docs
    @echo "Saved documentation."

This assumes that your source code lies in the src/PACKAGE_NAME folder, where PACKAGE_NAME is the name of the package that you’re working in. Again, change this to your own needs.

The next recipe, view-docs, well, views the built documentation. The documentation is a simple HTML file, so this just needs to be opened by the default browser. Unfortunately, opening a file has different commands for each operating system (open for macOS, xdg-open for Linux and cygstart for Cygwin on Windows), so we have to define the appropriate “open command” first, after which we open the file. This can be achieved using the uname shell command, whose output is “Linux” on Linux distributions, “Darwin” on macOS and a string starting with “CYGWIN” under Cygwin on Windows:

view-docs:
    @echo "Viewing API documentation..."
    @uname=$$(uname); \
        case $${uname} in \
            (*Linux*) openCmd='xdg-open'; ;; \
            (*Darwin*) openCmd='open'; ;; \
            (*CYGWIN*) openCmd='cygstart'; ;; \
            (*) echo "Error: Unsupported platform: $${uname}"; exit 2; ;; \
        esac; \
        "$${openCmd}" docs/PACKAGE_NAME.html

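Make nugget: I mentioned above that ${varName} is used to refer to variables that make knows about, such as those loaded from the .env file. Make expands these before handing the line to the shell. For shell variables, i.e., variables that you define within your recipe (such as uname and openCmd in this recipe), we need to escape the dollar sign by doubling it, referring to them as $${uname} and $${openCmd}, so that make passes a single $ on to the shell.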
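For comparison, the same platform dispatch can be sketched in Python. This is an illustration rather than something my makefile uses, and the function name open_command is hypothetical. It mirrors the case statement above by looking for the same substrings in the system name.

```python
def open_command(system_name):
    """Return the command that opens a file with the default application.

    `system_name` is the kind of string that `uname` prints, for example
    "Linux", "Darwin" or something starting with "CYGWIN".
    """
    if "Linux" in system_name:
        return "xdg-open"
    if "Darwin" in system_name:
        return "open"
    if "CYGWIN" in system_name:
        return "cygstart"
    raise ValueError(f"Unsupported platform: {system_name}")
```

In a script you could pass in platform.system() from the standard library, which reports the system name in much the same way as uname does.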

make publish-X

As I build and maintain a fair amount of Python packages, I wanted the makefile to make my life easier when working with these. The first such addition relates to version management of Python projects.

When building a Python package, you usually want a single centralised place where the current version is stored. This tends to cause issues, since the version has to be accessible in two places: in the project configuration, which lies outside the package folder, and as a __version__ variable inside the package folder, so that others can call your_package.__version__ to get the current version. To fix this, I set the version in the centralised pyproject.toml project configuration file, and within my package’s __init__.py file I fetch this version using the following snippet:

import pkg_resources

# Fetches the version of the package as defined in pyproject.toml
__version__ = pkg_resources.get_distribution("your_package").version

Aside from making it easy to fetch the current version, it should also be easy for me to bump the current version when working on a new release. This needs to change the version stated in pyproject.toml, update my change log, and add and commit these changes to the Git history. I set this up in a separate versioning.py Python script which you can check out here, which makes it possible to bump the version by a desired level (major, minor or patch).
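The core of such a bump is small. Below is a minimal sketch of the X.Y.Z arithmetic and of rewriting the version line in pyproject.toml. It is not the actual versioning.py: the function names bump_version and set_version_in_pyproject are hypothetical, it assumes the version is written as a line of the form version = "X.Y.Z", and it leaves out the change log and Git steps.

```python
import re


def bump_version(version, level):
    """Bump an X.Y.Z version string at the given level."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Not an X.Y.Z version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown level: {level!r}")


def set_version_in_pyproject(text, new_version):
    """Replace the first `version = "..."` line in pyproject.toml text."""
    return re.sub(
        r'^version = "[^"]*"',
        f'version = "{new_version}"',
        text,
        count=1,
        flags=re.MULTILINE,
    )
```

Note that bumping a level resets the lower levels to zero, so a minor bump of 1.2.3 gives 1.3.0 rather than 1.3.3.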

For each of these three levels I then have an associated recipe:

publish-major:
    @poetry run python -m src.scripts.versioning --major
    @$(MAKE) publish
    @echo "Published major version."

publish-minor:
    @poetry run python -m src.scripts.versioning --minor
    @$(MAKE) publish
    @echo "Published minor version."

publish-patch:
    @poetry run python -m src.scripts.versioning --patch
    @$(MAKE) publish
    @echo "Published patch version."

These simply bump the version and publish the package. The publishing of the package is then relegated to the publish recipe, which looks as follows:

publish:
    @if [ "${PYPI_API_TOKEN}" = "" ]; then \
        echo "No PyPI API token specified in the '.env' file, so cannot publish."; \
    else \
        echo "Publishing to PyPI..."; \
        poetry publish --build --username "__token__" --password ${PYPI_API_TOKEN}; \
        echo "Published!"; \
    fi

It simply checks if the PYPI_API_TOKEN environment variable exists and isn’t empty, and if that’s the case then it builds and publishes the package simply by running the poetry publish --build command, which does both in one fell swoop (got to love poetry!).

Rounding up

That was quite a mouthful of make and shell code!

But with these recipes, you can now smoothly set up new projects, manage tests and documentation, as well as publish your projects as Python packages. Aside from helping yourself, it also makes it a lot easier for others (collaborators or colleagues) to get started on your code base without having to guess what tools you use for each task.