I’ve been using Postgres in Docker for continuous integration. This lets you script a long-running data setup once, publish the result as an image, and test both up and down migrations between versions.
Here are the lessons learned:
- If you want to publish data with the image, copy the Dockerfile for the version of Postgres you use and comment out the “VOLUME” instruction – this forces the data directory to be retained in the image itself rather than in an anonymous volume (see the Dockerfile sketch after this list).
- In my experience these copied Dockerfiles break with regularity as upstream changes, so you’ll need to keep yours updated from the source repository.
- If you want to enforce security, your best option is to change the arguments to initdb in docker-entrypoint.sh (e.g. --auth=md5 --auth-host=md5 --auth-local=md5). This may be necessary if you want to use dblink to connect two databases, since dblink won’t let non-superusers connect without password authentication (sketch below).
- If you use unlogged tables, you’ll need to shut the database down cleanly with “docker container stop” before running “docker commit”. Without this, the committed image looks like a crashed instance to Postgres, and crash recovery truncates unlogged tables – you’ll lose all their data (example below).
- If you use semver, I find it helpful to publish each snapshot under multiple tags (e.g. 1.2.3, 1.2.LATEST, 1.LATEST, LATEST). This allows downstream consumers to relax their requirements – e.g. in the case of migrations, it may be valuable to test a migration against each point release of a prior snapshot (tagging example below).
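
Here’s a minimal sketch of the VOLUME change. The surrounding lines are illustrative, not a verbatim copy of the upstream Dockerfile (https://github.com/docker-library/postgres), and the base image shown is assumed:

```dockerfile
FROM debian:stretch-slim

# ...everything else from the upstream Dockerfile stays as-is...

ENV PGDATA /var/lib/postgresql/data
# VOLUME /var/lib/postgresql/data
# ^ commented out so the data directory lives in the image layers,
#   where "docker commit" can capture it; an anonymous volume would
#   be silently dropped from the committed image.

ENTRYPOINT ["docker-entrypoint.sh"]
CMD ["postgres"]
```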
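
For the initdb change, the exact invocation inside docker-entrypoint.sh varies by image version, so treat this as a sketch of the edit rather than a verbatim excerpt:

```sh
# In your copy of docker-entrypoint.sh, append the auth flags to the
# existing initdb call (the surrounding script is assumed, not quoted):
initdb --username=postgres --auth=md5 --auth-host=md5 --auth-local=md5
```

The official image also reads a POSTGRES_INITDB_ARGS environment variable and appends it to initdb, which can spare you the script edit entirely.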
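
A sketch of the snapshot sequence, with container and image names assumed:

```sh
# Stop cleanly so Postgres checkpoints its data; committing a running
# container looks like a crash on next startup, and crash recovery
# truncates unlogged tables.
docker container stop pg-ci
docker commit pg-ci myregistry/pg-snapshot:1.2.3
```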
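
And a sketch of the multi-tag publish, again with a hypothetical registry and image name:

```sh
# Publish the same image under progressively looser tags so consumers
# can pin as tightly or as loosely as they need.
for tag in 1.2.3 1.2.LATEST 1.LATEST LATEST; do
  docker tag myregistry/pg-snapshot:1.2.3 "myregistry/pg-snapshot:${tag}"
  docker push "myregistry/pg-snapshot:${tag}"
done
```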