This repository contains a docker compose configuration for running TEI Publisher and associated services. Docker compose allows us to orchestrate and coordinate the various services, while keeping each service in its own, isolated environment. Setting up a server via docker compose is fast as everything comes preconfigured and you don't need to install all the dependencies (like Java, eXist-db, Python etc.) by hand. On the downside, it introduces some overhead and may never be as fast as a properly maintained server. For smaller, low-traffic projects docker is a viable and cheap alternative though.
For security reasons, it is recommended to not expose TEI Publisher and eXist-db directly, but instead protect them behind a proxy. The docker-compose configuration therefore sets up an nginx reverse proxy.
The docker compose configuration defines the following services:
- publisher: main TEI Publisher application
- ner: TEI Publisher named entity recognition service
- frontend: nginx reverse proxy which forwards requests to TEI Publisher
- certbot: letsencrypt certbot required to register an SSL certificate
- cantaloupe: a IIIF server for images
In this repository we provide two ways to set up and configure a server:
- simple setup: ideal for local testing or projects which need to build and expose a single application. Installing a server involves a few manual steps like copying files and running a command.
- ansible: fully-automated setup without manual steps. Also suited for projects serving multiple applications, optionally under different domain names. More information can be found below.
Clone this repository to either your local machine or a server you are installing. By default it will build and deploy the main TEI Publisher application from the master branch. The named entity recognition and cantaloupe IIIF services are pulled as ready-made images. If you do not need or want the named entity recognition or cantaloupe service, comment out the corresponding sections in docker-compose.yml, including the depends_on: entries for ner and/or cantaloupe found in the file. TEI Publisher will still work.
By default, the compose configuration will launch the proxy on port 80 of the local host, serving only http, not https. This configuration is intended for testing, not for deployment on a public facing server. To start the services on localhost using the default configuration, run:
docker compose up -d --build
Afterwards you should be able to access TEI Publisher using http://localhost. The cantaloupe IIIF service will be mapped to the path /iiif, so for testing try: http://localhost/iiif/2/test.tif/full/full/0/default.jpg. See below for more information.
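The containers may need a moment before they answer requests. A minimal sketch of a retry helper, assuming curl is installed; the function name wait_for and the retry values in the example are illustrative and not part of this repository:

```shell
#!/bin/sh
# wait_for TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts. Returns 0 on success, 1 otherwise.
wait_for() {
  tries=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    # Only sleep if another attempt will follow.
    if [ "$i" -lt "$tries" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (not executed here): poll the proxy up to 30 times, 2 seconds apart.
# wait_for 30 2 curl -fsS http://localhost/ && echo "TEI Publisher is up"
```

Passing the probe as a command keeps the helper independent of the URL, so the same function could check the /iiif path as well.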
To stop the services again, call:
docker compose down
The default configuration exposes the TEI Publisher application itself. Instead you may want to build and deploy a custom application generated by TEI Publisher. Clone or fork tei-publisher-docker-compose to apply some modifications. By default our configuration uses the included Dockerfile to build the application.
Note: if you already have a Dockerfile in your app repository, change BUILD_CONTEXT in .env to point to the git repository in which your Dockerfile resides. If it is a private repository, it may require an access token (see the commented out examples). You can set tokens in the .env file as well. You may then skip the rest of this chapter and continue reading below.
For the included Dockerfile we use configuration variables to point to the application to be built. The relevant variables start with APP_ and should be set in the .env environment file. The file currently includes this:
# Name of the custom app to include - should correspond to the name of the repository
APP_NAME=tei-publisher-app
# Tag or branch to build
APP_TAG_OR_BRANCH=v8.0.0
# GIT repository to clone the app from
APP_REPO=https://github.com/eeditiones/tei-publisher-app.git
# eXist-db path the root of the server will be mapped to. Specifying a path here
# will map the root to one single app. Change to empty string if you want to expose the
# entire database and also set CONTEXT_PATH=auto below.
ROOT_PATH=/exist/apps/tei-publisher
# App context path: set to 'auto' if the app should be exposed under its full path
# (e.g. /exist/apps/tei-publisher)
CONTEXT_PATH=""
# Name of the server - irrelevant on localhost
SERVER_NAME=example.com
# When building from a Dockerfile located in a private repo, you may need to set access tokens
ACCESS_TOKEN_NAME=xxx
ACCESS_TOKEN_VALUE=yyy
The default settings will build version 8 of TEI Publisher. To build your own custom app instead, change the three variables to point to your git repo, a tag/branch on it (e.g. master), and the name of the app. The latter should correspond to the repository name.
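The three variables can also be changed from the command line rather than in a text editor. The following is only a sketch: the helper set_env_var is hypothetical, and the app name and repository URL in the example are placeholders for your own project.

```shell
#!/bin/sh
# set_env_var FILE KEY VALUE: replace the KEY=... line in FILE with KEY=VALUE,
# or append the line if KEY is not set yet. Comment lines are left untouched.
set_env_var() {
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp)
  awk -v k="$key" -v v="$value" '
    BEGIN { FS = "="; done = 0 }
    $1 == k { print k "=" v; done = 1; next }
    { print }
    END { if (!done) print k "=" v }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example (placeholder values, not executed here):
# set_env_var .env APP_NAME my-edition
# set_env_var .env APP_TAG_OR_BRANCH master
# set_env_var .env APP_REPO https://github.com/example/my-edition.git
```

After changing the values, rebuild the services so that the new app is cloned and built.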
If your app requires additional resources, for example, because you keep your data files in a separate data package, you will have to modify the Dockerfile to include the necessary extra build steps.
We also assume your app is compatible with the libraries used by TEI Publisher 8. If not, you'll have to change the specified versions in the Dockerfile accordingly:
ARG TEMPLATING_VERSION=1.1.0
ARG PUBLISHER_LIB_VERSION=4.0.0
ARG ROUTER_VERSION=1.8.0
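Before changing the versions it can help to list what the Dockerfile currently pins. A small sketch, assuming the versions are declared as ARG lines whose names end in _VERSION as shown above; the function name list_lib_versions is made up for this example:

```shell
#!/bin/sh
# list_lib_versions DOCKERFILE: print NAME=VERSION for every line of the form
# "ARG <NAME>_VERSION=<value>" found in DOCKERFILE.
list_lib_versions() {
  sed -n 's/^ARG[[:space:]]\{1,\}\([A-Z_]*_VERSION=.*\)$/\1/p' "$1"
}

# Example (not executed here):
# list_lib_versions Dockerfile
```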
To test your changed configuration locally, rebuild and restart the services (see commands below).
Rent a cloud server which ideally has docker enabled. There are various offers on the market. A good specification would include 4 GB of RAM and 2 vCPUs, which you can get for less than 10 Euro per month. We recommend selecting a service which already provides a pre-configured docker installation as part of the operating system. Installing docker by hand is easily possible, but addressing security concerns in the right way requires care and a bit of reading. It saves some work if the provider has already dealt with this.
Once you have root access to your server, ssh into it and clone the docker compose configuration repository.
Configuring the server requires four steps:
- modify .env to set the domain name and root path
- enable the HTTP proxy configuration
- launch the service once to acquire an SSL certificate
- enable the HTTPS proxy configuration
Assuming that you have a domain name for the server, edit .env and set SERVER_NAME:
# eXist-db path the root of the server will be mapped to
ROOT_PATH=/exist/apps/tei-publisher
# Name of the server - irrelevant on localhost
SERVER_NAME=example.com
ROOT_PATH should correspond to the database path under which your application is found.
For security reasons we always hide eXist-db and TEI Publisher behind a proxy. The proxy redirects incoming requests to the configured services and blocks everything else. Our setup uses 3 configuration templates:
Template file | Description |
---|---|
localhost.conf.template | default (enabled) for local testing only |
default.conf.off | disabled template for HTTP access |
default.ssl.conf.off | disabled template for HTTPS access |
For the full server setup, we have to disable the localhost configuration and enable the HTTP configuration to acquire an SSL certificate. Once this is done, we can enable the HTTPS configuration as well.
Go to the nginx/templates directory and either remove localhost.conf.template or rename it to localhost.conf.off, then rename default.conf.off to default.conf.template.
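The two renames can be scripted so that they are safe to repeat. A minimal sketch using the file names from the table above; the function name enable_http_template is an assumption of this example, not something shipped with the repository:

```shell
#!/bin/sh
# enable_http_template DIR: switch DIR (e.g. nginx/templates) from the
# localhost configuration to the plain HTTP configuration.
enable_http_template() {
  dir=$1
  # Disable the localhost template if it is still active.
  if [ -f "$dir/localhost.conf.template" ]; then
    mv "$dir/localhost.conf.template" "$dir/localhost.conf.off"
  fi
  # Enable the HTTP template if it is still disabled.
  if [ -f "$dir/default.conf.off" ]; then
    mv "$dir/default.conf.off" "$dir/default.conf.template"
  fi
}

# Example (not executed here):
# enable_http_template nginx/templates
```

The sketch renames rather than removes the localhost template, so you can switch back to local testing later.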
Before we can fully access the server, we need to get an SSL certificate. To do so:
- Start the services (docker compose up -d)
- Run the following command to request an SSL certificate for your domain, replacing the final example.com with your domain name:
  docker compose run --rm certbot certonly --webroot --webroot-path /var/www/certbot/ -d example.com
  This will ask you for an email address, verify your server and store certificate files into certbot/conf/.
Rename nginx/templates/default.ssl.conf.off to nginx/templates/default.ssl.conf.template, thus enabling the SSL configuration.
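Enabling the HTTPS template before a certificate exists would presumably leave nginx unable to start, so it can be worth guarding the rename. The sketch below assumes that certbot/conf/ contains certbot's usual live/<domain>/fullchain.pem layout and that the current user may read it; the function name enable_ssl_template is hypothetical:

```shell
#!/bin/sh
# enable_ssl_template DOMAIN [BASE]: enable the HTTPS template, but only if a
# certificate for DOMAIN is present. BASE is the checkout directory (default: .).
# Assumption: certbot/conf/ holds certbot's usual live/<domain>/ layout.
enable_ssl_template() {
  domain=$1
  base=${2:-.}
  cert="$base/certbot/conf/live/$domain/fullchain.pem"
  if [ ! -f "$cert" ]; then
    echo "no certificate found at $cert" >&2
    return 1
  fi
  if [ -f "$base/nginx/templates/default.ssl.conf.off" ]; then
    mv "$base/nginx/templates/default.ssl.conf.off" \
       "$base/nginx/templates/default.ssl.conf.template"
  fi
}

# Example (not executed here):
# enable_ssl_template example.com && docker compose restart
```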
Now restart the services:
docker compose restart
Build (without cache) and start all services:
docker compose build --no-cache && docker compose up -d
Display log output of the TEI Publisher service (i.e. eXist-db logs):
docker compose logs publisher
Stop all services:
docker compose down
The two most important configuration settings can be found in .env:
# eXist-db path the root of the server will be mapped to. Specifying a path here
# will map the root to one single app. Change to empty string if you want to expose the
# entire database and also set CONTEXT_PATH=auto below.
ROOT_PATH=/exist/apps/tei-publisher
# App context path: set to 'auto' if the app should be exposed under its full path
# (e.g. /exist/apps/tei-publisher)
CONTEXT_PATH=""
The ROOT_PATH variable specifies which database URLs will be mapped to the root, in other words: what users will see if they access the root of the server (e.g. http://localhost). Using the setting above, users will be presented with the entry page of the TEI Publisher app. Other applications running on the database cannot be accessed.
If, for testing purposes, you would like to expose the entire database, set ROOT_PATH to the empty string. In this case you should also change the CONTEXT_PATH setting in the publisher section of docker-compose.yml to the value auto:
# Set to 'auto' if the app should be exposed under its full path (e.g. /exist/apps/tei-publisher)
# use empty string if the app is mapped to the root of the server (see nginx config below)
CONTEXT_PATH=auto
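The two settings only make sense in certain combinations. As a rough sanity check, the following sketch accepts the two combinations described above and flags everything else as probably unintended; the function name check_path_settings and its messages are invented for this example:

```shell
#!/bin/sh
# check_path_settings ROOT_PATH CONTEXT_PATH: return 0 for the two combinations
# described above, print a warning and return 1 for anything else.
check_path_settings() {
  root=$1
  context=$2
  if [ -z "$root" ] && [ "$context" = "auto" ]; then
    echo "entire database exposed"
  elif [ -n "$root" ] && [ -z "$context" ]; then
    echo "single app mapped to root: $root"
  else
    echo "unexpected combination: ROOT_PATH='$root' CONTEXT_PATH='$context'" >&2
    return 1
  fi
}

# Example (not executed here):
# check_path_settings /exist/apps/tei-publisher ""
```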
The Let's Encrypt SSL certificate issued above is only valid for a limited time and needs to be renewed from time to time. We'll thus install a cron job that calls the script certbot-renew.sh once every day to check if the certificate needs to be renewed.
Register a cron job to call this script once a day. Call crontab -e and add a line:
59 18 * * * /root/my-edition-docker/certbot-renew.sh
replacing /root/my-edition-docker with the correct path to wherever you cloned the configuration.
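If you prefer not to edit the crontab interactively, the entry can be generated and installed by a script. This is a sketch under the assumption that your cron implementation accepts a new crontab on standard input (crontab -), which common implementations do; both function names are made up for this example:

```shell
#!/bin/sh
# renew_cron_line DIR: print the crontab entry that runs DIR/certbot-renew.sh
# every day at 18:59 (the schedule used above). A trailing slash is dropped.
renew_cron_line() {
  printf '59 18 * * * %s/certbot-renew.sh\n' "${1%/}"
}

# install_renew_cron DIR: append the entry to the current user's crontab
# unless an identical line is already present.
install_renew_cron() {
  line=$(renew_cron_line "$1")
  current=$(crontab -l 2>/dev/null || true)
  if printf '%s\n' "$current" | grep -Fqx "$line"; then
    return 0
  fi
  {
    if [ -n "$current" ]; then
      printf '%s\n' "$current"
    fi
    printf '%s\n' "$line"
  } | crontab -
}

# Example (not executed here):
# install_renew_cron /root/my-edition-docker
```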
If you would like to create regular backups of the data in your eXist-db:
- edit docker-compose.yml and enable the volume mapping for /exist/backup:
  # uncomment to map eXist-db backups to local directory
  - ./backup:/exist/backup
- retrieve the eXist-db configuration file from the running docker container with
  docker compose cp publisher:/exist/etc/conf.xml .
- edit conf.xml and find the section referring to consistency checks and backups. Uncomment the system job, specify the backup directory and a time (cron syntax) to trigger the backup:
  <job type="system" name="check1" class="org.exist.storage.ConsistencyCheckTask" cron-trigger="0 0 4 * * ?">
      <parameter name="output" value="/exist/backup"/>
      <parameter name="backup" value="yes"/>
      <parameter name="incremental" value="no"/>
      <parameter name="incremental-check" value="no"/>
      <parameter name="max" value="2"/>
  </job>
- copy the conf.xml back to the docker container:
  docker compose cp conf.xml publisher:/exist/etc
- restart the container:
  docker compose restart publisher
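Once the job has run, it is worth checking that a backup actually arrived in the mapped directory. The sketch below assumes that backups are written as .zip archives into ./backup; the function name latest_backup is hypothetical:

```shell
#!/bin/sh
# latest_backup DIR: print the most recently modified .zip file in DIR,
# or return 1 if there is none.
latest_backup() {
  newest=$(ls -t "$1"/*.zip 2>/dev/null | head -n 1)
  [ -n "$newest" ] || return 1
  printf '%s\n' "$newest"
}

# Example (not executed here):
# latest_backup ./backup || echo "no backup found yet" >&2
```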
The included cantaloupe IIIF server reads images from a file system directory. By default this is mapped to the folder iiif in the same directory as the docker-compose configuration:
volumes:
- ./iiif:/imageroot
To change the directory, replace the path before the ':'.
To access cantaloupe's admin interface, edit docker-compose.yml, set CANTALOUPE_ENDPOINT_ADMIN_ENABLED to true, define a secret and enable the port mapping:
cantaloupe:
  image: uclalibrary/cantaloupe:5.0.5-7
  environment:
    CANTALOUPE_ENDPOINT_ADMIN_ENABLED: false
    CANTALOUPE_ENDPOINT_ADMIN_SECRET: my_admin_pass
  volumes:
    - ./iiif:/imageroot
  # comment in to enable access to cantaloupe on port 8182, including admin interface
  # ports:
  #   - 8182:8182
The admin interface will be available at port 8182: http://localhost:8182/admin.
Ansible is an automation framework for managing servers. It uses an ssh connection between the controlling machine and the server to automatically work through a sequence of setup tasks (the playbook). Benefits:
- no manual steps required
- you can rebuild the same setup any time on the same or different target machines
- we can create more complex setups, e.g. multiple applications running on the same eXist-db database, each accessible through its own domain name. For example, you could expose app1 under https://app1.com and app2 under https://app2.com.
The obvious downside is that you need ansible installed on your local machine. Check the official documentation for instructions.
Once you have ansible (check if you can execute the command ansible-playbook), clone this repository to your local machine (in ansible terms called the control node) and change into the ansible subdirectory.
You obviously also need a server into which you can ssh. It is recommended to add the ssh configuration into your personal .ssh/config. Furthermore you should have a DNS entry mapped to this server before you start.
Nearly all settings are configured via configuration files. First edit ansible/hosts and provide the name of your ssh configuration entry:
all:
  hosts:
    # should point to an ssh configuration entry in your .ssh/config
    workshop
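Since the host name in ansible/hosts has to match an entry in your ssh configuration, a quick check before running the playbook can save a failed run. A minimal sketch; the function name has_ssh_host is an assumption of this example, and it only looks at literal Host lines (no wildcards or Include directives):

```shell
#!/bin/sh
# has_ssh_host NAME [CONFIG]: succeed if CONFIG (default ~/.ssh/config)
# contains a "Host" line that lists NAME literally.
has_ssh_host() {
  awk -v h="$1" '
    tolower($1) == "host" { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
    END { exit(found ? 0 : 1) }
  ' "${2:-$HOME/.ssh/config}"
}

# Example (not executed here):
# has_ssh_host workshop || echo "add a Host entry for workshop to ~/.ssh/config" >&2
```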
Next open ansible/variables.yml to specify the apps to be built and the domains they should be served from. Most of the settings are similar (if not identical) to the ones used for the simple docker-compose setup explained above.
The first setting to pay attention to is the list of apps to build into your docker image:
# list of apps to build. only relevant if context above is .
apps:
  # Name of the custom app to include - should correspond to the name of the repository
  - name: "tei-publisher-app"
    # Tag or branch to build
    tag_or_branch: "v8.0.0"
    # GIT repository to clone the app from
    repo: "https://github.com/eeditiones/tei-publisher-app.git"
The required information includes the name of the app, the tag or branch and the repository. You may list more than one app here and all of them will be included in the Dockerfile ansible generates for you.
Note: If you don't want ansible to generate a Dockerfile, but instead use one you supplied, just change the context variable in the publisher.build section above to point to the repository containing your Dockerfile, e.g.:
publisher:
  build:
    # directory or repository URL containing the Dockerfile to build from
    context: https://github.com/eeditiones/tei-publisher-app.git#master
In this case you do not need to touch the list of apps to be built as it will be ignored.
One or more apps can then be mapped to a domain in the domains section:
# the domains to be served by the proxy. Each domain may map to a different app and root path
# within the same eXist-db instance
domains:
  - name: publisher
    # eXist-db path the root of the server will be mapped to. Specifying a path here
    # will map the root to one single app. Change to empty string if you want to expose the
    # entire database and also set CONTEXT_PATH=auto below.
    root: /exist/apps/tei-publisher
    # the name the server should listen to
    hostname: workshop.teipublisher.com
The name can be any internal name used to identify the mapping. root should contain the URL path into eXist-db which should be mapped to the root of the domain. hostname is the domain name the server should listen to.
The ansible playbook will automatically try to register an SSL certificate for each domain. The certificate service enforces rate limits, so if you request certificates too often during testing, you'll be blocked for a while. To prevent this, set the variable use_staging in the cert section to true. This means your SSL certificates won't be valid and your browser will complain, but you can test whether the process runs through properly.
By default the generated configuration will also start the IIIF image service and the Named Entity Recognition service. If you don't need those, disable them in the services section:
# which optional services should be started?
services:
  # IIIF image service
  iiif: false
  # Named entity recognition
  ner: true
Once you are done editing the configuration, launch the ansible playbook with the following command:
ansible-playbook -i hosts site.yml
You will be prompted to specify a password for the eXist-db admin user. Choosing a strong admin password is highly recommended.
Note: the generated Dockerfile will be saved to ansible/Dockerfile.generated. If you would like to use it for local testing, replace the Dockerfile in the parent directory with this one. You can then test on your local machine as described above.
If you wish to update all apps to newer versions, but leave the generated docker and nginx configurations untouched, call
ansible-playbook -i hosts update.yml
Note that this will explicitly remove any data added to the database by users and rebuild the docker image!
To rebuild everything without having to renew the SSL certificate, call
ansible-playbook -i hosts --skip-tags clean,cert site.yml
This will prevent the existing directory from being removed and leaves the SSL certificate untouched. The docker and nginx configurations will still be recreated.
Run checks only:
ansible-playbook -i hosts --tags check site.yml
(Re-)generate Dockerfile locally:
ansible-playbook -i hosts --tags dockerfile site.yml