# How do I configure a Qanary pipeline using Docker?
In the simplest case, you can start all relevant services (pipeline, components, and triplestore) in the same Docker network (most likely `host`) and connect the services using default parameters only. However, in most production use cases the parameters used for networking need to be changed, most importantly the host, port, and triplestore settings. This guide will go over different configurations that might be necessary for the Qanary pipeline (using the reference implementation).
There are several properties which might be configured, depending on the context:
- `server.host`: the host on which the pipeline will be listening -- for external configuration, the name is `SERVER_HOST`
- `server.port`: the port on which the pipeline will be listening -- for external configuration, the name is `SERVER_PORT`
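As an illustration, a minimal sketch of these two settings in the pipeline's `application.properties` could look like this (the host name and port are placeholder values):

```properties
# placeholder values -- adjust to the actual host and port of your pipeline
server.host=http://example.pipeline
server.port=8080
```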
If a Stardog triplestore is used, then the following properties need to be defined:
- `stardog.url` -- for external configuration, the name is `STARDOG_URL`
- `stardog.username` -- for external configuration, the name is `STARDOG_USERNAME`
- `stardog.password` -- for external configuration, the name is `STARDOG_PASSWORD`
Note: You may implement your own triplestore connector that requires different properties.
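For illustration, the corresponding entries in `application.properties` might look like the following sketch (endpoint and credentials are placeholders, matching the Docker examples further below):

```properties
# placeholder values -- adjust to your Stardog instance
stardog.url=http://example.triplestore:5820/
stardog.username=admin
stardog.password=admin
```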
Additionally, the property `qanary.triplestore` provides a fallback option if:
- the Stardog triplestore is running on a different host than the pipeline, or
- you cannot define your own triplestore and no other options are available.
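If you rely on this fallback, you could pass the property to the container as well; assuming Spring Boot's relaxed binding, the corresponding environment variable name would be `QANARY_TRIPLESTORE` (the endpoint below is a placeholder):

```shell
# hypothetical example -- assumes qanary.triplestore takes the triplestore endpoint URL
docker run -e QANARY_TRIPLESTORE=http://example.triplestore:5820/ qanary/qanary-pipeline:latest
```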
SSL settings can be configured using the following properties:
- `server.ssl.enabled`: (boolean) enable SSL
- `server.ssl.key-store`: the path to the key store containing the certificate, e.g., `classpath:keystore.p12`
- `server.ssl.key-store-password`: the password used to access the key store
- `server.ssl.key-store-type`: the type of key store (JKS or PKCS12)
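As a sketch, assuming a PKCS12 key store bundled on the classpath (file name and password are placeholders), the SSL-related entries in `application.properties` might be:

```properties
# placeholder key store and password -- use your own certificate
server.ssl.enabled=true
server.ssl.key-store=classpath:keystore.p12
server.ssl.key-store-password=changeit
server.ssl.key-store-type=PKCS12
```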
Note: You can find more information about enabling SSL in this guide: How-do-I-improve-the-security-of-my-implementation
To override the default values set in `application.properties`, the use of environment variables is encouraged. This is further described in this guide: How do I configure Qanary services using Docker containers
## Port configuration
Usually, Docker containers are self-contained. To access an application running on port `8080` inside a container, you need to map this port to one on your local machine with `-p <machine>:<container>` so that it can be reached from outside (i.e., the Internet).

Alternatively, you may run the service inside the Docker network `host`, which automatically publishes the service on a host port matching the internal port. In that case, you need to override the default value of `server.port` to change where your Qanary pipeline will be available inside this network!
Example:
- the pipeline is listening on port `8080` (default, as specified by property `server.port`) inside the Docker container,
- it needs to be available on port `8000` on the Internet,
- you have two options:
  - map the internal port `8080` to your server port `8000` with `docker run -p 8000:8080 qanary-pipeline:latest` (recommended for production), OR
  - change the `server.port` to `8000` and run the pipeline in the `host` network with `docker run -e SERVER_PORT=8000 --net host qanary-pipeline:latest`
Note: Using the `host` network is only encouraged if all services (i.e., pipeline, components, and triplestore) can be started in this mode as well (see the section below)!
## Host configuration
In a production environment, it might not be possible to start all components and the pipeline in one `host` network. In such a case, `http://localhost` is not an option for networking. Here, the property `server.host` needs to reflect the actual host where the Qanary pipeline service is running, so that the correct address of the pipeline can be communicated to external services if required (for example, when loading local resources with SPARQL queries).

```shell
docker run -e SERVER_HOST=http://example.pipeline example-pipeline:latest
```
For more information about networking between the Qanary pipeline and components, please see the guide: How-do-I-configure-a-Qanary-component-using-Docker?
The configurations shown above using the standard `docker run` command can easily be applied in a `docker-compose.yml` file. To start an instance of the latest Qanary pipeline that is listening on `http://example.pipeline:8000` and connecting to a triplestore endpoint at `http://example.triplestore:5820`, the configuration could look like this:
```yaml
version: "3.5"
services:
  pipeline:
    image: qanary/qanary-pipeline:latest
    environment:
      - "SERVER_HOST=http://example.pipeline"
      - "STARDOG_URL=http://example.triplestore:5820/"
      - "STARDOG_USERNAME=admin"
      - "STARDOG_PASSWORD=admin"
      - "QANARY_QUESTIONS_DIRECTORY=/qanary-questions"
    ports:
      - "8000:8080"
    volumes:
      - /data/qanary-questions:/qanary-questions
```
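Assuming this configuration is saved as `docker-compose.yml` in the current directory, the pipeline can then be started with:

```shell
# older Docker installations use "docker-compose up -d" instead
docker compose up -d
```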
Note: If the pipeline, components, and the triplestore are all available on the same `host` network, you might define a pipeline similar to this example:
```yaml
version: "3.5"
services:
  pipeline:
    image: qanary/qanary-pipeline:latest
    environment:
      - "SERVER_PORT=8000"
      - "STARDOG_URL=http://localhost:5820/"
      - "STARDOG_USERNAME=admin"
      - "STARDOG_PASSWORD=admin"
      - "QANARY_QUESTIONS_DIRECTORY=/qanary-questions"
    network_mode: host
    volumes:
      - /data/qanary-questions:/qanary-questions
```
The following command can be used to start a Qanary system from a Docker image without a `docker-compose.yml` (here: using the standard configuration values of a locally installed Stardog triplestore):
```shell
docker run \
  -e STARDOG_URL=http://localhost:5820 \
  -e STARDOG_USERNAME=admin \
  -e STARDOG_PASSWORD=admin \
  --net=host \
  qanary/qanary-pipeline:latest
```
In this tutorial, you have learned how to run the Qanary pipeline template as a Docker container.
- How to establish a Docker-based Qanary Question Answering system
- How to implement a new Qanary component
  - ... using Java?
  - ... using Python (Qanary Helpers)?
  - ... using Python (plain Flask service)?