To use FireWorks, your computer (to run the lpad
command) and
the Firetask worker nodes need to access a MongoDB
database holding the workflow information.
The team has a MongoDB server in the Allen Center project on Google Cloud. It's useful for running FireWorks in Google Cloud and on our local computers but we run that server only when needed. (See How to run the Whole Cell Model on the Google Cloud Platform for details.) You can access the service securely via an ssh tunnel with your Google login. Creating a database there just takes the steps in the next section.
That MongoDB server in Google Cloud is not accessible to the open Intenet or
to Sherlock. To run workflows on Sherlock or elsewhere, you can create an
account and database on
MongoDB Atlas.
See the instructions in the
MAT project forum
to set it up including using the lpad init -u
command to create a
my_launchpad.yaml
that configures FireWorks to connect to it.
CAUTION: Your Atlas database will be reachable to the entire Internet. So use a password vault to generate and save long, random passwords for the account. Never reuse passwords.
Fireworks needs a launchpad config file, my_launchpad.yaml
.
When running with a job queue via qlaunch
(such as with SLURM on Sherlock, not
with worker nodes on Google Compute Engine),
it also needs a queue-adapter file, my_qadapter.yaml
.
-
Create the
my_launchpad.yaml
file.-
Atlas MongoDB cluster: See the MAT project forum on how to use the
lpad init -u
command to create amy_launchpad.yaml
looking something like this (after thehost
setting, are the rest defaults that could be omitted?):authsource: admin host: mongodb+srv://<atlas-username>:<atlas-password>@<atlas-cluster>/wcEcoli?retryWrites=true&w=majority logdir: null mongoclient_kwargs: {} name: null password: null port: null ssl: false ssl_ca_certs: null ssl_certfile: null ssl_keyfile: null ssl_pem_passphrase: null strm_lvl: INFO uri_mode: true user_indices: [] username: null wf_user_indices: []
(The tool
wholecell/fireworks/initialize.py
could become useful again if updated.)
-
-
Google Cloud MongoDB database: Run the script
runscripts/cloud/mongo-ssh.sh
to open an ssh tunnel whenever you want to access the server. (You can leave it running in one terminal tab then^C
it when you're done, or runrunscripts/cloud/mongo-ssh.sh bg
to run it in a background process and kill it when you're done.)Construct
my_launchpad.yaml
like this:host: localhost port: 27017 name: <your-username-or-another-unique-name-for-your-database> username: null password: null
These values ask FireWorks to connect to
localhost:27017
, which themongo-ssh.sh
tunnel will forward to port27017
on the Google Compute Engine MongoDB server.name
will be your database name.We're not using user authentication on this server since the ssh tunnel already requires Google login.
-
Run
lpad reset
This will connect to the MongoDB server then prompt you to type
Y
es to confirm creating and initializing the database. If it fails to connect, check that themongo-ssh.sh
tunnel is still running.If you want additional launchpad databases, create a launchpad yaml file for each one. To use a launchpad config filename besides the default
my_launchpad.yaml
, pass it as a command option likelpad -l gce_launchpad.yaml webgui
. -
If you need a queue-adapter to run FireWorks workers on Sherlock, write
my_qadapter.yaml
from the templatewholecell/fireworks/templates/my_qadapter.yaml
. Or you can run the interactive scriptpython -m wholecell.fireworks.initialize
, but beware that it overwritesmy_launchpad.yaml
andmy_qadapter.yaml
.
The launchpad database keeps the status of current and past workflows.
You can rerun past workflows without re-uploading them, archive or delete
workflows, and mix multiple workflows in the same database, but the simplest
approach is to lpad reset
it and upload a new workflow each time.