Setting up the asterales system

Source and target

There are two computers (or vhosts or VMs) involved in using the deployment system to set up an Open Tree instance. The source would typically be your favorite local or personal computer. The target is the computer that is going to run Open Tree.

All of the commands given below run on the source computer.

It is possible for the source and target computers to be the same, but this has not been tested.

Before you start

You will need Java, Python, GNU make, and bash on your local machine. The commands below also use git and tar.
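A quick way to confirm these tools are on your PATH before starting is sketched below. The function name check_prereqs is made up for this example, and the list of tools is taken from this document rather than from any official requirements list.

```shell
# Report each named tool that is not on PATH; return nonzero if any is missing.
# (check_prereqs is a hypothetical helper, not part of the Open Tree tooling.)
check_prereqs() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Tools mentioned in this document.
check_prereqs java python make bash git tar || echo "install the missing tools first" >&2
```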

Create or choose a directory in which to work, and cd to that directory. This directory will contain git repository clones as well as intermediate files such as database dumps.

Prepare ot-base and jade

git clone git@github.com:FePhyFoFum/jade.git 
(cd jade; ./mvn_install.sh)
git clone git@github.com:OpenTreeOfLife/ot-base.git
(cd ot-base; ./mvn_install.sh)

Prepare a subset taxonomy

You can get an Asterales-only taxonomy from http://files.opentreeoflife.org/ott/aster.tgz. I don't recommend building your own taxonomy, since it involves downloading the NCBI and GBIF taxonomies, but if you insist:

git clone git@github.com:OpenTreeOfLife/reference-taxonomy.git
export TARDIR=$PWD
(cd reference-taxonomy; make aster-tarball)
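If you take the prebuilt route instead, the following sketch fetches the tarball and checks that it unpacked into the two files used later by taxomachine. The helper names fetch_aster and check_aster are invented for illustration; the URL and file names are the ones given in this document.

```shell
ASTER_URL="http://files.opentreeoflife.org/ott/aster.tgz"

# Download the tarball into the current directory and unpack it.
# -f makes curl fail on HTTP errors, -L follows redirects.
fetch_aster() {
    curl -fL -o aster.tgz "$ASTER_URL" && tar xzf aster.tgz
}

# Succeed only if the unpacked directory holds non-empty copies of
# the two files that the taxomachine step reads.
check_aster() {
    dir="${1:-aster}"
    for f in taxonomy.tsv synonyms.tsv; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            return 1
        fi
    done
}
```

Typical use from the working directory would be: fetch_aster && check_aster aster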

Prepare a synthetic tree database

Build treemachine:

git clone git@github.com:OpenTreeOfLife/treemachine.git
(cd treemachine; ./mvn_cmdline.sh)

Clone the gcmdr repo and put the Asterales taxonomy in it (if you aren't using the one that's already there):

tar xzf aster.tgz
git clone git@github.com:OpenTreeOfLife/gcmdr.git
pushd gcmdr
cp -p ../aster/* example/aster/

Build the treemachine database:

cp -p ../treemachine/target/*.jar ./
python run_asterales_example.py 
tar -C example/asterales_synth.db -czf ../treemachine.db.tgz .
popd

Prepare a taxomachine database

First, set up repo and define a convenient abbreviation:

git clone git@github.com:OpenTreeOfLife/taxomachine.git
(cd taxomachine; ./mvn_cmdline.sh)
alias taxo='java -Xmx10g -XX:-UseConcMarkSweepGC -jar taxomachine/target/taxomachine-0.0.1-SNAPSHOT-jar-with-dependencies.jar'

Create the taxomachine database: (cf. the taxomachine README)

taxo loadtaxsyn ott aster/taxonomy.tsv aster/synonyms.tsv taxomachine.db
taxo makecontexts taxomachine.db
taxo makegenusindexes taxomachine.db
tar -C taxomachine.db -czf taxomachine.db.tgz .
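Before pushing the database tarballs to the target, it can save time to confirm that each one is a readable archive with something in it. This is only a sanity check sketched for this guide; check_db_tarball is a hypothetical helper and does not validate the neo4j database itself.

```shell
# Succeed if the file exists and tar can list at least one entry from it.
check_db_tarball() {
    if [ ! -f "$1" ]; then
        echo "no such file: $1" >&2
        return 1
    fi
    if ! tar -tzf "$1" 2>/dev/null | grep -q .; then
        echo "empty or unreadable archive: $1" >&2
        return 1
    fi
}
```

For example: check_db_tarball treemachine.db.tgz && check_db_tarball taxomachine.db.tgz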

I don't know how much memory is needed to build the Asterales database, but it's probably not the 10G given in the alias above. You could try it with less; let me (JAR) know how that goes. But note that if you're doing a full Open Tree synthesis, you'll want a lot more than 10G.
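To experiment with a smaller heap without retyping the alias, one option is a shell function that reads the heap size from a variable. This is a sketch, not part of taxomachine: the names TAXO_HEAP, TAXO_JAR, taxo_cmdline, and taxo_run are made up here, and the java options are copied from the alias above. A function also works inside scripts, where bash does not expand aliases by default.

```shell
# Heap size for the JVM; 10g mirrors the alias above. Override to experiment.
TAXO_HEAP="${TAXO_HEAP:-10g}"
TAXO_JAR="taxomachine/target/taxomachine-0.0.1-SNAPSHOT-jar-with-dependencies.jar"

# Print the java command line that taxo_run would execute (a dry run).
taxo_cmdline() {
    echo java "-Xmx${TAXO_HEAP}" -XX:-UseConcMarkSweepGC -jar "$TAXO_JAR" "$@"
}

# Run taxomachine with the configured heap size.
taxo_run() {
    java "-Xmx${TAXO_HEAP}" -XX:-UseConcMarkSweepGC -jar "$TAXO_JAR" "$@"
}
```

For example, setting TAXO_HEAP=4g and then calling taxo_run makecontexts taxomachine.db would try the same step with a 4G heap.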

Prepare a subset phylesystem

You can skip this step, because there's a suitable phylesystem on github already.

If an Asterales phylesystem repository doesn't exist yet (e.g. here), go to github and create one. Call it something like asterales-phylesystem.

The list of studies that belong in this phylesystem is here.

A script for creating the repo is here.

Move the contents of the directory that this script creates into a local github working directory and do 'git push' to get it to github. (Or come up with some other way of setting it up; depending on github seems unfortunate but doing anything else will be more work.)

Choose or provision the target machine

If you're an Open Tree developer, ot17.opentreeoflife.org is currently reserved for this purpose. But you can use any GNU/Linux installation.

(The entire system ought to work on a VM, but that hasn't been attempted yet.)

The deployment scripts assume that there is an administrative user with sudo privileges. When run, they create an 'opentree' user to run all services.

It is assumed that opentree is not sharing the target machine with anything else. In particular it takes over the apache daemon.

Copy the configuration file for asterales.opentreeoflife.org to a local file, say my.config. Do not use it directly, but rather as a template.

curl -o my.config https://raw.githubusercontent.com/OpenTreeOfLife/deployed-systems/master/asterales/ot17.config

Edit host names and so on as needed (you will probably be setting up somewhere other than ot17).
Set the phylesystem repo as specified above (asterales-phylesystem or whatever you chose).
Set OPENTREE_ADMIN to be the admin user on the target (this would be 'admin' on Debian on AWS or 'ubuntu' on Ubuntu on AWS).
Set OPENTREE_IDENTITY to be the pathname on the source (local) computer for the ssh private key for the admin user on the target computer.
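After editing, a small check like the one below can catch the two most common mistakes (no admin user, or a key path that does not exist) before any deployment is attempted. It assumes the configuration file consists of plain shell variable assignments that can be sourced; that is an assumption of this sketch, so check the configuration file documentation if in doubt. check_config is a hypothetical helper name.

```shell
# Source the config in a subshell (note the parentheses around the body)
# so that its variables do not leak into the calling shell.
check_config() (
    if [ ! -f "$1" ]; then
        echo "no such config file: $1" >&2
        exit 1
    fi
    . "$1" || exit 1
    if [ -z "${OPENTREE_ADMIN:-}" ]; then
        echo "OPENTREE_ADMIN is not set" >&2
        exit 1
    fi
    if [ ! -r "${OPENTREE_IDENTITY:-}" ]; then
        echo "OPENTREE_IDENTITY does not name a readable file" >&2
        exit 1
    fi
)
```

Call it with an explicit path, for example check_config ./my.config, since sourcing a bare file name may search PATH rather than the current directory.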

See the deployment system documentation and configuration file documentation for detailed configuration instructions.

You don't need to worry about client ids, secrets, certificates, and so on unless you want to enable study creation and editing in the curation app.

Deploy software (webapps, neo4j, plugins, etc.) and phylesystem clone

git clone git@github.com:OpenTreeOfLife/opentree.git
cd opentree/deploy/
alias deploy='./push.sh -c ../../my.config'

Now test to see if your configuration file works (this may install software on the target):

deploy echo x

If that failed because a password was demanded for sudo, log in to the admin account on the target and do the following:

bash as-admin.sh

You may be prompted to enter your password. Now (back on the source machine) install all of the open tree software:

deploy

Deploy treemachine and taxomachine neo4j databases

deploy push-db ../../treemachine.db.tgz treemachine
deploy push-db ../../taxomachine.db.tgz taxomachine

Initialize oti database

deploy install-db downloads/taxomachine.db.tgz oti

Index the studies (OTI)

deploy index
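The deployment steps above, from the full install through indexing, can be collected into one function that stops at the first failure. This is a sketch under the same assumptions as the commands above: it is run from opentree/deploy/ with my.config two directories up. The names run_deploy, deploy_all, and DRY_RUN are invented here; run_deploy is deliberately not called deploy, because defining a function with the same name as an existing alias fails in an interactive shell.

```shell
# Same command as the 'deploy' alias, as a function so it also works in scripts.
# When DRY_RUN is set to a non-empty value, the command is printed, not run.
run_deploy() {
    ${DRY_RUN:+echo} ./push.sh -c ../../my.config "$@"
}

# Run every deployment step in the order given in this document,
# stopping as soon as one of them fails.
deploy_all() {
    run_deploy &&
    run_deploy push-db ../../treemachine.db.tgz treemachine &&
    run_deploy push-db ../../taxomachine.db.tgz taxomachine &&
    run_deploy install-db downloads/taxomachine.db.tgz oti &&
    run_deploy index
}
```

Running DRY_RUN=1 deploy_all first prints the five commands without contacting the target, which is a cheap way to review them.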

Test

See files in this repo (germinator).

(Optional) Set up github authorizations

See deploy/sample.config in the opentree repo if you want to be able to use the curator application to save to github or to use the tree browser to add comments.
