This repository contains all of the scripts for automating the MoH projects and integrating Genpipes.
Activates the Globus endpoints for Create_readset_transfer_BAMS.sh.
Must be updated for individual users.
Usage: Activate_globus.sh
Takes the full path of the read folder as input and transfers the BAMs/fastqs with the
MoH prefix. In addition, it transfers the key run processing metrics and generates
a log file for later use with the database.
Globus fields must be updated for individual users.
Usage: Create_readset_transfer_BAMS.sh PATH_TO_RUN_FOLDER
This script takes a single file and parses it to update the database with metrics from
run processing and from both DNA and RNA Genpipes. It compares the extracted values to
known acceptable values and adds them to the KEY_METRICS table within the MoH database.
Usage: Metrics_Update.py
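The comparison step can be sketched roughly as follows. This is only an illustration: the metric names, threshold values, and the `flag_metrics` helper are hypothetical, and the real acceptable values are whatever Metrics_Update.py defines.

```python
# Hypothetical thresholds for illustration only; the actual acceptable
# values are defined by Metrics_Update.py and are not reproduced here.
THRESHOLDS = {
    "dedup_coverage": ("min", 30.0),    # value must be at least this
    "duplication_rate": ("max", 0.5),   # value must be at most this
}


def flag_metrics(metrics, thresholds=THRESHOLDS):
    """Return a PASS/FAIL/MISSING flag for every metric with a threshold."""
    flags = {}
    for name, (kind, limit) in thresholds.items():
        value = metrics.get(name)
        if value is None:
            flags[name] = "MISSING"
        elif kind == "min":
            flags[name] = "PASS" if value >= limit else "FAIL"
        else:
            flags[name] = "PASS" if value <= limit else "FAIL"
    return flags
```

In a setup like this, the flags would be stored next to the raw values so that failing samples can be queried later.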
Parses the file structure of MAIN and looks for the output files produced from samples in
the Sample table. It then updates the STATUS table with any progress. It queries for all files,
so any deliverables that have been removed will result in an incomplete listing. In addition, this
script populates/updates the Timestamps and File_Locations tables.
Usage: MOH_Check_Progress.py
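The sketch below shows why a removed deliverable leads to an incomplete listing: status comes only from what is found on disk at scan time. The `EXPECTED` patterns and the `check_progress` helper are assumptions made for this example, not the names or file layout used by MOH_Check_Progress.py.

```python
from pathlib import Path

# Hypothetical deliverables; the real script looks for its own set of files.
EXPECTED = {
    "bam": "{sample}.bam",
    "vcf": "{sample}.vcf.gz",
}


def check_progress(main_dir, sample, expected=EXPECTED):
    """Map each expected deliverable to its path, or None if not on disk."""
    main_dir = Path(main_dir)
    status = {}
    for key, pattern in expected.items():
        # Search the whole tree under main_dir for the expected file name.
        matches = sorted(main_dir.rglob(pattern.format(sample=sample)))
        status[key] = str(matches[0]) if matches else None
    return status
```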
This contains all of the functions necessary for interacting with a file. Required for other scripts.
Takes a single sample input and hard links all of the deliverable files within a structured
directory with their final names. In addition, it constructs the custom readme and log files.
All data is taken from the database.
Usage: MOH_ln_output.py Sample
Currently has a coverage cutoff implemented.
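A minimal sketch of the hard-linking step is shown below. The `link_deliverables` helper, its arguments, and the directory layout are hypothetical; in MOH_ln_output.py the source paths and final names come from the database.

```python
import os
from pathlib import Path


def link_deliverables(sample, files, out_root):
    """Hard link each source file into out_root/sample under its final name.

    `files` maps a source path to the final file name (hypothetical input;
    the real script reads these from the database).
    """
    dest_dir = Path(out_root) / sample
    dest_dir.mkdir(parents=True, exist_ok=True)
    linked = []
    for src, final_name in files.items():
        dest = dest_dir / final_name
        if dest.exists():
            continue  # already linked on a previous run
        os.link(src, dest)  # hard link: no data is copied
        linked.append(dest)
    return linked
```

Hard links only work when source and destination are on the same filesystem, which is worth keeping in mind when choosing the deliverable directory.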
Dumps the tables within the database as CSV files in the CSV folder of DATABASE.
Usage: Create_CSVs.sh
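For readers who want the same dump from Python, the sketch below shows one way to do it. It assumes the database is SQLite, which this README does not state, and `dump_tables` is a hypothetical helper rather than part of Create_CSVs.sh.

```python
import csv
import sqlite3
from pathlib import Path


def dump_tables(conn, out_dir):
    """Write every user table in an SQLite database to out_dir/<table>.csv."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    cur = conn.cursor()
    cur.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    )
    tables = [row[0] for row in cur.fetchall()]
    written = []
    for table in tables:
        rows = cur.execute(f'SELECT * FROM "{table}"')
        header = [col[0] for col in rows.description]
        path = out_dir / f"{table}.csv"
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)   # column names first
            writer.writerows(rows)    # then every row of the table
        written.append(path)
    return written
```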
Searches the temporary raw_reads folder for matching DNA pairs or RNA. It moves the
fastqs/BAMs and creates the readset and pairs files in preparation for Setup_run_XXX.sh.
In addition, it populates the Samples table. It will not move or process any files that are
outside the naming convention and it will exit if it finds files with the same name at the
destination.
Usage: Generate_pairs_readset.py DNA
or: Generate_pairs_readset.py RNA
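The "exit on name clash" behaviour can be sketched as below. `move_reads` is a hypothetical helper written for this example; it shows the idea of checking every destination before moving anything and does not reproduce the pairing or naming-convention logic of Generate_pairs_readset.py.

```python
import shutil
from pathlib import Path


def move_reads(files, dest_dir):
    """Move files into dest_dir, exiting if any name already exists there."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Check all destinations first so that nothing is moved on a clash.
    clashes = [Path(f).name for f in files if (dest_dir / Path(f).name).exists()]
    if clashes:
        raise SystemExit("Refusing to overwrite: " + ", ".join(clashes))
    return [Path(shutil.move(str(f), str(dest_dir / Path(f).name))) for f in files]
```

Checking up front, rather than file by file, avoids leaving a run half moved when a duplicate name turns up late in the list.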