monopogen_pipeline

This is a local implementation of the Monopogen analysis package, originally developed and maintained by Ken Chen's lab.

These scripts were developed around the idea of using sample batches, each of which is associated with an alphanumeric tag (e.g., 00, 01, 02). They are set up for LSF batch queues with access to global scratch and should be portable between users. The directory structure is based partly on third-party expectations and, of course, on design choices, which may not be optimal for everyone.

Setup

Clone this repository and customize the user configuration in the config.ini file. ${TOPDIR}/ is where the software, reference, and data directories live. Assemble a list of BAMs into a CSV file (e.g., batch.test1) in the format
```
<unique_bam_tag_1>,<bam_path_1>
<unique_bam_tag_2>,<bam_path_2>
...
```
Place this file at the top level in this cloned repo.
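
As a minimal sketch of assembling the batch file, the helper below prints one `<tag>,<path>` line per BAM found in a directory. The function names are hypothetical and not part of this repository, and deriving the tag from the BAM file name is an assumption; any unique tag will do.

```shell
# Hypothetical helper: print "<tag>,<path>" for every *.bam in a directory.
# Assumption: the tag is the BAM file name without its .bam extension.
make_batch_list() {
    local bam_dir=$1
    local bam
    for bam in "${bam_dir}"/*.bam; do
        [ -e "${bam}" ] || continue   # skip the unexpanded glob when no BAMs exist
        printf '%s,%s\n' "$(basename "${bam}" .bam)" "${bam}"
    done
}

# Print any tag that appears more than once in a batch file.
# Prints nothing when all tags are unique.
duplicate_tags() {
    cut -d, -f1 "$1" | sort | uniq -d
}
```

For example, `make_batch_list /path/to/bams > batch.test1` followed by `duplicate_tags batch.test1` would write the list and flag any tags that need renaming.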

The BAM paths above must fall within the scope of ${LSF_DOCKER_VOLUMES} in config.ini. Modify ${EXT_STUDY_1} to be a meaningful parent directory containing the BAMs and adjust ${LSF_DOCKER_VOLUMES} accordingly.

Install Monopogen into ${TOPDIR}/software/Monopogen/. We opted to place the reference files into ${TOPDIR}/reference/, namely:
- ${TOPDIR}/reference/1KG3_imputation_panel/
- ${TOPDIR}/reference/GRCh38.d1.vd1/

For LSF use, create and/or modify the job group under which the scripts will be run. (Here, we use /${USER}/${LABNAME}, where ${USER} and ${LABNAME} are set in config.ini.)
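
One way to check the volume-scope requirement before submitting jobs is sketched below. It assumes that ${LSF_DOCKER_VOLUMES} is a space-separated list of `src:dst` pairs; the actual format in config.ini may differ. The function names are hypothetical and not part of this repository.

```shell
# Return 0 if a path sits under the source side of any "src:dst" entry.
# Assumption: the volumes string is a space-separated list of "src:dst" pairs.
path_in_volumes() {
    local path=$1 volumes=$2
    local entry src
    for entry in ${volumes}; do
        src=${entry%%:*}      # keep the host-side directory
        src=${src%/}          # drop one trailing slash, if any
        case "${path}" in
            "${src}"|"${src}"/*) return 0 ;;
        esac
    done
    return 1
}

# Print every BAM path in a batch file that falls outside the listed volumes.
unmounted_bams() {
    local batch_file=$1 volumes=$2
    local tag bam
    while IFS=, read -r tag bam; do
        [ -n "${bam}" ] || continue   # skip blank or malformed lines
        path_in_volumes "${bam}" "${volumes}" || printf '%s\n' "${bam}"
    done < "${batch_file}"
}
```

Running `unmounted_bams batch.test1 "${LSF_DOCKER_VOLUMES}"` would list the paths that need either a different location or an extra volume entry; no output means every BAM is in scope.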

Germline pipeline

Raw calling. Run the steps below in order. The move step may be run concurrently with preprocess, run, or merge, but a final move command should still be issued once those steps have finished.

```
BATCH=test1
./1.prepare.sh  ${BATCH}  preprocess
./1.prepare.sh  ${BATCH}  move
./1.prepare.sh  ${BATCH}  tidy

./2.germline.sh  ${BATCH}  setup
./2.germline.sh  ${BATCH}  run
./2.germline.sh  ${BATCH}  merge
./2.germline.sh  ${BATCH}  move
./2.germline.sh  ${BATCH}  tidy
```

The final raw calls are in ${TOPDIR}/samples/samples.${BATCH}/${SAMPLE}/germline/merged/${SAMPLE}.phased.sorted.vcf.gz.
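
To confirm that a batch finished, the expected output path can be built for each sample and checked for existence. This is a sketch only: the function names are hypothetical, and it assumes that ${SAMPLE} corresponds to the unique BAM tag in the first column of the batch file.

```shell
# Print the expected final VCF path for one sample, following the layout above.
final_vcf_path() {
    local topdir=$1 batch=$2 sample=$3
    printf '%s/samples/samples.%s/%s/germline/merged/%s.phased.sorted.vcf.gz\n' \
        "${topdir}" "${batch}" "${sample}" "${sample}"
}

# Print the tags from a batch file whose final VCF is missing or empty.
# Assumption: ${SAMPLE} is the unique BAM tag in the first CSV column.
missing_final_vcfs() {
    local topdir=$1 batch=$2 batch_file=$3
    local tag _bam
    while IFS=, read -r tag _bam; do
        [ -n "${tag}" ] || continue   # skip blank lines
        [ -s "$(final_vcf_path "${topdir}" "${batch}" "${tag}")" ] || printf '%s\n' "${tag}"
    done < "${batch_file}"
}
```

For example, `missing_final_vcfs "${TOPDIR}" test1 batch.test1` prints the tags that still lack a merged VCF and prints nothing when every sample is complete.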

Annotation.
Cell type mapping.
