
Test on RNAPII ChIA-Drop data

Minji Kim edited this page Oct 10, 2019 · 7 revisions

Download

  1. Download the GSM3347525NRT.txt.gz file from this link into your ./miasig/data/ folder.
  2. Decompress the file by typing:
    $ gunzip GSM3347525NRT.txt.gz

This file was downloaded from GEO (GSE109355) and filtered to keep only data in 'non-repetitive regions' of chr2L.
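
An incomplete download is a common cause of gunzip errors, so it can help to test the archive before decompressing it. The sketch below relies on the standard `gzip -t` integrity check; the helper name `verify_gz` is hypothetical and not part of MIA-Sig.

```shell
# Hypothetical helper (not part of MIA-Sig): test a .gz archive's integrity.
# gzip -t reads the whole archive and exits non-zero if it is corrupt.
verify_gz() {
  if gzip -t "$1" 2>/dev/null; then
    echo "archive OK: $1"
  else
    echo "archive corrupt or missing: $1"
    return 1
  fi
}
```

For example, `verify_gz GSM3347525NRT.txt.gz && gunzip GSM3347525NRT.txt.gz` decompresses only if the check passes.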

Generate fragment coverage file

We first generate a bedgraph file from RNAPII ChIA-Drop fragments.

$ pwd
/projects/kimm/miasig/data
$ source miasig_conf.sh
$ bash ${miasig_dir}gems2cov.sh --conf miasig_conf.sh --dir ${data_dir} --i GSM3347525NRT.txt --lib GSM3347525NRT --r dm3 --bs 10

MIA-Sig should have created the following 10 files in the ${data_dir} directory:

  • GSM3347525NRT.bedgraph
  • GSM3347525NRT_binned_10bp.bedgraph
  • GSM3347525NRT_binned_10bp_chr2L.bedgraph
  • GSM3347525NRT_binned_10bp_chr2R.bedgraph
  • GSM3347525NRT_binned_10bp_chr3L.bedgraph
  • GSM3347525NRT_binned_10bp_chr3R.bedgraph
  • GSM3347525NRT_binned_10bp_chr4.bedgraph
  • GSM3347525NRT_binned_10bp_chrX.bedgraph
  • GSM3347525NRT.frags.bed
  • GSM3347525NRT.frags.sorted.bed
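
To confirm the list above without checking each file by hand, a small loop can report which files exist. This is a minimal sketch, not part of MIA-Sig: the function name `check_cov_outputs` is hypothetical, and it only tests for existence, since it makes no assumption about which per-chromosome files contain data.

```shell
# Hypothetical helper (not part of MIA-Sig): report which of the 10 expected
# gems2cov.sh output files exist in a directory.
# Usage: check_cov_outputs <directory> <library name>
check_cov_outputs() {
  dir="${1%/}"   # drop a trailing slash, if any
  lib="$2"
  missing=0
  for suffix in \
      .bedgraph \
      _binned_10bp.bedgraph \
      _binned_10bp_chr2L.bedgraph \
      _binned_10bp_chr2R.bedgraph \
      _binned_10bp_chr3L.bedgraph \
      _binned_10bp_chr3R.bedgraph \
      _binned_10bp_chr4.bedgraph \
      _binned_10bp_chrX.bedgraph \
      .frags.bed \
      .frags.sorted.bed; do
    if [ -e "${dir}/${lib}${suffix}" ]; then
      echo "ok       ${lib}${suffix}"
    else
      echo "MISSING  ${lib}${suffix}"
      missing=$((missing + 1))
    fi
  done
  echo "${missing} of 10 expected files missing"
  [ "${missing}" -eq 0 ]   # non-zero exit status if anything is missing
}
```

For example: `check_cov_outputs "${data_dir}" GSM3347525NRT`.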

Run 'Enrichment test'

$ pwd
/projects/kimm/miasig/data
$ source miasig_conf.sh
$ for chrom in $(cut -f1 ${data_dir}dm3.chrom.sizes); do qsub -F "--conf miasig_conf.sh --lib GSM3347525NRT --gen dm3 --fdr 0.1 --chr ${chrom} --sz 1000 --bs 10 --dir ${data_dir} --file GSM3347525NRT.txt" ${miasig_dir}freq_enrich_sigtest.pbs; done

In under an hour, MIA-Sig generates:

  • GSM3347525NRT_chr2L_FDR_0.1_pseudoGEM_1000_enrichTest_logFile.txt
  • GSM3347525NRT_chr2L_FDR_0.1_pseudoGEM_1000_enrichTest_master.txt
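
Because the loop above submits one job per chromosome, it can be useful to preview the commands before sending them to the scheduler. The sketch below prints each qsub command instead of running it; the function name `preview_enrich_jobs` is hypothetical, and it assumes `data_dir` and `miasig_dir` have already been set by sourcing miasig_conf.sh.

```shell
# Hypothetical dry-run helper (not part of MIA-Sig): print the qsub command
# for each chromosome named in column 1 of a chrom.sizes file, without
# submitting anything. Assumes data_dir and miasig_dir are already set.
preview_enrich_jobs() {
  sizes_file="$1"
  for chrom in $(cut -f1 "${sizes_file}"); do
    echo qsub -F "\"--conf miasig_conf.sh --lib GSM3347525NRT --gen dm3 --fdr 0.1 --chr ${chrom} --sz 1000 --bs 10 --dir ${data_dir} --file GSM3347525NRT.txt\"" "${miasig_dir}freq_enrich_sigtest.pbs"
  done
}
```

For example: `preview_enrich_jobs "${data_dir}dm3.chrom.sizes"`.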

(optional) Generate .hic files

To visualize the results in Juicebox, create .hic files for the 'PASS', 'FAIL', and 'ALL' categories from the 'master' file.

$ pwd
/projects/kimm/miasig/data
$ source miasig_conf.sh

$ declare -a StringArray=("PASS" "FAIL" "ALL")
$ for val in ${StringArray[@]}; do qsub -F "--conf miasig_conf.sh --dir ${data_dir}GSM3347525NRT_EnrichTest_FDR_0.1/ --fn GSM3347525NRT_chr2L_FDR_0.1_pseudoGEM_1000_enrichTest_master.txt --cat ${val} --meth allpairs --bedpe True --gf ${data_dir}dm3.chrom.sizes" ${miasig_dir}master2hic.pbs; done

We now have 3 .hic files:

  • GSM3347525NRT_chr2L_FDR_0.1_pseudoGEM_1000_enrichTest_ALL_allpairs.hic
  • GSM3347525NRT_chr2L_FDR_0.1_pseudoGEM_1000_enrichTest_FAIL_allpairs.hic
  • GSM3347525NRT_chr2L_FDR_0.1_pseudoGEM_1000_enrichTest_PASS_allpairs.hic

(optional) Generate useful plots

To characterize the data and algorithm further, we provide a script that generates plots from the enrichment test output.

We tested the script with Python v3.6.6 and the packages numpy and matplotlib (matplotlib.pyplot).
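
Before running the script, it may be worth checking that the interpreter can import both packages. This is a minimal sketch: the function name `check_py_modules` is hypothetical, and it assumes the `python` command on your PATH is the interpreter you intend to use.

```shell
# Hypothetical helper: try importing each named Python module and report
# whether the import succeeded. Assumes `python` is the intended interpreter.
check_py_modules() {
  for mod in "$@"; do
    if python -c "import ${mod}" 2>/dev/null; then
      echo "${mod}: ok"
    else
      echo "${mod}: missing"
    fi
  done
}

check_py_modules numpy matplotlib.pyplot
```

A "missing" result means the import failed, which also happens when `python` itself is not found.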

$ source miasig_conf.sh

$ python plot_enrich_test_results.py GSM3347525NRT_chr2L ${data_dir}GSM3347525NRT_EnrichTest_FDR_0.1/ GSM3347525NRT_chr2L_FDR_0.1_pseudoGEM_1000_enrichTest_null.txt GSM3347525NRT_chr2L_FDR_0.1_pseudoGEM_1000_enrichTest_master.txt

The script generates 2 .pdf files:

  • GSM3347525NRT_chr2L_enrichment_score_null_obs_ecdf.pdf (Supp. Figure S8c)
  • GSM3347525NRT_chr2L_enrichment_score_null_obs_hist_max100.pdf (Supp. Figure S8d)

Conclusion

If any part is unclear or does not run as expected, please report it on the Issues page.