Usage: get_gens_dfs.py
Michael Olvera edited this page May 3, 2018 · 6 revisions
get_gens_dfs.py generates a table (TSV file) listing all variants in a defined interval for a specified individual, based on an input VCF file. It reformats the genotypes from the VCF so they are easier to process later when designing sgRNAs. Written in Python 3.6.1. Kathleen Keough et al., 2017-2018.
get_gens_dfs.py <vcf_file> <locus> <out> [-f] [--bed] [--chrom]
python3 get_gens_dfs.py INPUT.vcf.gz 1:11980181-12013515 OUT_GENS
python3 get_gens_dfs.py INPUT.vcf.gz loci.bed OUT_multi_loci_gens --bed
where the loci.bed file is formatted like so:
1 11976269 12018380 MFN2
7 76298036 76308038 HSPB1
11 61940001 61963675 BEST1
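To make the expected BED layout concrete, here is a minimal sketch of reading such a file into records. It is not taken from get_gens_dfs.py: the names `BedLocus` and `parse_bed` are hypothetical, and the sketch assumes whitespace-separated columns in the order CHROM, START, STOP, ID with no header row.

```python
from typing import List, NamedTuple


class BedLocus(NamedTuple):
    """One BED row (hypothetical helper type, not part of the script)."""
    chrom: str
    start: int
    stop: int
    name: str


def parse_bed(text: str) -> List[BedLocus]:
    """Parse four-column BED text (CHROM, START, STOP, ID) into records."""
    loci = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and BED comment/track lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        # Assumption: columns are separated by tabs or spaces.
        chrom, start, stop, name = line.split()[:4]
        loci.append(BedLocus(chrom, int(start), int(stop), name))
    return loci


example = (
    "1\t11976269\t12018380\tMFN2\n"
    "7\t76298036\t76308038\tHSPB1\n"
    "11\t61940001\t61963675\tBEST1\n"
)
loci = parse_bed(example)
for locus in loci:
    print(locus.name, locus.chrom, locus.start, locus.stop)
```

A row with fewer than four columns raises `ValueError` here, which is one way to catch a BED file that lacks the ID column.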
Arguments | Details |
---|---|
vcf_file | BCF/VCF file with genotypes. Files should be gzipped (using bcftools or bgzip) and include an index (using bcftools or tabix). |
locus | Locus from which to pull variants, in format chromosome:start-stop, or a BED file if --bed. |
out | The name for the output file and directory in which to save the output files. |
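The chromosome:start-stop locus format described in the table can be validated before a run. The following is a minimal sketch under stated assumptions, not the script's own parsing code: `parse_locus` and `LOCUS_RE` are hypothetical names, and the check that start does not exceed stop is an added assumption.

```python
import re
from typing import Tuple

# Assumed pattern: a chromosome name without colons or spaces, then start-stop.
LOCUS_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<stop>\d+)$")


def parse_locus(locus: str) -> Tuple[str, int, int]:
    """Split 'chromosome:start-stop' into (chrom, start, stop)."""
    match = LOCUS_RE.match(locus.strip())
    if match is None:
        raise ValueError(f"expected chromosome:start-stop, got {locus!r}")
    start, stop = int(match["start"]), int(match["stop"])
    if start > stop:
        raise ValueError(f"start {start} is greater than stop {stop}")
    return match["chrom"], start, stop


chrom, start, stop = parse_locus("1:11980181-12013515")
print(chrom, start, stop)
```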
Options | Details |
---|---|
-f | If this option is specified, keeps homozygous variants in output file. |
--bed | Indicates that a BED file is being used in place of a locus. BED files are expected to include the CHROM, START, STOP, and ID columns. |
--chrom | Run on entire chromosome. |
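The effect of -f can be illustrated on VCF-style genotype strings. This is a sketch only: `is_heterozygous` and `filter_genotypes` are hypothetical helpers, and how the script itself treats missing calls or homozygous-reference genotypes is an assumption here rather than something this page states.

```python
import re
from typing import Iterable, List


def is_heterozygous(gt: str) -> bool:
    """Return True if a VCF GT string such as '0|1' carries two different alleles."""
    alleles = re.split(r"[|/]", gt)
    # Assumption: missing ('.') and haploid calls count as not heterozygous.
    if len(alleles) < 2 or "." in alleles:
        return False
    return len(set(alleles)) > 1


def filter_genotypes(genotypes: Iterable[str], keep_hom: bool = False) -> List[str]:
    """Drop non-heterozygous genotypes unless keep_hom is set (mirrors -f)."""
    return [gt for gt in genotypes if keep_hom or is_heterozygous(gt)]


calls = ["0|1", "1|1", "0/0", "1|0"]
print(filter_genotypes(calls))
print(filter_genotypes(calls, keep_hom=True))
```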
AlleleAnalyzer. Keough et al. 2019, Genome Biology.