read_solid
Martin Asser Hansen edited this page Oct 2, 2015
read_solid reads and converts SOLID entries in color space to nucleotide space.
SOLID format consists of files with lines like the one below:
1379_8_1161_F3 T30010310310130010022122330001000010 6,13,27,12,15,6,12,9,4,24,25,14,15,22,4,18,27,19,13,4,4,5,6,12,11,24,8,9,19,24,12,27,4,20,14,
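As a rough illustration of the line layout above, the following Python sketch splits one SOLID line into its three fields. It is not the read_solid implementation: the function name `parse_solid_line` is hypothetical, and splitting on whitespace is an assumption based only on how the sample line appears here.

```python
# Sample line copied from the text above (note the trailing comma).
SAMPLE_LINE = (
    "1379_8_1161_F3 T30010310310130010022122330001000010 "
    "6,13,27,12,15,6,12,9,4,24,25,14,15,22,4,18,27,19,13,4,4,5,6,12,"
    "11,24,8,9,19,24,12,27,4,20,14,"
)


def parse_solid_line(line):
    """Split one SOLID line into name, color space sequence and scores.

    Assumption: fields are separated by whitespace (tabs or spaces).
    """
    name, seq_cs, qual = line.split()
    # Drop the empty field produced by the trailing comma.
    scores = [int(q) for q in qual.split(",") if q]
    return name, seq_cs, scores


name, seq_cs, scores = parse_solid_line(SAMPLE_LINE)
print(name)         # 1379_8_1161_F3
print(seq_cs)       # T30010310310130010022122330001000010
print(len(scores))  # 35
```

The sample line carries 35 quality scores, one per color call following the leading primer base `T`.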
The resulting Biopiece record looks like this:
REC_TYPE: SOLID
SCORE_MEAN: 13.94
SEQ: taAaCcgttACcATtTGGgagtctaTttTGgGgGtt
SEQ_CS: T30010310310130010022122330001000010
SEQ_LEN: 36
SEQ_NAME: 1379_8_1161_F3
SEQ_QUAL: 6;13;27;12;15;6;12;9;4;24;25;14;15;22;4;18;27;19;13;4;4;5;6;12;11;24;8;9;19;24;12;27;4;20;14
---
The keys explained:
- SCORE_MEAN - The mean of all the quality scores in SEQ_QUAL.
- SEQ - The sequence in nucleotide space.
- SEQ_CS - The sequence in color space.
- SEQ_LEN - Sequence length.
- SEQ_NAME - Sequence name.
- SEQ_QUAL - List of quality scores.
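The SCORE_MEAN and SEQ_QUAL values in the sample record can be reproduced from the quality list with a short Python sketch. The helper names `score_mean` and `format_seq_qual` are hypothetical and only illustrate the arithmetic. The semicolon separator is taken from the record shown above.

```python
# Quality scores from the sample SOLID line.
SCORES = [6, 13, 27, 12, 15, 6, 12, 9, 4, 24, 25, 14, 15, 22, 4, 18, 27,
          19, 13, 4, 4, 5, 6, 12, 11, 24, 8, 9, 19, 24, 12, 27, 4, 20, 14]


def score_mean(scores):
    """Arithmetic mean of the quality scores, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)


def format_seq_qual(scores):
    """Join scores with semicolons, as in the SEQ_QUAL key."""
    return ";".join(str(s) for s in scores)


print(f"SCORE_MEAN: {score_mean(SCORES):.2f}")  # SCORE_MEAN: 13.94
print(f"SEQ_QUAL: {format_seq_qual(SCORES)}")
```

The 35 scores sum to 488, and 488 / 35 rounds to 13.94, which agrees with the record.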
For more about the SOLID platform:
http://solid.appliedbiosystems.com/
read_solid [options] -i <SOLID file(s)>
[-? | --help] # Print full usage description.
[-i <files!> | --data_in=<files!>] # Comma separated list of files or glob expression to read.
[-n <uint> | --num=<uint>] # Limit number of records to read.
[-q <uint> | --quality=<uint>] # Lowercase nucleotide with quality score below this limit (min:0 max:40) - Default=20
[-I <file!> | --stream_in=<file!>] # Read input stream from file - Default=STDIN
[-O <file> | --stream_out=<file>] # Write output stream to file - Default=STDOUT
[-v | --verbose] # Verbose output.
read_solid -i test.solid # Read SOLID entries from file.
read_solid -i test1.solid,test2.solid # Read SOLID entries from a comma separated list of files.
read_solid -i '*.solid' # Read SOLID entries from all files matching a glob expression.
read_solid -i test.solid -n 10 # Read first 10 SOLID entries from file.
read_solid -i test.solid -q 10 # Change quality score threshold to 10.
Martin Asser Hansen - Copyright (C) - All rights reserved.
August 2007
GNU General Public License version 2
http://www.gnu.org/copyleft/gpl.html
read_solid is part of the Biopieces framework.