Commit
Merge pull request #22 from martinghunt/sequence_trim
Sequence trim
Showing 8 changed files with 121 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
>1/1 | ||
TRIM1GCTCGAGCT | ||
>2/1 | ||
TRIM1AGCTAGCTAG | ||
>3/1 | ||
CGCTAGCTAG | ||
>4/1 | ||
TRIM2AGCTAGCTAG | ||
>5/1 | ||
AGCTAGCTAG | ||
>6/1 | ||
TRIM4AGCTAGCTAG |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
>3/1 | ||
CGCTAGCTAG | ||
>4/1 | ||
AGCTAGCTAG | ||
>5/1 | ||
AGCTAGCTAG | ||
>6/1 | ||
AGCTAGCTAG |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
>1/2 | ||
TRIM1ACGTACGTAC | ||
>2/2 | ||
TRIM2ACGTAGTGA | ||
>3/2 | ||
ACGCTGCAGTCAGTCAGTAT | ||
>4/2 | ||
TRIM3CGATCGATCG | ||
>5/2 | ||
TRIM3CGATCGATCG | ||
>6/2 | ||
CGATCGATCG |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
>3/2 | ||
ACGCTGCAGTCAGTCAGTAT | ||
>4/2 | ||
CGATCGATCG | ||
>5/2 | ||
CGATCGATCG | ||
>6/2 | ||
CGATCGATCG |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
>1 | ||
TRIM1 | ||
>2 | ||
TRIM2 | ||
>3 | ||
TRIM3 | ||
>4 | ||
TRIM4 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
#!/usr/bin/env python3 | ||
| ||
import argparse | ||
from fastaq import tasks | ||
| ||
parser = argparse.ArgumentParser( | ||
description = 'Trims sequences off the start of all sequences in a pair of fasta/q files, whenever there is a perfect match. Only keeps a read pair if both reads of the pair are at least a minimum length after any trimming', | ||
usage = '%(prog)s [options] <fasta/q 1 in> <fasta/q 2 in> <out 1> <out 2> <trim_seqs>') | ||
parser.add_argument('--min_length', type=int, help='Minimum length of output sequences [%(default)s]', default=50, metavar='INT') | ||
parser.add_argument('infile_1', help='Name of forward fasta/q file to be trimmed', metavar='fasta/q 1 in') | ||
parser.add_argument('infile_2', help='Name of reverse fasta/q file to be trimmed', metavar='fasta/q 2 in') | ||
parser.add_argument('outfile_1', help='Name of output forward fasta/q file', metavar='out_1') | ||
parser.add_argument('outfile_2', help='Name of output reverse fasta/q file', metavar='out_2') | ||
parser.add_argument('trim_seqs', help='Name of fasta/q file of sequences to search for at the start of each input sequence', metavar='trim_seqs') | ||
options = parser.parse_args() | ||
tasks.sequence_trim( | ||
options.infile_1, | ||
options.infile_2, | ||
options.outfile_1, | ||
options.outfile_2, | ||
options.trim_seqs, | ||
min_length=options.min_length | ||
) |
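
The behaviour described in the script's help text (trim a perfectly matching sequence from the start of each read, then keep a pair only if both mates still meet the minimum length) can be sketched in plain Python. This is a minimal illustration, not the implementation of `tasks.sequence_trim`, which is not shown here. The helper names `trim_start` and `trim_pairs` are hypothetical, reads are held as in-memory `(name, sequence)` tuples rather than fasta/q files, only the first matching trim sequence is removed, and `min_length=10` is an assumed value chosen so that the example reproduces the expected output files in this diff.

```python
def trim_start(seq, trim_seqs):
    """Return seq with the first perfectly matching prefix removed.

    Assumption: at most one trim sequence is removed per read.
    """
    for trim in trim_seqs:
        if seq.startswith(trim):
            return seq[len(trim):]
    return seq


def trim_pairs(reads_1, reads_2, trim_seqs, min_length=50):
    """Trim both mates of each pair; keep the pair only if both are long enough."""
    kept_1, kept_2 = [], []
    for (name_1, seq_1), (name_2, seq_2) in zip(reads_1, reads_2):
        seq_1 = trim_start(seq_1, trim_seqs)
        seq_2 = trim_start(seq_2, trim_seqs)
        if len(seq_1) >= min_length and len(seq_2) >= min_length:
            kept_1.append((name_1, seq_1))
            kept_2.append((name_2, seq_2))
    return kept_1, kept_2


# Test data copied from the input files in this diff.
trim_seqs = ["TRIM1", "TRIM2", "TRIM3", "TRIM4"]
reads_1 = [
    ("1/1", "TRIM1GCTCGAGCT"),
    ("2/1", "TRIM1AGCTAGCTAG"),
    ("3/1", "CGCTAGCTAG"),
    ("4/1", "TRIM2AGCTAGCTAG"),
    ("5/1", "AGCTAGCTAG"),
    ("6/1", "TRIM4AGCTAGCTAG"),
]
reads_2 = [
    ("1/2", "TRIM1ACGTACGTAC"),
    ("2/2", "TRIM2ACGTAGTGA"),
    ("3/2", "ACGCTGCAGTCAGTCAGTAT"),
    ("4/2", "TRIM3CGATCGATCG"),
    ("5/2", "TRIM3CGATCGATCG"),
    ("6/2", "CGATCGATCG"),
]

# min_length=10 is an assumption; the script's own default is 50.
kept_1, kept_2 = trim_pairs(reads_1, reads_2, trim_seqs, min_length=10)
for (name_1, seq_1), (name_2, seq_2) in zip(kept_1, kept_2):
    print(name_1, seq_1, name_2, seq_2)
```

With these inputs, pairs 1 and 2 are dropped because one mate in each (`1/1` and `2/2`) is only 9 bases long after trimming, which leaves pairs 3 to 6 as in the expected output files.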