Question about barcode length in mock2 and mock6 #77

Open
MathieuCharles opened this issue Jul 26, 2017 · 1 comment

Comments

@MathieuCharles

Hello,

Thanks for the explanations in #76.

One more question, about mock2 and mock6.

The barcodes listed for mock2 and mock6 are 12 nt and 6 nt long, respectively, but the index reads in their mock-index-reads.fastq files are 13 nt and 7 nt long.

Example from mock2:
The barcode in sample_metadata.tsv is ATCTGCCTGGAA.
If I search for perfect matches in the index FASTQ file, I find:

     23 AATCTGCCTGGAA
 243167 ATCTGCCTGGAAA
    446 ATCTGCCTGGAAC
     17 ATCTGCCTGGAAG
      1 ATCTGCCTGGAAN
    681 ATCTGCCTGGAAT
     62 TATCTGCCTGGAA

Is the correct barcode ATCTGCCTGGAAA (with an extra A at the end)?
What is your advice?

The same problem occurs with mock6:

        ACCTGT           ACCTCG           ACCGCA
    951 AACCTGT      195 AACCTCG       58 AACCGCA
   2212 ACCTGTA   210433 ACCTCGA     1245 ACCGCAA
 277218 ACCTGTC    36791 ACCTCGC     5589 ACCGCAC
   4911 ACCTGTG     1878 ACCTCGG   312775 ACCGCAG
   1399 ACCTGTT     5707 ACCTCGT     2041 ACCGCAT
     24 CACCTGT       16 CACCTCG        1 GACCGCA
      1 GACCTGT       10 GACCTCG       90 TACCGCA
     46 TACCTGT        1 NACCTCG
                    1092 TACCTCG
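
Perfect-match counts like the ones above could be produced along the following lines. This is only a minimal sketch, not the command actually used for this issue: the function name `count_barcode_matches` is hypothetical, the FASTQ is assumed to be uncompressed with exactly four lines per record, and the example reads are made up to show the call.

```python
from collections import Counter
from typing import Iterable


def count_barcode_matches(fastq_lines: Iterable[str], barcode: str) -> Counter:
    """Count index reads that contain `barcode` as an exact substring."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        # Assumption: 4 lines per FASTQ record, so the sequence is line 2 of each.
        if i % 4 != 1:
            continue
        seq = line.strip()
        if barcode in seq:
            counts[seq] += 1
    return counts


# Made-up reads (not taken from the real file), only to demonstrate the call.
example = [
    "@read1", "ATCTGCCTGGAAA", "+", "IIIIIIIIIIIII",
    "@read2", "ATCTGCCTGGAAT", "+", "IIIIIIIIIIIII",
    "@read3", "ATCTGCCTGGAAA", "+", "IIIIIIIIIIIII",
    "@read4", "GGGGGGGGGGGGG", "+", "IIIIIIIIIIIII",
]
matches = count_barcode_matches(example, "ATCTGCCTGGAA")
for seq, n in sorted(matches.items()):
    print(n, seq)
```

For a real file, an open file handle can be passed in place of the list, for example `count_barcode_matches(open("mock-index-reads.fastq"), "ATCTGCCTGGAA")`.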

Many thanks for these datasets!

@nbokulich
Contributor

@MathieuCharles thanks for finding this issue! I had not noticed it previously because, evidently, it does not affect QIIME's ability to demultiplex these data.

I am still trying to figure out why the barcodes in the index files are 1 nt longer than those in the sample metadata. All data are provided by contributors, so this may take time to track down. For now, it seems reasonable to assume that the most common match (which in most cases appears to be more than 100-fold more common than the other matches) is the correct one.
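
The "most common match" rule can be checked per barcode with a few lines of Python. The sketch below is an illustration only, not part of QIIME: `dominant_variant` is a hypothetical helper, it assumes at least two observed variants, and the counts are copied from the mock2 table earlier in this issue.

```python
# Counts for barcode ATCTGCCTGGAA in mock2, copied from the table in this issue.
mock2_counts = {
    "AATCTGCCTGGAA": 23,
    "ATCTGCCTGGAAA": 243167,
    "ATCTGCCTGGAAC": 446,
    "ATCTGCCTGGAAG": 17,
    "ATCTGCCTGGAAN": 1,
    "ATCTGCCTGGAAT": 681,
    "TATCTGCCTGGAA": 62,
}


def dominant_variant(counts):
    """Return (most common sequence, its count divided by the runner-up's count)."""
    # Rank variants from most to least frequent; requires at least two entries.
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    (best_seq, best_n), (_, second_n) = ranked[0], ranked[1]
    return best_seq, best_n / second_n


best, fold = dominant_variant(mock2_counts)
print(best, round(fold))
```

For mock2 this selects ATCTGCCTGGAAA at roughly 357 times the count of the next variant. The margin varies by barcode: for ACCTCG in mock6, 210433 against 36791 is less than 6-fold, so it may be worth checking the ratio for each barcode rather than assuming it.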
