To-do list- data cleaning step #7

Open
1 of 2 tasks
gauravsk opened this issue Nov 3, 2017 · 7 comments
Comments

@gauravsk
Collaborator

gauravsk commented Nov 3, 2017

Data cleaning

  • Check whether the user has periods in file names; if so, stop the script and ask the user to rename the files.
  • Rewrite the Perl script in Python; make it put all of the cleaned output files into a new folder.
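
A minimal sketch of the file-name check from the first task, in Python. It assumes the reads are `.fastq` files in a single folder and that only periods other than the one before the extension are a problem. The function names are hypothetical and not part of the existing pipeline.

```python
import sys
from pathlib import Path


def names_with_extra_periods(folder, extension=".fastq"):
    """Return names of `extension` files whose stem contains a period.

    Assumption: the period that starts the extension is fine; any other
    period in the name is what the check should catch.
    """
    bad = []
    for path in sorted(Path(folder).iterdir()):
        name = path.name
        if path.is_file() and name.endswith(extension):
            stem = name[: -len(extension)]
            if "." in stem:
                bad.append(name)
    return bad


def check_file_names(folder, extension=".fastq"):
    """Stop the script with a message if any file name has extra periods."""
    bad = names_with_extra_periods(folder, extension)
    if bad:
        sys.exit(
            "Please rename these files so they contain no periods "
            "besides the extension: " + ", ".join(bad)
        )
```

Calling `check_file_names("path/to/reads")` at the very start of the run would stop before any cleaning happens, so the user can rename files and start over.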
@limey-bean
Owner

that would be rad!

@gauravsk gauravsk changed the title Summary stats from data cleaning steps? To-do list- please add tasks as new posts! Dec 6, 2017
@gauravsk gauravsk changed the title To-do list- please add tasks as new posts! To-do list- data cleaning step Dec 6, 2017
@gauravsk
Collaborator Author

gauravsk commented Dec 8, 2017

This may help with converting check_paired.pl to Python: https://github.com/enormandeau/Scripts/blob/master/fastqCombinePairedEnd.py

@limey-bean
Owner

I like it. The only issue is that it makes a single unpaired reads file. Would it be hard to make two unpaired reads files (F and R)?
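
A rough sketch of how two unpaired files could be produced, in Python. This is not the linked script's method or a port of check_paired.pl. It assumes plain four-line FASTQ records, read names that match between the F and R files once a trailing `/1` or `/2` is removed, and inputs small enough to hold the read IDs in memory. The output file names and the `prefix` and `out_dir` parameters are hypothetical.

```python
import os


def read_fastq(handle):
    """Yield (header, sequence, separator, quality) from a 4-line FASTQ file.

    Stops at end of file or at the first blank header line.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        sep = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, sep, qual


def read_id(header):
    """Read name without '@', any description, or a trailing /1 or /2."""
    name = header[1:].split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def ids_in(path):
    """Collect the set of read IDs found in one FASTQ file."""
    with open(path) as handle:
        return {read_id(rec[0]) for rec in read_fastq(handle)}


def split_one(in_path, other_ids, paired_path, unpaired_path):
    """Write reads whose mate exists to paired_path, the rest to unpaired_path."""
    n_paired = n_unpaired = 0
    with open(in_path) as src, open(paired_path, "w") as paired, \
            open(unpaired_path, "w") as unpaired:
        for rec in read_fastq(src):
            if read_id(rec[0]) in other_ids:
                paired.write("\n".join(rec) + "\n")
                n_paired += 1
            else:
                unpaired.write("\n".join(rec) + "\n")
                n_unpaired += 1
    return n_paired, n_unpaired


def split_pairs(forward_path, reverse_path, out_dir, prefix):
    """Write paired files plus separate unpaired F and R files into out_dir."""
    os.makedirs(out_dir, exist_ok=True)

    def out(suffix):
        return os.path.join(out_dir, f"{prefix}_{suffix}.fastq")

    fwd = split_one(forward_path, ids_in(reverse_path),
                    out("Paired_1"), out("Unpaired_F"))
    rev = split_one(reverse_path, ids_in(forward_path),
                    out("Paired_2"), out("Unpaired_R"))
    return {"paired": fwd[0], "unpaired_F": fwd[1], "unpaired_R": rev[1]}
```

Each paired file keeps the read order of its own input, so the two paired files line up only if the F and R inputs list their shared reads in the same order.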

@gauravsk
Collaborator Author

gauravsk commented Dec 8, 2017

Oh, yeah, I didn't mean to suggest that as a final version; it's more a note to self to look at how this script does it when I write one for our needs.

@limey-bean
Owner

Cool cool, it is a good template.

@gauravsk
Collaborator Author

gauravsk commented Feb 23, 2018

  • The QC folder is emptied out by the end of the full run. Need to figure out where, exactly, and maybe also remove the directory itself (right now the directory is kept, but it is empty). Alternatively, maybe add an option that lets users retain the QC'd files if they want to keep them.
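
One way the "remove or retain" choice could look in a Python version of the cleanup step. This is a sketch only: the function name and the `keep_qc` flag are hypothetical, and it does not address where the current pipeline empties the folder.

```python
import shutil
from pathlib import Path


def clean_up_qc(qc_dir, keep_qc=False):
    """Remove the QC directory and its contents unless the user asked to keep it.

    Returns True if the directory was removed, False if it was left alone
    (because keep_qc is set or the directory does not exist).
    """
    qc_path = Path(qc_dir)
    if keep_qc or not qc_path.is_dir():
        return False
    shutil.rmtree(qc_path)  # deletes the directory itself, not just its contents
    return True
```

With this shape, a single command-line switch could map onto `keep_qc`, and the default run would no longer leave an empty directory behind.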

@gauravsk
Collaborator Author

  • Get rid of the line that generates the error "QC/cutadapt_fastq/primer_sort/*_*_Paired_1.fastq not found" in every run.
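
One possible cause of this message, offered only as a guess since the thread does not identify the offending line, is a wildcard pattern that matches no files and is then passed along literally. In a Python rewrite, `glob.glob` returns an empty list in that case, so the step can be skipped quietly. The function name and the `handler` argument below are hypothetical.

```python
import glob


def process_matches(pattern, handler):
    """Call handler(path) for each file matching pattern.

    If nothing matches, no call is made and an empty list is returned,
    so an unmatched pattern never reaches the handler as a literal string.
    """
    paths = sorted(glob.glob(pattern))
    for path in paths:
        handler(path)
    return paths
```

For example, `process_matches("QC/cutadapt_fastq/primer_sort/*_*_Paired_1.fastq", print)` prints each matching file and does nothing when there are none.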
