design genome analysis workflow #3

Open
mafeeney opened this issue May 6, 2022 · 1 comment

Comments

@mafeeney
Collaborator

mafeeney commented May 6, 2022

something, MUMmer, bedtools, primer design

@widdowquinn
Collaborator

Assuming we start with a set of genomes that can be divided into pathogens and non-pathogens, we need to:

  1. divide the genomes into the correct groups (e.g. using MLST or other markers, or perhaps presence/absence of effectors/toxins) in Galaxy
  2. perform pairwise genome comparisons of pathogens against each other, and of pathogens against non-pathogens, with MUMmer in Galaxy (comparing a single reference pathogen genome against all non-pathogen genomes may be enough because of the set arithmetic: any region common to all pathogens is, by definition, present in the reference)
  3. use bedtools or similar to identify regions common to all pathogens, i.e. the intersection of the reference pathogen genome regions that align to every other pathogen genome (Galaxy)
  4. use bedtools or similar to identify which of those common regions are also present in at least one non-pathogen; these are discarded because they are not diagnostic of the pathogens (Galaxy)
  5. use a primer design tool to design primers against the reference pathogen genome, and keep only those that amplify a region unique to/diagnostic of pathogens (Galaxy)
  6. test the designed primers in silico to ensure they amplify all the known pathogen genomes (Galaxy)
  7. test the designed primers in silico to ensure they do not amplify any known non-pathogens (Galaxy)
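
The set arithmetic in steps 3 and 4 can be sketched in plain Python to make the logic concrete. This is only an illustration, not the Galaxy workflow itself: it assumes the MUMmer alignments have already been reduced to half-open `(start, end)` intervals on the reference pathogen genome, and the function names (`merge`, `intersect`, `subtract`, `diagnostic_regions`) are hypothetical. In practice, `bedtools intersect` and `bedtools subtract` would do the equivalent work on BED files.

```python
from typing import List, Optional, Tuple

# Half-open interval [start, end) in reference pathogen genome coordinates.
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the regions covered by both interval sets."""
    a, b = merge(a), merge(b)
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start, end = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def subtract(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the regions of a that are not covered by b."""
    a, b = merge(a), merge(b)
    out: List[Interval] = []
    for start, end in a:
        cursor = start
        for b_start, b_end in b:
            if b_end <= cursor:
                continue
            if b_start >= end:
                break
            if b_start > cursor:
                out.append((cursor, b_start))
            cursor = max(cursor, b_end)
            if cursor >= end:
                break
        if cursor < end:
            out.append((cursor, end))
    return out


def diagnostic_regions(
    pathogen_alignments: List[List[Interval]],
    nonpathogen_alignments: List[List[Interval]],
) -> List[Interval]:
    """Step 3: intersect across pathogens; step 4: remove non-pathogen hits."""
    core: Optional[List[Interval]] = None
    for regions in pathogen_alignments:
        core = merge(regions) if core is None else intersect(core, regions)
    shared = [iv for regions in nonpathogen_alignments for iv in regions]
    return subtract(core or [], shared)
```

Each inner list holds the reference regions aligned by one other genome, so `diagnostic_regions([[(0, 100), (200, 300)], [(50, 250)]], [[(60, 70)], [(240, 400)]])` keeps only the reference regions that every pathogen covers and no non-pathogen touches.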

The remaining primer sets after this process are candidate diagnostic primers that positively amplify pathogens, but not non-pathogens. We can then…

  1. test the candidate primers against the RefSeq genome database at NCBI to ensure there is no wider off-target amplification (NCBI)

The last step might be a stretch goal.
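
For the in silico tests in steps 6 and 7, a rough sketch of the pass/fail logic is below. It is a deliberately naive stand-in for a real in silico PCR tool: it assumes exact primer matches only (no mismatches or degenerate bases), linear sequences given as plain strings, and a hypothetical `max_len` cutoff on product size. All function names are illustrative.

```python
from typing import Dict, List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_all(haystack: str, needle: str) -> List[int]:
    """All start positions of needle in haystack, overlaps included."""
    positions = []
    pos = haystack.find(needle)
    while pos != -1:
        positions.append(pos)
        pos = haystack.find(needle, pos + 1)
    return positions


def amplicon_lengths(
    genome: str, forward: str, reverse: str, max_len: int = 3000
) -> List[int]:
    """Lengths of predicted products for one primer pair on one genome.

    Assumption: exact matching only. The reverse primer binds the opposite
    strand, so we search for its reverse complement downstream of the
    forward primer, on both orientations of the genome.
    """
    fwd = forward.upper()
    rev_site = reverse_complement(reverse)
    lengths = []
    for strand in (genome.upper(), reverse_complement(genome)):
        rev_positions = find_all(strand, rev_site)
        for f in find_all(strand, fwd):
            for r in rev_positions:
                length = r + len(rev_site) - f
                # Require the reverse site to lie fully downstream.
                if r >= f + len(fwd) and length <= max_len:
                    lengths.append(length)
    return lengths


def is_diagnostic(
    forward: str,
    reverse: str,
    pathogens: Dict[str, str],
    nonpathogens: Dict[str, str],
) -> bool:
    """Step 6: amplifies every pathogen; step 7: amplifies no non-pathogen."""
    hits_all = all(
        amplicon_lengths(seq, forward, reverse) for seq in pathogens.values()
    )
    hits_none = not any(
        amplicon_lengths(seq, forward, reverse) for seq in nonpathogens.values()
    )
    return hits_all and hits_none
```

A primer pair would be kept as a candidate only if `is_diagnostic` returns `True` for the full pathogen and non-pathogen sets; a real tool would also need to tolerate a few mismatches and report product sizes.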
