Re-implement the code that reads input files in parallel. #230

Open
sebhtml opened this issue Feb 20, 2014 · 0 comments

sebhtml commented Feb 20, 2014

The code that counts the sequences in each file (Partitioner) is fine.

But the code that reads the sequences from the files afterwards is not very good.

The problem is that too many processes are reading the same file at once.

The code can't really use MPI I/O for that directly because (I think) MPI I/O functions are collective.

One thing that would be great:

Have just one process take care of each file and dispatch its sequences to the other ranks / actors.

code/SequencesLoader/SequencesLoader.cpp

The metadata has to be sent too (LEFT_READ, RIGHT_READ, PAIR MATE and so on).

This is not trivial because code/SequencesLoader/SequencesLoader.cpp is
quite ugly.
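
A minimal sketch of the idea, not the actual SequencesLoader code. It assumes a round-robin policy both for picking the single reader of a file and for picking the destination rank of each sequence, and it assumes paired reads are interleaved in the file. All names here (`ReadType::UNPAIRED`, `SequenceMessage`, `readerRankForFile`, `destinationRankForSequence`, `dispatchFile`) are hypothetical. The dispatch is simulated with in-memory outboxes; a real version would send each outbox with point-to-point messages.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical metadata tags. The issue names LEFT_READ and RIGHT_READ;
// the real definitions in the code base may differ.
enum class ReadType { UNPAIRED, LEFT_READ, RIGHT_READ };

// Hypothetical payload: one sequence plus the metadata that travels with it.
struct SequenceMessage {
    std::string sequence;
    ReadType type;
    std::size_t indexInFile;  // position of the read in its file
    long mateIndex;           // index of the mate in the same file, -1 if none
};

// Assumption: exactly one reader rank per file, chosen round-robin,
// so that each file is opened by a single process.
int readerRankForFile(std::size_t fileIndex, int rankCount) {
    assert(rankCount > 0);
    return static_cast<int>(fileIndex % static_cast<std::size_t>(rankCount));
}

// Assumption: sequences are spread over the ranks round-robin.
int destinationRankForSequence(std::size_t sequenceIndex, int rankCount) {
    assert(rankCount > 0);
    return static_cast<int>(sequenceIndex % static_cast<std::size_t>(rankCount));
}

// Simulates the work of the single reader of one file: walk its sequences
// and fill one outbox per destination rank, metadata included.
std::vector<std::vector<SequenceMessage>> dispatchFile(
        const std::vector<std::string>& sequences,
        bool interleavedPairs,
        int rankCount) {
    assert(rankCount > 0);
    std::vector<std::vector<SequenceMessage>> outboxes(
        static_cast<std::size_t>(rankCount));

    for (std::size_t i = 0; i < sequences.size(); ++i) {
        SequenceMessage message{sequences[i], ReadType::UNPAIRED, i, -1};

        if (interleavedPairs) {
            if (i % 2 == 0 && i + 1 < sequences.size()) {
                // Even index with a successor: left read, mate follows.
                message.type = ReadType::LEFT_READ;
                message.mateIndex = static_cast<long>(i + 1);
            } else if (i % 2 == 1) {
                // Odd index: right read, mate precedes.
                message.type = ReadType::RIGHT_READ;
                message.mateIndex = static_cast<long>(i - 1);
            }
            // A trailing read without a mate stays UNPAIRED.
        }

        int destination = destinationRankForSequence(i, rankCount);
        outboxes[static_cast<std::size_t>(destination)].push_back(message);
    }
    return outboxes;
}
```

With this layout the two mates of a pair can land on different ranks, which is why the mate index has to travel with each sequence.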
