Not completed in overlap_collect step #425

Open
kjyunm opened this issue Nov 10, 2024 · 1 comment

kjyunm commented Nov 10, 2024

Hello,
I am trying to build a pangenome using pig data from multiple breeds. The input data is 4.5GB when compressed, and I am using the Docker image ghcr.io/pangenome/pggb:latest.

I ran the following command:
docker run -it -v ${PWD}:/data -u $(id -u):$(id -g) ghcr.io/pangenome/pggb:latest pggb -i /data/pigs.assembly.fa.gz -o /data/out -t 20

However, execution gets stuck at the following step:
[seqwish::transclosure] 25082.955 81.34% 12280027975-12290027975 overlap_collect
It has not progressed for over a week.

Could you please advise whether any additional options or preprocessing steps are needed? I would appreciate any suggestions on what causes this hang and how to resolve it.

Thank you!

AndreaGuarracino (Member) commented

@kjyunm maybe you hit a seqwish bug? You could try increasing -k/--min-match-len in pggb (23 by default), which filters out exact matches shorter than that length, to make seqwish's life a bit easier.
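
As a rough sketch of that suggestion, the original command could be re-run with a larger -k. The value 47 and the output directory name below are illustrative assumptions, not recommendations from the maintainers; a suitable value depends on the divergence between the assemblies. The script only prints the command by default.

```shell
# Sketch: re-run pggb with a larger minimum match length for seqwish.
# MIN_MATCH_LEN=47 is an arbitrary illustrative value above the default of 23.
MIN_MATCH_LEN=47

# Hypothetical separate output directory so the earlier run is not overwritten.
OUT_DIR="/data/out_k${MIN_MATCH_LEN}"

# Same inputs and thread count as the original command, plus -k.
PGGB_CMD="pggb -i /data/pigs.assembly.fa.gz -o ${OUT_DIR} -t 20 -k ${MIN_MATCH_LEN}"

# Print the command for review before launching it.
echo "${PGGB_CMD}"

# To actually run it inside the container, uncomment the next line:
# docker run -it -v "${PWD}:/data" -u "$(id -u):$(id -g)" ghcr.io/pangenome/pggb:latest ${PGGB_CMD}
```

Writing to a separate output directory keeps the partial results of the stalled run available for comparison or for a bug report.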
