output of pangenome #4
Comments
Hi,

The output files from the WGA-based pan-genome construction are:

pan-genome.wga.conensus.stats # pan-genome size
pan-genome.wga.core.stats # core-genome size
pan-genome.wga.newseq.stats # new sequence size

Since we have eight genomes, each result file contains eight lines. Each line gives the pan-genome, core-genome, or new sequence size for a given number of input genomes (from 1 to 8). Each line is tab-separated, and each number is the size calculated for a different combination of genomes. The files with the prefix tmp are just intermediate files.
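For readers who want to inspect one of these .stats files, here is a minimal Python sketch of how it could be read, based only on the description above (one line per number of input genomes, tab-separated sizes). The function names, the `has_label_column` flag, and the example numbers are all hypothetical; the thread does not make clear whether each line starts with a label column, so that is left as an option rather than assumed.

```python
from statistics import mean


def parse_stats(text, has_label_column=False):
    """Parse the contents of a tab-separated .stats file.

    Returns one list of sizes per non-empty line. Line k (1-based) is
    assumed to hold the sizes computed with k input genomes.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split("\t")
        if has_label_column:
            # Assumption: the first field is a label, not a size.
            fields = fields[1:]
        rows.append([int(field) for field in fields])
    return rows


def summarize(rows):
    """Return (number of genomes, min, mean, max) for each line."""
    return [(k, min(r), mean(r), max(r)) for k, r in enumerate(rows, start=1)]


# Made-up numbers for three genomes, only to show the shape of the data.
example = "100\t110\t105\n120\t125\t130\t128\t122\t126\n140\t141\t139\n"
rows = parse_stats(example)
for k, smallest, average, largest in summarize(rows):
    print(k, smallest, average, largest)
```

To use it on a real file, pass the file contents, for example `parse_stats(open("pan-genome.wga.conensus.stats").read())`.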
I don't fully understand. The picture shows my result. There are only three samples in my project. What do the columns mean? Are they sizes or regions? And can I get a final pan-genome of the three samples?
------------------ Original message ------------------
From: "Wen-Biao Jiao"<notifications@github.com>;
Sent: Monday, 31 August 2020, 3:59 PM
To: "schneebergerlab/AMPRIL-genomes"<AMPRIL-genomes@noreply.github.com>;
Cc: "fangying"<365698105@qq.com>; "Author"<author@noreply.github.com>;
Subject: Re: [schneebergerlab/AMPRIL-genomes] output of pangenome (#4)
In your case, with three genomes, the last line of the file pan-genome.wga.conensus.stats should contain three numbers, because each genome can be selected as the reference, with the other genomes compared against it. Each number on the last line is the pan-genome size (the total length of non-redundant sequences) when all three genomes are included. If you just want one number for the pan-genome including all your genomes, select one of them.
Take pan-genome.wga.conensus.stats as an example. (1) Why are there three columns in the first line, and what does each column mean? (2) Why are there six columns in the second line but three columns in the third line? (3) The file only gives the size of the pan-genome and doesn't show the regions that the pan-genome comes from, so I can't select and merge sequences from the original three genomes to get a final pan-genome FASTA file. Is that right? Thanks for your patient answers.
------------------ Original message ------------------
From: "Wen-Biao Jiao"<notifications@github.com>;
Sent: Monday, 31 August 2020, 4:42 PM
To: "schneebergerlab/AMPRIL-genomes"<AMPRIL-genomes@noreply.github.com>;
Cc: "fangying"<365698105@qq.com>; "Author"<author@noreply.github.com>;
Subject: Re: [schneebergerlab/AMPRIL-genomes] output of pangenome (#4)
Again, since you have three genomes, each genome can be selected as the reference. The first line means that if you select just one genome to build the pan-genome, you have three choices (any one of your three genomes). All columns except the first column indicate the pan-genome size. The second line means that you select two genomes from your three, which gives 2*3 combinations (again, any genome can be the reference). Yes, this only gives the size of the pan-genome, because these scripts were written just for the project we recently did, not as a general method for pan-genome building.
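The column counts reported in this thread (3, 6, and 3 for three genomes) fit one simple reading of the explanation above: for k input genomes, every subset of k genomes is counted once per choice of reference within it, giving C(n, k) * k values on line k. The sketch below only checks that arithmetic; the function name is hypothetical, and whether the scripts enumerate every combination for larger n is an assumption, not something the thread confirms.

```python
from math import comb


def expected_columns(n_genomes):
    """Number of size values expected on each line, under the assumption
    that line k holds one value per (subset of k genomes, reference in
    that subset) pair, i.e. C(n, k) * k values."""
    return [comb(n_genomes, k) * k for k in range(1, n_genomes + 1)]


print(expected_columns(3))  # [3, 6, 3]
```

Under this reading the last line always has n values, one per reference genome, which agrees with the statement that the last line should contain three numbers for three genomes.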
Hello! I got the pan-genome results, but I don't know what the files mean: for example pan-genome.wga.conensus.stats, pan-genome.wga.core.stats, pan-genome.wga.newseq.stats, and the files in tmp such as An-1.com-2.wga.bed, tmp.An-1.C24.bed, and so on.