output of pangenome #4

liufy11 · 2020-08-31T07:53:43Z

Hello! I have the pan-genome results, but I don't know what the files mean, for example pan-genome.wga.conensus.stats, pan-genome.wga.core.stats, and pan-genome.wga.newseq.stats, as well as the files in tmp such as An-1.com-2.wga.bed, tmp.An-1.C24.bed, and so on.

wen-biao · 2020-08-31T07:59:16Z

Hi,

These are the output files from the pan-genome construction based on WGA:

pan-genome.wga.conensus.stats # for the pan-genome size

pan-genome.wga.core.stats # for the core-genome size

pan-genome.wga.newseq.stats # for the new sequence size

Since we have eight genomes, each result file contains eight lines.

Each line gives the pan-genome, core-genome, or new-sequence size for a given number of input genomes (from 1 to 8).

Each line is tab-separated, and each number is the size calculated for one combination of genomes.

The files with the prefix tmp are just intermediate files.
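As a rough illustration of the layout described above, the following sketch reads the text of one of the stats files and groups the numbers by line. The function name `parse_stats`, the `first_col_is_label` switch, and the numbers in `example` are made up for illustration and are not part of the repository's scripts. It assumes every field is an integer size; if the first field of each line is a label rather than a size, set the switch so it is skipped.

```python
from typing import Dict, List


def parse_stats(text: str, first_col_is_label: bool = False) -> Dict[int, List[int]]:
    """Map the 1-based line number (number of genomes) to the sizes on that line.

    Hypothetical helper: assumes tab-separated integer fields, one line per
    number of input genomes, as described in the thread.
    """
    sizes: Dict[int, List[int]] = {}
    n_genomes = 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        n_genomes += 1
        fields = line.split("\t")
        if first_col_is_label:
            fields = fields[1:]  # drop a leading non-size column, if any
        sizes[n_genomes] = [int(f) for f in fields]
    return sizes


# Made-up numbers in the shape of a three-genome result (not real output).
example = "100\t110\t105\n130\t128\t131\t129\t132\t127\n140\t141\t139\n"
sizes = parse_stats(example)
for n, values in sizes.items():
    print(n, len(values), min(values), max(values))
```

To use it on a real file, pass the file's contents, for example `parse_stats(open("pan-genome.wga.conensus.stats").read())`.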

liufy11 · 2020-08-31T08:15:15Z

I don't fully understand. The picture shows my result. There are only three samples in my project. What do the columns mean? Are they sizes or regions? And can I get a final pan-genome of the three samples?


wen-biao · 2020-08-31T08:41:54Z

In your case, with three genomes, the last line of pan-genome.wga.conensus.stats should contain three numbers, because each genome can be selected as the reference, with the other genomes compared against it. Each number on the last line is the pan-genome size (total length of non-redundant sequence) when all three genomes are included.

If you just want one number for the pan-genome including all your genomes, pick any one of them.

liufy11 · 2020-08-31T08:59:42Z

Take pan-genome.wga.conensus.stats as an example. (1) Why are there three columns in the first line, and what does each column mean? (2) Why are there six columns in the second line but three columns in the third line? (3) The file only gives the size of the pan-genome and doesn't show the regions it comes from, so I can't select and merge sequences from the original three genomes to get a final pan-genome FASTA file. Is that right? Thanks for your patient answers.


wen-biao · 2020-08-31T09:09:48Z

Again, since you have three genomes, each genome can be selected as the reference. The first line means that if you select just one genome to build the pan-genome, you have three choices (any one of your three genomes). All columns except the first column indicate the pan-genome size. The second line means that you select two of your three genomes, which gives 2*3 combinations (again, any genome can be the reference).
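The counting in this reply can be sketched as follows. Reading "2*3" as "choose a subset of k genomes, then let any of the k be the reference" gives k × C(n, k) values on line k. That generalisation is an assumption drawn from this thread, not something confirmed from the scripts, and `expected_values_per_line` is a hypothetical helper name. For three genomes it reproduces the 3, 6, and 3 columns reported above.

```python
from math import comb  # available in Python 3.8+


def expected_values_per_line(n_genomes: int) -> list:
    """Number of values expected on each line, for lines 1..n_genomes.

    Assumption: line k holds one value per (subset of k genomes, reference
    genome within that subset) pair, i.e. k * C(n_genomes, k) values.
    """
    return [k * comb(n_genomes, k) for k in range(1, n_genomes + 1)]


print(expected_values_per_line(3))  # [3, 6, 3]
```

If the files follow this pattern, the last line always has as many values as there are genomes, one per choice of reference.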

Yes, it only gives the size of the pan-genome, because these scripts were written just for the project we recently did, not as a general method for building a pan-genome.
