Share workflow for visualizing FastANI results #100
This can be helpful for many users. Thanks for sharing. |
Thanks a lot for making this code!
Any idea what was wrong? |
I think that error occurs when the If it still doesn't seem to work, please upload your |
Guess the issue is still there as now I got:
I have uploaded my *.visual file zipped. Thanks a lot for the prompt support!! |
How did you get the plot with just the '.visual' file? |
Ah, now I know what caused the error. You have the wrong file set for the third argument.
|
🙈 what a dumb error...sorry and thanks for answering!
If I run the plotting function in R from the fastANI repository ( |
Probably your input FASTA file is multi-record, right? I have modified visualize.py to work with multi-record FASTA files, so please try again with the modified program. |
It works now, thanks for the fast fixing! |
Yes.
Yes. |
Hi, genome cluster |
From the clustered ANI matrix (ANIclustermap_matrix.tsv) generated by ANIclustermap, it is possible to create a table like the one you show in your example.

Example code

Below is an example code.

    import sys

    import pandas as pd

    # Get arguments
    args = sys.argv
    matrix_tsv_file = args[1]  # ANIclustermap_matrix.tsv
    cluster_ani_thr = float(args[2])  # e.g. 95.0
    cluster_tsv_file = args[3]  # Output cluster table file

    # Parse cluster ANI matrix (columns are genomes in clustered order)
    df = pd.read_table(matrix_tsv_file)
    cluster_id = 1
    cluster_base_idx = 0
    cluster_size_record = 1
    genome_name2cluster_id = {}
    for i, genome_name in enumerate(df.columns):
        # Square block from the first genome of the current cluster to genome i
        cluster_candidate_df = df.iloc[cluster_base_idx : i + 1, cluster_base_idx : i + 1]
        ani_thr_match_count = (cluster_candidate_df > cluster_ani_thr).sum().sum()
        # If any pair in the block is not above the threshold, start a new cluster
        if ani_thr_match_count != cluster_size_record**2:
            cluster_id += 1
            cluster_base_idx = i
            cluster_size_record = 1
        genome_name2cluster_id[genome_name] = cluster_id
        cluster_size_record += 1

    # Output cluster table
    cluster_table_dict = {
        "genome": list(genome_name2cluster_id.keys()),
        "cluster_id": list(genome_name2cluster_id.values()),
    }
    cluster_table_df = pd.DataFrame(cluster_table_dict)
    cluster_table_df.to_csv(cluster_tsv_file, sep="\t", index=False)

Example command

You can create a cluster table with the following command. Use the example dataset here.
cluster_table.tsv
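
To make the grouping rule in the script above easier to follow, here is a minimal pandas-free sketch on a toy matrix. The function name `assign_clusters` and the four-genome matrix are hypothetical and only for illustration. Like the script, it assumes the genomes are already in clustered order, so that members of one cluster sit next to each other.

```python
from typing import Dict, List


def assign_clusters(names: List[str], matrix: List[List[float]], thr: float) -> Dict[str, int]:
    """Walk genomes in clustered order; open a new cluster when the block breaks."""
    cluster_id = 1
    base = 0  # index of the first genome in the current cluster
    result: Dict[str, int] = {}
    for i, name in enumerate(names):
        # Every pairwise ANI in the square block base..i must exceed the threshold
        block_ok = all(
            matrix[r][c] > thr
            for r in range(base, i + 1)
            for c in range(base, i + 1)
        )
        if not block_ok:
            cluster_id += 1
            base = i
        result[name] = cluster_id
    return result


# Hypothetical toy data: g1/g2 are close, g3/g4 are close, the two pairs are distant
names = ["g1", "g2", "g3", "g4"]
matrix = [
    [100.0, 98.0, 80.0, 79.0],
    [98.0, 100.0, 81.0, 80.0],
    [80.0, 81.0, 100.0, 96.5],
    [79.0, 80.0, 96.5, 100.0],
]
clusters = assign_clusters(names, matrix, 95.0)
```

With a threshold of 95.0, g1 and g2 end up in one cluster and g3 and g4 in another, because adding g3 to the first block introduces pairs at or below the threshold.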
|
This is really cool :-) |
Hi again, |
Hi, @shlomobl
Yes. ANIclustermap will output the following message if it reuses the results of a previous ANI calculation.
|
Hi,
This is not an issue report, but I would like to share my workflow for visualizing FastANI results as I think it will be useful for other users.
1. Visualize all-vs-all ANI matrix
I developed a Python tool, ANIclustermap, for visualizing FastANI results. It automatically runs the whole workflow, from all-vs-all ANI calculation among genomes to clustering and visualization of the ANI matrix. It can output the following figure, a clustered ANI matrix, and a Newick-format dendrogram.
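
As a rough illustration of the first step in a workflow like this (not ANIclustermap's actual implementation), FastANI's tab-separated output rows (query, reference, ANI, mapped fragments, total fragments) can be folded into an all-vs-all matrix. The function name `build_ani_matrix`, the sample rows, and the choice to fill unreported pairs with 0.0 are all assumptions made for this sketch.

```python
from typing import Dict, Iterable, List, Tuple


def build_ani_matrix(
    lines: Iterable[str], missing: float = 0.0
) -> Tuple[List[str], List[List[float]]]:
    """Fold FastANI-style rows (query, reference, ANI, ...) into a square matrix."""
    ani: Dict[Tuple[str, str], float] = {}
    names: List[str] = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        query, ref, value = fields[0], fields[1], float(fields[2])
        for name in (query, ref):
            if name not in names:
                names.append(name)  # keep first-seen order
        ani[(query, ref)] = value
    # Pairs with no reported row get the placeholder value (assumption: 0.0)
    matrix = [[ani.get((q, r), missing) for r in names] for q in names]
    return names, matrix


# Hypothetical sample rows, not real FastANI results
sample = [
    "a.fna\ta.fna\t100\t10\t10",
    "a.fna\tb.fna\t97.5\t9\t10",
    "b.fna\ta.fna\t97.3\t9\t10",
    "b.fna\tb.fna\t100\t10\t10",
]
names, matrix = build_ani_matrix(sample)
```

The matrix is kept asymmetric here because the query-to-reference and reference-to-query values can differ slightly; a real pipeline would then cluster and plot it.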
2. Visualize Conserved Regions between Two Genomes
A visualization R script using genoPlotR is already provided, but I thought it would be nice to have one for Python users, so I created a Python script (visualize.py) that can plot the following figures. See the repository for details of the script.
Example 1. colormap=hsv, link_color=grey, curve=False (Default)
Example 2. colormap=viridis, link_color=red, curve=True
Hope this information helps other users.
Thanks.