multidigraph_to_digraph transitive reduction option #16

Merged: dhimmel merged 4 commits into main from transitive-reduction on Nov 23, 2021


@dhimmel dhimmel commented Nov 23, 2021

Background information: https://twitter.com/larsjuhljensen/status/1450188835032375300

Ontologies such as GO can be reduced to a minimum equivalent graph when collapsing multiple relationship types into a single-relationship-type DiGraph.
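To illustrate the idea on something smaller than GO, here is a minimal sketch using plain networkx. The node names and relationship keys are hypothetical, and this only approximates what `multidigraph_to_digraph` does: it shows how an edge of one relationship type can become redundant once all types are collapsed into a single DiGraph.

```python
import networkx as nx

# Toy multigraph with several relationship types (hypothetical names).
mdg = nx.MultiDiGraph()
mdg.add_edge("A", "B", key="is_a")
mdg.add_edge("A", "B", key="part_of")    # parallel edge, merged on collapse
mdg.add_edge("B", "C", key="is_a")
mdg.add_edge("A", "C", key="regulates")  # implied by A -> B -> C once types are ignored

# Collapsing to a DiGraph discards relationship types and parallel edges.
dg = nx.DiGraph(mdg)

# transitive_reduction is only defined for directed acyclic graphs.
assert nx.is_directed_acyclic_graph(dg)
reduced = nx.transitive_reduction(dg)

# Edges present before the reduction but absent after it.
removed = sorted(set(dg.edges) - set(reduced.edges))
print(removed)
```

Note that `nx.transitive_reduction` returns a new graph without node or edge attributes, so any attributes that matter would need to be copied over separately.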


dhimmel commented Nov 23, 2021

GO edges removed by reduction

Transitive reduction of GO went from 87,436 to 79,946 by removing 7,490 redundant edges.

Examples of removed edges:

  • GO:0002213 --> GO:1900366
    defense response to insect --> negative regulation of defense response to insect
  • GO:0016331 --> GO:0021994
    morphogenesis of embryonic epithelium --> progression of neural tube closure
  • GO:0033621 --> GO:0120272
    nuclear-transcribed mRNA catabolic process, meiosis-specific transcripts --> positive regulation of nuclear-transcribed mRNA catabolic process, meiosis-specific transcripts
  • GO:0034645 --> GO:2000113
    cellular macromolecule biosynthetic process --> negative regulation of cellular macromolecule biosynthetic process
  • GO:0035710 --> GO:2000515
    CD4-positive, alpha-beta T cell activation --> negative regulation of CD4-positive, alpha-beta T cell activation
  • GO:0043500 --> GO:0014745
    muscle adaptation --> negative regulation of muscle adaptation
  • GO:0044839 --> GO:1902750
    cell cycle G2/M phase transition --> negative regulation of cell cycle G2/M phase transition
  • GO:0097695 --> GO:1904914
    establishment of protein-containing complex localization to telomere --> negative regulation of establishment of protein-containing complex localization to telomere
  • GO:1900868 --> GO:1900972
    sarcinapterin biosynthetic process --> negative regulation of sarcinapterin biosynthetic process
  • GO:1990048 --> GO:1901952
    anterograde neuronal dense core vesicle transport --> negative regulation of anterograde dense core granule transport

Source

import random
import networkx as nx
import pronto
from nxontology.imports import pronto_to_multidigraph, multidigraph_to_digraph

# Load the 2021-02-01 release of GO and convert it to a networkx MultiDiGraph.
url = "http://release.geneontology.org/2021-02-01/ontology/go-basic.json.gz"
go_pronto = pronto.Ontology(handle=url)
go_multidigraph = pronto_to_multidigraph(go_pronto)

# Collapse relationship types into a DiGraph, without and with transitive reduction.
go_digraph_full = multidigraph_to_digraph(go_multidigraph, reduce=False)
go_digraph_reduced = multidigraph_to_digraph(go_multidigraph, reduce=True)

# Edges present in the full graph but absent after the reduction.
removed_edges = sorted(go_digraph_full.edges - go_digraph_reduced.edges)
print(f"Transitive reduction of GO went from {go_digraph_full.number_of_edges():,} to {go_digraph_reduced.number_of_edges():,} by removing {len(removed_edges):,} redundant edges.")

# Print a reproducible random sample of 10 removed edges with their term names.
random.seed(0)
sample_removed_edges = sorted(random.sample(removed_edges, k=10))
for source, target in sample_removed_edges:
    print(f"- `{source}` --> `{target}`\n   {go_pronto.get_term(source).name} --> {go_pronto.get_term(target).name}")


dhimmel commented Nov 23, 2021

Visualizing redundant edges

From AmiGO under Graph Views (Graphical View PNG), here are the ancestors (superterms) of GO:1900366 (negative regulation of defense response to insect, the first example in the previous comment):

[Image: AmiGO graph view of the ancestors of GO:1900366]

Notice the red edge from "negative regulation of defense response to insect" to "defense response to insect". This edge, along with the other red edges and some black edges, is removed by the transitive reduction.
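One way to confirm that such an edge is redundant is to check whether the target is still reachable from the source after the direct edge is dropped. The sketch below does this on a hypothetical three-node graph loosely modeled on the figure. The intermediate "regulation of ..." node and the helper name `is_redundant` are assumptions for illustration, not taken from nxontology.

```python
import networkx as nx

def is_redundant(graph, source, target):
    """Return True if target stays reachable from source without the direct edge."""
    pruned = graph.copy()
    pruned.remove_edge(source, target)
    return nx.has_path(pruned, source, target)

# Hypothetical toy graph with edges oriented from superterm to subterm.
toy = nx.DiGraph()
toy.add_edge(
    "defense response to insect",
    "regulation of defense response to insect",
)
toy.add_edge(
    "regulation of defense response to insect",
    "negative regulation of defense response to insect",
)
# Direct edge that duplicates the two-step path above.
toy.add_edge(
    "defense response to insect",
    "negative regulation of defense response to insect",
)

direct_is_redundant = is_redundant(
    toy,
    "defense response to insect",
    "negative regulation of defense response to insect",
)
print(direct_is_redundant)
```

For a directed acyclic graph, the edges for which this check returns True are exactly the ones that transitive reduction removes.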


dhimmel commented Nov 23, 2021

Hat tip to @eric-czech for first alerting me that read_gene_ontology was producing redundant edges that could be removed by transitive reduction. @eric-czech IIRC you mentioned the reduction step was prohibitively slow. But it's only taking a couple seconds for me on GO. Maybe you were referring to a different ontology? Hence, I think it makes sense to apply the reduction by default to GO.

@dhimmel dhimmel merged commit fc2e92f into main Nov 23, 2021
@dhimmel dhimmel deleted the transitive-reduction branch November 23, 2021 21:24
@eric-czech

you mentioned the reduction step was prohibitively slow. But it's only taking a couple seconds for me on GO. Maybe you were referring to a different ontology?

Hmm could be, though I thought I had attempted it on GO. I was working with the reverse orientation of the edges so maybe that influences the running time a lot?
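The orientation question could be checked empirically. Below is a rough sketch that times `nx.transitive_reduction` on a graph and on its reversed copy. It uses a small random DAG as a stand-in, so the helper names, graph size, and edge probability are all assumptions, and timings on GO itself may look quite different.

```python
import random
import time
import networkx as nx

def random_dag(n_nodes, edge_prob, seed=0):
    """Build a random DAG by only adding edges from lower to higher node ids."""
    rng = random.Random(seed)
    dag = nx.DiGraph()
    dag.add_nodes_from(range(n_nodes))
    dag.add_edges_from(
        (i, j)
        for i in range(n_nodes)
        for j in range(i + 1, n_nodes)
        if rng.random() < edge_prob
    )
    return dag

def timed_reduction(dag):
    """Return the transitive reduction and the elapsed wall-clock seconds."""
    start = time.perf_counter()
    reduced = nx.transitive_reduction(dag)
    return reduced, time.perf_counter() - start

dag = random_dag(n_nodes=300, edge_prob=0.05)
forward, forward_seconds = timed_reduction(dag)
backward, backward_seconds = timed_reduction(dag.reverse())  # reverse() returns a copy

print(f"original orientation: {forward_seconds:.3f}s")
print(f"reversed orientation: {backward_seconds:.3f}s")
```

Both orientations should yield the same reduced edge set up to direction, so any difference would be in running time only.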
