Skip to content
This repository has been archived by the owner on Oct 8, 2021. It is now read-only.

discussion: API changes #665

Closed
4 tasks done
sbromberger opened this issue Jun 23, 2017 · 21 comments
Closed
4 tasks done

discussion: API changes #665

sbromberger opened this issue Jun 23, 2017 · 21 comments
Labels
decision requires a decision to be made on appropriate action discussion requires in-depth discussion - participation encouraged
Milestone

Comments

@sbromberger
Copy link
Owner

sbromberger commented Jun 23, 2017

Proposal:

This would simplify the API to the following functions:

  • nv
  • ne
  • vertices
  • edges
  • is_directed
  • has_vertex
  • has_edge
  • in_neighbors
  • out_neighbors
  • zero

All changes would move through deprecation for at least one minor version.

cc @jpfairbanks @juliohm

@sbromberger sbromberger added decision requires a decision to be made on appropriate action discussion requires in-depth discussion - participation encouraged labels Jun 23, 2017
@jpfairbanks
Copy link
Contributor

This came out of a discussion at JuliaCon about what if any difference there is between fadj/badj and in_neighbors/out_neighbors. Seth proposes that fadj/badj is an iterable of the edges that leave/enter the vertex, while in_neighbors/out_neighbors is an iterable over the set of neighbors which ignores the multiplicity in a multigraph.

Since we have never supported Multigraphs, this semantic debate was never an issue. Almost none of our algorithms would correctly handle multigraphs if we just dropped them in without modifying the algorithms.

Does anyone have strong opinions about these names? /bikeshed.

I agree that we should have different names for these concepts. A quick google search shows that "Forward Adjacencies" has been used in RFC4655.
The phrase appears once in this paragraph:

There is benefit in knowing which TE LSPs exist, and their routing,
to support such applications as placing a high-priority TE LSP in a
crowded network such that it preempts as few other TE LSPs as
possible (also known as the "minimal perturbation" problem). Note
that preempting based on the minimum number of links might not result
in the smallest number of TE LSPs being disrupted. Another
application concerns the construction and maintenance of a Virtual
Network Topology [MLN]. It is also helpful to understand which other
TE LSPs exist in the network in order to decide how to manage the
forward adjacencies that exist or need to be set up. The cost-
benefit of stateful PCE computation would be helpful to determine if
the benefit in path computation is sufficient to offset the
additional drain on the network and computational resources.

And in this comment in BGL's implementation of Parallel Brandes' Algorithm:

// Mark forward adjacencies in DAG of shortest paths

I think that neighbors is the more fundamental concept and the set of outgoing edges in a multigraph is the secondary concept with a representation that depends on the implementation details of the multigraph.

@sbromberger
Copy link
Owner Author

sbromberger commented Jun 23, 2017

So actually, I just took a look at the code, and I realize that I misspoke yesterday. fadj and badj are not in interface.jl. They're not even exported from simplegraphs, just used as accessors. Therefore, the correct contract is already with in_neighbors and out_neighbors.

So I'll check off that first idea as "already done" - there is no need for every graph type to support fadj and badj. Just out_neighbors and in_neighbors, but with the caveat that they should not be altered (they should be treated as immutable).

I would still appreciate some discussion on the remaining issues.

@nickeubank
Copy link
Contributor

First, a propos #642, what exactly is distinction between the API and core? I assume by API you mean "if you make a custom graph Type, it must offer those functions". Does that imply that in moving into the core, we moving these functions inside SimpleGraphs, and people making their own Types don't have to offer those functions?

(If I follow that right, then) I think those basic modifications are so core to network analysis they belong in the API. I suppose one can measure descriptives without them, but things like clique percolation, robustness measures, and things like Louvain algorithm all require graph modifications. Never mind just building up graphs piecemeal.

@jpfairbanks
Copy link
Contributor

Yeah the idea is that the interface is the minimal amount of functionality your type must provide in order to get the rest of the library functions to work on your type. There can be additional functions that you have to get in order to achieve optimal performance, but we don't really have a concrete plan for that yet.

@sbromberger
Copy link
Owner Author

sbromberger commented Jul 11, 2017

To elaborate a bit further, this is how I think about it as a three-tiered system:

API - stuff that must be implemented in order for basic functionality tests to pass - these functions are the building blocks for everything else. These include accessors for the graph structures.

Core - functions that should be implemented (and/or optimized) in order to enable more complex features. Things like indegree and outdegree are included here. These functions typically use API directly as their building blocks.

Everything else - functions that use core or API to do some other stuff, and that don't necessarily serve as building blocks themselves. Examples would be the centrality measures, and a counter-example to the latter criterion would be dijkstra_shortest_paths.

@jpfairbanks
Copy link
Contributor

Yes the Core functionality should be stuff that is defined in terms of the API but might need to be reimplemented by a type to improve performance. and the Everything else should all be completely oblivious to the fact that there are multiple types of graphs underneath.

So things like merge would be Core because it is not necessary that every graph type support efficient merging of vertices, but community detection is in the Everything else category building on Core operations. If your graph type doesn't have an efficient merge, then it won't have an efficient community detection either, but that is ok, we need to either find a way to implement merge efficiently or use a different data structure.

@sbromberger
Copy link
Owner Author

So things like merge would be Core

I agree with this (and realize it's a change from my original position of having this in operators).

@nickeubank
Copy link
Contributor

@sbromberger that's a very helpful summary, thanks. The README.md has a single category called "Core API" so I was getting a little confused.

In that case, my inclination is to still have the constructors/modifiers as API, but I understand if they get shifted to core.

@sbromberger
Copy link
Owner Author

The key for me is this: if we make it part of the API, then we're saying that every single graph type must implement this function. This includes immutable graphs and ones that might be backended by an external datastore. Immutable graphs can't / won't have any modifiers by definition, which is why I think these belong in core.

You're 100% right about the README being ambiguous. We need to fix that.

@nickeubank
Copy link
Contributor

nickeubank commented Jul 12, 2017 via email

@sbromberger
Copy link
Owner Author

sbromberger commented Nov 21, 2017

Coming back to this. I don't think I like in_degree replacing indegree since we have indegree_centrality and in_degree_centrality seems awkward.

I think I'd prefer renaming in_neighbors and out_neighbors to inneighbors and outneighbors for consistency, or maybe just making neighbors(...; dir=:in) instead. (Though this might have performance implications.)

@jpfairbanks - thoughts?

@sbromberger sbromberger added this to the 1.0 milestone Nov 21, 2017
@jpfairbanks
Copy link
Contributor

I would prefer neighbors(g, v, :in) as long as it is fast enough. inneighbors has 1 too many ns and a keyword arg is a pain. For a real C style api neighbors(g, v, DIR_IN) neighbors(g, v, DIR_OUT)

@sbromberger
Copy link
Owner Author

sbromberger commented Jan 24, 2018

Here's my latest idea (hat tip @ararslan):

neighbors(g, v, ::Val{:in}) = ...
neighbors(g, v, ::Val{:out}) = ...
neighbors(g, v, ::Val{:both}) = ...
neighbors(g, v, d::Symbol=:out) = neighbors(g, v, Val(d))

similarly,

degree(g, v, ::Val{:in}) = ...
degree(g, v, ::Val{:out}) = ...
degree(g, v, ::Val{:both}) = ...
degree(g, v, d::Symbol=:out) = degree(g, v, Val(d))

@jpfairbanks - thoughts?

(see also JuliaLang/julia#25715)

@jpfairbanks
Copy link
Contributor

I like it!

@matbesancon
Copy link
Contributor

other possibility is using empty structs

abstract type DegreeDirection end
struct InDegree <: DegreeDirection end
...
degree(g, v, ::InDegree) = ...
...

This would come to the same design as maximum_flow which takes a FlowAlgorithm argument
I think the compile-time dispatch is still guaranteed. Which do you think would fit best for these?

@sbromberger
Copy link
Owner Author

sbromberger commented Jan 24, 2018

This would come to the same design as maximum_flow which takes a FlowAlgorithm argument

Yes. The history behind using structs for dispatch in LightGraphs is twofold: first, Graphs.jl used them, and a lot of our code came from that package. second, Val wasn't an option.

On the flip side: where we're using empty structs, we should evaluate whether Val could be used instead. GraphIO.jl could benefit.

@sbromberger
Copy link
Owner Author

degree(g, v, ::InDegree) = ...

This bothers me, because it's redundant, and we'd need to have InNeighbors, and In* for other things.

@jpfairbanks
Copy link
Contributor

This is a problem for all functions that currently take a neighborfunction those functions currently would require you to make an anonymous function. Or "lift" the :in/:out vals up to their argument lists.

@sbromberger
Copy link
Owner Author

This is a problem for all functions that currently take a neighborfunction those functions currently would require you to make an anonymous function.

This is a key point. We do this throughout the code for the degree functions. Alright, I think we're back to just renaming them in_degree and out_degree.

@jpfairbanks
Copy link
Contributor

Yeah, and thinking for metagraphs you might have weighted_degree property_degree` which count the degree according to some other definition than the number of edges such as sum of edge weights or number of edges that satisfy some predicate.

@sbromberger
Copy link
Owner Author

I pulled 0.13.0 because I'd really like to get this in first.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
decision requires a decision to be made on appropriate action discussion requires in-depth discussion - participation encouraged
Projects
None yet
Development

No branches or pull requests

4 participants