[RFC] Graph API
Romain Dorgueil edited this page Oct 8, 2017 · 1 revision
| Subject: | Graph API |
|---|---|
| Authors: | Michael Copeland, Romain Dorgueil |
| Created: | May 25, 2017 |
| Modified: | Jul 6, 2017 |
| Target: | 1.0 |
| Status: | Draft |
The purpose of this page is to generate ideas around the implementation of the Graph API. Comments are welcome!
Comments:
- @jelloslinger
  - thumbs up on the current API (seems that is all you need for a minimum viable product)
  - of all the suggestions below, in favor of Future Proposal #1 (operators)
  - any convenience notation/operators/methods should use the current API under the hood. Most (if not all) features of the current API would need to be expressible.
  - not sure if I'm a fan of using inheritance at this point
- @hartym
  - we need something that allows graphs to be defined in a logical order. The API should be intuitive to developers familiar with ETL and some programming.
  - As an example in the current API, you need to explicitly specify which chain to "end first" if multiple chains are specified at input. The new proposal should make this more apparent.
The Graph API is very minimalist for now, and that was a deliberate choice: better not to implement too much before we know exactly what we want. Now that the standard library is starting to fill out, let's think about how we can enhance this API.
- Better developer experience.
- Define graphs with forks and joins in a few different ways:
  - Graph subclass?
  - «bubble»
  - Factory
- There should be a way to reference any point in the graph, and it should be possible to have more than one node containing the same value.
- There should be a way to extend a graph, either by inserting new nodes, removing some nodes, or overriding some nodes.
- Graph visualization (using graphviz).
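As a rough illustration of the visualization goal, the sketch below turns a plain list of edges into graphviz DOT source. The `to_dot` helper and the string node names are assumptions made for this example only; nothing here is part of Bonobo, and a real implementation would walk the graph's own data structures instead.

```python
def to_dot(edges, rankdir="LR"):
    """Render (source, target) name pairs as graphviz DOT source."""
    lines = ["digraph G {", f"  rankdir = {rankdir};"]
    for source, target in edges:
        lines.append(f"  {source} -> {target};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical edge list: one trunk (A -> B) forking into two branches.
edges = [("A", "B"), ("B", "C"), ("C", "C1"), ("B", "D"), ("D", "D1")]
dot_source = to_dot(edges)
print(dot_source)
```

The resulting text can be fed to the `dot` command line tool to produce an image.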
digraph G {
rankdir = LR;
A -> B;
B -> C -> C1;
B -> D -> D1;
}
# Current API
graph = bonobo.Graph()
graph.add_chain(A, B)
graph.add_chain(C, C1, _input=B)
graph.add_chain(D, D1, _input=B)
# Future proposal 1, using operators
graph = bonobo.Graph()
graph += (A, B)
graph[B] += (C, C1)
graph[B] += (D, D1)
# ... or using inheritance
class MyGraph(bonobo.Graph):
    def setup(self):  # setup? init? how to pass arguments?
        self += (A, B)
        self[B] += (C, C1)
        self[B] += (D, D1)
# Future proposal 2:
graph = bonobo.Graph()
graph.append(A, B) # implicit "BEGIN"
# ...or graph[BEGIN].append ?
graph[B].append(C, C1)
graph[B].append(D, D1)
# pro: this is "list-like"
# con: the same node can't appear twice in the graph, but maybe we can overcome that
# with some way to specify which occurrence is meant when there is ambiguity?
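One way the operator notation of Future proposal 1 could sit on top of an `add_chain`-style method, as suggested in the comments, is sketched below. `SketchGraph` and `_Cursor` are hypothetical stand-ins written for this example (they only record edges, and strings play the role of nodes); this is not how `bonobo.Graph` is implemented. One detail worth noting: Python evaluates `graph[B] += chain` as a `__getitem__`, then an `__iadd__` on the result, then a `__setitem__`, so the graph needs a `__setitem__` even if it does nothing.

```python
class SketchGraph:
    """Toy stand-in for a graph; it only records edges between nodes."""

    def __init__(self):
        # (source, target) pairs; a source of None stands for the implicit BEGIN.
        self.edges = []

    def add_chain(self, *nodes, _input=None):
        # Same shape as the current API: link nodes left to right from _input.
        previous = _input
        for node in nodes:
            self.edges.append((previous, node))
            previous = node
        return self

    def __iadd__(self, nodes):
        # graph += (A, B)  is sugar for  graph.add_chain(A, B)
        return self.add_chain(*nodes)

    def __getitem__(self, node):
        # graph[B] returns a cursor that remembers where to attach.
        return _Cursor(self, node)

    def __setitem__(self, node, cursor):
        # "graph[B] += chain" finishes with "graph[B] = cursor"; nothing to do.
        pass


class _Cursor:
    def __init__(self, graph, node):
        self.graph = graph
        self.node = node

    def __iadd__(self, nodes):
        # graph[B] += (C, C1)  is sugar for  graph.add_chain(C, C1, _input=B)
        self.graph.add_chain(*nodes, _input=self.node)
        return self


graph = SketchGraph()
graph += ("A", "B")
graph["B"] += ("C", "C1")
graph["B"] += ("D", "D1")
print(graph.edges)
```

Because every operator delegates to `add_chain`, anything expressible with the operators stays expressible with the explicit call.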
digraph G {
rankdir = LR;
A1 -> A2 -> C;
B1 -> B2 -> C;
C -> D;
}
graph = bonobo.Graph()
graph.add_chain(C, D, _input=None, _name='trunk')
graph.add_chain(A1, A2, _output='trunk')
graph.add_chain(B1, B2, _output='trunk')
graph = bonobo.Graph()
graph += (A1, A2)
graph += (B1, B2)
graph[(A1, A2)] += (C, D) # ???
digraph G {
rankdir = LR;
A -> B;
B -> C1 -> C2 -> F;
B -> D1 -> D2 -> F;
B -> E1 -> E2 -> F;
F -> G;
}
graph = bonobo.Graph()
graph.add_chain(A, B)
graph.add_chain(F, G, _input=None, _name='trunk')
graph.add_chain(C1, C2, _input=B, _output='trunk')
graph.add_chain(D1, D2, _input=B, _output='trunk')
graph.add_chain(E1, E2, _input=B, _output='trunk')
graph = bonobo.Graph()
graph += (A, B)
graph[B] += (C1, C2)
graph[B] += (D1, D2)
graph[B] += (E1, E2)
graph[(C2, D2, E2)] += (F, G) # ???
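The `# ???` lines above leave open what a tuple key should mean. One possible reading, sketched below with a hypothetical `JoinGraph` (a toy that records edges, with strings as nodes, not Bonobo code), is that `graph[(C2, D2, E2)]` selects several tail nodes that all feed the first node of the appended chain. Under that reading the earlier `graph[(A1, A2)] += (C, D)` would link both `A1` and `A2` to `C`, so the two-source example would have to be written with the tails, `graph[(A2, B2)]`, instead. This reading also assumes that a node is never itself a tuple.

```python
class JoinGraph:
    """Toy graph where a tuple key means 'join these nodes'."""

    def __init__(self):
        self.edges = []  # (source, target) pairs; None stands for BEGIN

    def add_chain(self, *nodes, _inputs=(None,)):
        # Every input feeds the first node of the chain...
        for source in _inputs:
            self.edges.append((source, nodes[0]))
        # ...then the chain itself is linked left to right.
        for source, target in zip(nodes, nodes[1:]):
            self.edges.append((source, target))
        return self

    def __iadd__(self, nodes):
        return self.add_chain(*nodes)

    def __getitem__(self, key):
        # A tuple key selects several inputs, anything else a single one.
        inputs = key if isinstance(key, tuple) else (key,)
        return _JoinCursor(self, inputs)

    def __setitem__(self, key, cursor):
        pass  # required by "graph[key] += chain", nothing left to store


class _JoinCursor:
    def __init__(self, graph, inputs):
        self.graph = graph
        self.inputs = inputs

    def __iadd__(self, nodes):
        self.graph.add_chain(*nodes, _inputs=self.inputs)
        return self


graph = JoinGraph()
graph += ("A", "B")
graph["B"] += ("C1", "C2")
graph["B"] += ("D1", "D2")
graph["B"] += ("E1", "E2")
graph[("C2", "D2", "E2")] += ("F", "G")
print(graph.edges)
```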
gf = GraphFactory()
gf |= foo | bar | baz
gf[foo] |= a | b | c
graph = gf()
Which operator? Pipes or plus; maybe also some way to have a "pillar" of transformations.
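A minimal sketch of how the pipe-based factory could work follows. One constraint it makes visible: in `gf |= foo | bar | baz`, Python evaluates `foo | bar | baz` before the factory sees anything, so the transformations themselves must support `|`; plain functions do not. The `Step`, `Pipeline`, `_Branch` classes and this `GraphFactory` are all hypothetical names invented for the example, and calling the factory returns a plain edge list rather than a real graph.

```python
class Step:
    """Hypothetical wrapper that makes a transformation composable with |."""

    def __init__(self, name):
        self.name = name

    def __or__(self, other):
        return Pipeline([self]) | other


class Pipeline:
    def __init__(self, steps):
        self.steps = list(steps)

    def __or__(self, other):
        extra = other.steps if isinstance(other, Pipeline) else [other]
        return Pipeline(self.steps + extra)


def _steps(value):
    # Accept either a whole pipeline or a single step.
    return value.steps if isinstance(value, Pipeline) else [value]


class GraphFactory:
    def __init__(self):
        self.chains = []  # (input step or None, [steps]) in declaration order

    def __ior__(self, pipeline):
        self.chains.append((None, _steps(pipeline)))
        return self

    def __getitem__(self, step):
        return _Branch(self, step)

    def __setitem__(self, step, branch):
        pass  # required by "gf[step] |= pipeline", nothing left to store

    def __call__(self):
        # Build the result; here simply a list of (source, target) names.
        edges = []
        for start, steps in self.chains:
            previous = start
            for step in steps:
                source = previous.name if previous is not None else None
                edges.append((source, step.name))
                previous = step
        return edges


class _Branch:
    def __init__(self, factory, step):
        self.factory = factory
        self.step = step

    def __ior__(self, pipeline):
        self.factory.chains.append((self.step, _steps(pipeline)))
        return self


foo, bar, baz = Step("foo"), Step("bar"), Step("baz")
a, b, c = Step("a"), Step("b"), Step("c")

gf = GraphFactory()
gf |= foo | bar | baz
gf[foo] |= a | b | c
graph = gf()
print(graph)
```

Whether wrapping every transformation is acceptable is a design question in itself; the `+=` notation with tuples avoids it because the tuple is built without calling anything on the nodes.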