BFS in *.Algorithm modules #217
Thanks for the quick reply and no worries if it takes a while to get around to reviewing the changes! I also incorporated tests for the bfs procedures. Again, I went off the respective dfs ones, but ensured the behavior is different where it should be. I also included the cases from the discussions in PR #185, to indicate things are working as expected.
I ran some benchmarks with criterion on Knuth's words graph from the Stanford GraphBase, which has 34027 edges and 5757 vertices. https://gist.github.com/jitwit/e19c2bc29126782831531755523d919b I have two implementations of bfs right now, one using a The first group of benchmarks is done on the IntMap/Sets are quite fast, so the current dfs implementation which uses them actually does better than converting to the KL representation then running Data.Graph's dfs (~20ms vs ~30ms). The bfs implementation builds a bfs forest almost as fast as For, The implementation using I build up all the graphs and run
Compares well against fgl too! https://gist.github.com/jitwit/0c2d136cadc7a32f090fb98dda338bab
Hi @snowleopard. Does the bfs implementation look reasonable? It uses Data.Tree's monadic unfold to build the forest. I've added tests and documentation with complexity analysis. This implementation seems to perform better than fgl's, according to some criterion benchmarks I ran (see gists above). Thanks!
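The approach described here (a monadic breadth-first unfold from Data.Tree, with a set of visited vertices as state) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: it assumes a plain `Map a (Set a)` as a hypothetical stand-in for an adjacency map, and the names `Graph`, `discover`, and `bfsForest` are made up for the example. It relies on `unfoldForestM_BF` from containers and `State` from transformers.

```haskell
import Control.Monad (filterM)
import Control.Monad.Trans.State.Strict (State, evalState, state)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Set (Set)
import qualified Data.Set as Set
import Data.Tree (Forest, Tree (..), unfoldForestM_BF)

-- | Hypothetical stand-in for an adjacency map: vertex -> set of successors.
type Graph a = Map a (Set a)

-- | Mark a vertex as discovered. Returns True only the first time it is seen.
discover :: Ord a => a -> State (Set a) Bool
discover v = state $ \seen ->
  if Set.member v seen then (False, seen) else (True, Set.insert v seen)

-- | Breadth-first forest rooted at the given vertices (roots missing from
-- the graph are skipped). Vertices are marked when they are enqueued, so
-- each vertex appears in the forest at most once.
bfsForest :: Ord a => [a] -> Graph a -> Forest a
bfsForest roots g =
  evalState (filterM discover present >>= unfoldForestM_BF walk) Set.empty
  where
    present = filter (`Map.member` g) roots
    -- Expand one vertex: its children are the successors not yet discovered.
    walk v = do
      let successors = Set.toAscList (Map.findWithDefault Set.empty v g)
      children <- filterM discover successors
      pure (v, children)
```

For a graph with edges 1->2, 1->3, 2->4, 3->5 and 4->5, `bfsForest [1]` places 5 under 3 rather than under 4, because 3 is expanded before 4 in breadth-first order.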
@jitwit Many thanks for finishing off the PR and for benchmarking! Apologies once again for being so slow. I like your implementation, but I left a few minor comments -- please have a look. If I understand your benchmarks correctly, your BFS implementation is much faster than the current DFS one. In this case, shall we switch
In a few words: Using bfs for reachable is probably a good idea. Also, I think it may be possible to improve depth first search, but the current attempt gets mixed results!

In many words: Yeah that seems to be the case. I think it's more accurate to say that new bfs is much faster for I ran yet more benches, and also tried reachable with bfs. For the example graph (n = 5757, m = 34027), bfs-reachable seems to be ~3x faster than the current reachable on

I think that breadth first search is performing better for two reasons. One is that once a node is enqueued, it can be excluded from further consideration, whereas with depth first, queued nodes are not necessarily in the dfs forest. The second reason is avoiding the overhead of converting representations.

The first reason makes the attempt at re-implementing depth first search on AdjacencyMap a bit awkward, using Data.Tree's unfoldTreeM. I haven't come up with a better solution than returning a According to previous benchmarks, that dfs attempt was faster for a good portion of graphs, but not all. E.g. K_1000 got horrible results. https://gist.github.com/jitwit/80447786c00a43d757e8d8de26549b8f
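The first reason above (a vertex can be excluded as soon as it is enqueued) is easiest to see with an explicit queue. The following is a hedged sketch of a BFS-based reachability function, not the implementation benchmarked in this thread: `Graph` is a hypothetical `Map a (Set a)` stand-in, `bfsReachable` is an illustrative name, and the queue is a `Data.Sequence` from containers.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import qualified Data.Sequence as Seq
import Data.Set (Set)
import qualified Data.Set as Set

-- | Hypothetical stand-in for an adjacency map: vertex -> set of successors.
type Graph a = Map a (Set a)

-- | Vertices reachable from the source, in breadth-first order.
-- Returns the empty list if the source is not a vertex of the graph.
bfsReachable :: Ord a => a -> Graph a -> [a]
bfsReachable s g
  | Map.notMember s g = []
  | otherwise = go (Set.singleton s) (Seq.singleton s)
  where
    go seen queue = case Seq.viewl queue of
      Seq.EmptyL -> []
      v Seq.:< rest ->
        let successors = Map.findWithDefault Set.empty v g
            -- Only successors never enqueued before join the queue; they
            -- are added to 'seen' right away, so no vertex is queued twice.
            fresh  = Set.difference successors seen
            seen'  = Set.union seen fresh
            queue' = rest Seq.>< Seq.fromList (Set.toAscList fresh)
        in v : go seen' queue'
```

Because `seen` is updated at enqueue time, the queue never holds more than one copy of a vertex, which is the property the comment contrasts with depth-first search.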
@jitwit Sounds good, let's switch to using BFS for I've also been thinking about making
P.S.: We could have a look at optimising DFS separately. |
Hm! It seems bench suite: https://github.com/haskell-perf/graphs/blob/master/bench/Alga/Graph.hs
@snowleopard I also definitely agree with returning I'll try to think more about optimizing
Maybe I should add a comment in the documentation about
@jitwit Yes, that would be useful! Perhaps you can even add it as an example, using the same bidirectional circuit graph.
@snowleopard Hopefully I've not left any documentation/tests out of sync this time! Also, I hope the documentation changes are reasonable? Wikipedia seems to define level structures only for undirected graphs, with no mention of directed ones, so I can edit the docs if this goes against the standard definition. https://en.wikipedia.org/wiki/Level_structure#cite_note-dps-1
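To make the directed reading of "level structure" concrete, here is a small sketch that groups vertices by their distance from a set of sources, following edges in their given direction. It is an assumption-laden illustration rather than the documented API: `Graph` is a hypothetical `Map a (Set a)` stand-in, `bfsLevels` is an invented name, and each level is listed in ascending vertex order rather than in discovery order.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Set (Set)
import qualified Data.Set as Set

-- | Hypothetical stand-in for an adjacency map: vertex -> set of successors.
type Graph a = Map a (Set a)

-- | Level i holds the vertices whose shortest directed path from any
-- source has exactly i edges. Sources missing from the graph are skipped.
bfsLevels :: Ord a => [a] -> Graph a -> [[a]]
bfsLevels roots g = go start start
  where
    start = Set.fromList (filter (`Map.member` g) roots)
    go seen frontier
      | Set.null frontier = []
      | otherwise =
          let successors =
                Set.unions [ Map.findWithDefault Set.empty v g
                           | v <- Set.toList frontier ]
              -- The next level is everything one step away and not seen yet.
              next = successors `Set.difference` seen
          in Set.toAscList frontier : go (Set.union seen next) next
```

On a bidirectional circuit over the vertices 1 to 4, `bfsLevels [1]` gives `[[1],[2,4],[3]]`, whereas on the one-way circuit 1->2->3->4->1 it gives `[[1],[2],[3],[4]]`, which shows how edge direction changes the levels.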
@jitwit The performance suite has been fixed (thanks @nobrakal!) and now we have the following improvement (https://travis-ci.org/snowleopard/alga/jobs/558388854):
That's pretty good! Note that this is benchmarking
Cool! Nice to see the benefits play out
@jitwit I think it's pretty much ready for merge, but see a couple more comments. P.S.: Please add your name to
@snowleopard Nice! Also, fingers crossed I didn't get any tests/docs out of sync. I tried to look carefully!
@jitwit Merged, thank you! If you are not fed up with the long review times, it would be great if you could look into improving DFS algorithms, as discussed above :)
@snowleopard nice! haha, definitely, I'll start trying some ideas for DFS
First draft. Took a stab at adding breadth-first search. I went off the API for the dfs procedures and used sets from containers to keep track of state. The unfoldTreeM_BF uses a sequence under the hood for the queue.
If this looks ok, I take it the next steps would be to add tests and documentation? Otherwise, at least in the case of AdjacencyIntMaps, perhaps it would be worth using Arrays (as Data.Graph does)?
Joseph