-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compute and process Differentiation Request graph #873
Conversation
@vgvassilev One improvement that we can make in this PR itself is to fix the printing order of generated functions; this can be done because we have dynamically created the graph during traversals. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #873 +/- ##
=======================================
Coverage 94.81% 94.81%
=======================================
Files 51 52 +1
Lines 7620 7701 +81
=======================================
+ Hits 7225 7302 +77
- Misses 395 399 +4
... and 2 files with indirect coverage changes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
There were too many comments to post at once. Showing the first 10 out of 18. Check the log or trigger a new build to see more.
I think so, is there a catch? |
The only issue I see is that diagnostic messages won't align with the sequence of generated functions in the code dump. So, maybe it's a little bit tough for debugging. |
Hm... I am confused - how is it different from now? |
Currently, if the order of differentiation is f1 followed by f2. Then, any diagnostic warning or error in f2 will appear after f1 has been dumped/printed and before f2 is dumped. |
Let's have this improvement in a separate PR and then we can discuss further. What do you think? |
Sounds good 👍🏼 |
153c3a1
to
66beec3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
m_DerivativeBuilder.reset( | ||
new DerivativeBuilder(S, *this, m_DFC, m_DiffRequestGraph)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: use std::make_unique instead [modernize-make-unique]
m_DerivativeBuilder.reset( | |
new DerivativeBuilder(S, *this, m_DFC, m_DiffRequestGraph)); | |
m_DerivativeBuilder = std::make_unique<DerivativeBuilder>( | |
S, *this, m_DFC, m_DiffRequestGraph); |
@@ -122,11 +122,13 @@ | |||
Sema& S = m_CI.getSema(); | |||
|
|||
if (!m_DerivativeBuilder) | |||
m_DerivativeBuilder.reset(new DerivativeBuilder(S, *this, m_DFC)); | |||
m_DerivativeBuilder.reset( | |||
new DerivativeBuilder(S, *this, m_DFC, m_DiffRequestGraph)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: initializing non-owner argument of type 'std::unique_ptrclad::DerivativeBuilder::pointer' (aka 'clad::DerivativeBuilder *') with a newly created 'gsl::owner<>' [cppcoreguidelines-owning-memory]
new DerivativeBuilder(S, *this, m_DFC, m_DiffRequestGraph));
^
66beec3
to
b2ed85b
Compare
lib/Differentiator/DiffPlanner.cpp
Outdated
@@ -684,7 +685,7 @@ namespace clad { | |||
llvm::SaveAndRestore<const FunctionDecl*> saveTopMost = m_TopMostFD; | |||
m_TopMostFD = FD; | |||
TraverseDecl(derivedFD); | |||
m_DiffPlans.push_back(std::move(request)); | |||
m_DiffRequestGraph.addNode(request, true /*isSource*/); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_DiffRequestGraph.addNode(request, true /*isSource*/); | |
m_DiffRequestGraph.addNode(request, /*isSource=*/true); |
There must be a clang-tidy check, why it did not warn here - can you check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even after a lot of trials, I can't figure it out. I tried this option with multiple configurations of its options: https://clang.llvm.org/extra/clang-tidy/checks/bugprone/argument-comment.html.
Still doesn't work, changed it manually for now.
if (!firstDerivative) { | ||
// Derive declaration of the the forward mode derivative. | ||
IndependentArgRequest.DeclarationOnly = true; | ||
firstDerivative = plugin::ProcessDiffRequest(CP, IndependentArgRequest); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR does not get rid of plugin::ProcessDiffRequest
, but do we have a plan to remove it. I think that's a layering violation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason we need this is for the forward declaration of derivative signatures to complete the call expressions. Once we get the graph in the first pass itself (with the help of activity analysis), we can proceed with topologically ordered differentiation (instead of breadth-first as we are doing now), and then this forward declaration will not be required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep track of that discussion as an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have a related issue here: #857, added a comment in it referencing this PR.
1a553a5
to
1ee7b35
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Plan for dynamic graph - The relations between different differentiation requests can be modelled as a graph. For example, if `f_a` calls `f_b`, there will be two differentiation requests `df_a` and `df_b`, the edge between them can be understood as `created_because_of`. This also means that the functions called by the users to be explicitly differentiated (or `DiffRequests` created because of these) are the source nodes, i.e. no incoming edges. In most cases, this graph aligns with the call graph, but in some cases, the graph depends on the internal implementation, like the Hessian computation, which requires creating multiple `fwd_mode` requests followed by a `rev_mode` request. - We can use this graph to order the computation of differentiation requests. This was already being done implicitly in the initial recursive implementation. Whenever we encountered a call expression, we started differentiation of the called function; this was sort of like a depth-first search strategy. - This had problems, as `Clang` reported errors when it encountered a new function scope (of the derivative of the called function) in the middle of the old function scope (of the derivative of the callee function). It treated the nested one like a lambda expression. The issue regarding this: vgvassilev#745. - To fix this, an initial strategy was to eliminate the recursive approach. Hence, a queue-based breadth-first approach was implemented in this PR: vgvassilev#848. - Although it fixed the problem, the graph traversal was still implicit. We needed some way to compute/store the complete graph and possibly optimize it, such as converting edges to model the `requires_derivative_of` relation. Using this, we could proceed with differentiation in a topologically sorted ordering. - It also required one caveat: although we don't differentiate the called function completely in a recursive way, we still need to declare it so that we can have the call expression completed (i.e. `auto t0 = f_pushforward(...)`). - To move towards the final stage of having the complete graph computed before starting the differentiation, we need the complete information on how the `DiffRequest` will be formed inside the visitors (including arguments or `DVI` info). This whole approach will require activity analysis in the first pass. - As an incremental improvement, the first requirement was to implement infrastructure to support explicit modelling of the graph and use that to have a breadth-first traversal (and eventually topological ordering). This is the initial PR for capturing the differentiation plan in a graphical format. However, the traversal order is still breadth-first, as we don't have the complete graph in the first pass - mainly because of a lack of information about the args required for `pushforward` and `pullbacks`. This can be improved with the help of activity analysis to capture the complete graph in the first pass, processing the plan in a topologically sorted manner and pruning the graph for user-defined functions. I started this with this approach, and the initial experimental commit is available here for future reference: 82c0b42.
1ee7b35
to
ed6a89a
Compare
Plan for dynamic graph
The relations between different differentiation requests can be modelled as a graph.
For example, if
f_a
callsf_b
, there will be two differentiation requestsdf_a
anddf_b
, the edge between them can be understood ascreated_because_of
.This also means that the functions called by the users to be explicitly differentiated (or
DiffRequests
created because of these) are the source nodes, i.e. no incoming edges. In most cases, this graph aligns with the call graph, but in some cases, the graph depends on the internal implementation, like the Hessian computation, which requires creating multiplefwd_mode
requests followed by arev_mode
request.We can use this graph to order the computation of differentiation requests. This was already being done implicitly in the initial recursive implementation. Whenever we encountered a call expression, we started differentiation of the
called function; this was sort of like a depth-first search strategy.
Clang
reported errors when it encountered a new function scope (of the derivative of the called function) in the middle of the old function scope (of the derivative of the callee function). It treated the nested one like a lambda expression. The issue regarding this: Possible memory leak when differentiating call expressions in forward mode #745.To fix this, an initial strategy was to eliminate the recursive approach. Hence, a queue-based breadth-first approach was implemented in this PR: Restructure differentiation schedule into a breadth first traversal #848.
requires_derivative_of
relation. Using this, we could proceed with differentiation in a topologically sorted ordering.auto t0 = f_pushforward(...)
).To move towards the final stage of having the complete graph computed before starting the differentiation, we need the complete information on how the
DiffRequest
will be formed inside the visitors (including arguments orDVI
info).This whole approach will require activity analysis in the first pass.
This is the initial PR for capturing the differentiation plan in a graphical format.
However, the traversal order is still breadth-first, as we don't have the complete graph in the first pass - mainly because of a lack of information about the args required for
pushforward
andpullbacks
.This can be improved with the help of activity analysis to capture the complete graph in the first pass, processing the plan in a topologically sorted manner and pruning the graph for user-defined functions. I started this with this approach, and the initial experimental commit is available here for future reference: vaithak@82c0b42.