Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/issue 1071 ode time varying #1072

Merged
merged 35 commits into from
May 10, 2019
Merged

Conversation

wds15
Copy link
Contributor

@wds15 wds15 commented Dec 1, 2018

Summary

This PR achieves two goals:

  1. This PR allow the initial time and the time-vector passed into ODE integrators to be varying.
  2. The decoupling operation is taken out of the coupled_ode_system and instead placed in the coupled_ode_observer function object.

Note that this PR supersedes PR #857 in that this one also allows the initial time to be varying which I think is key in making it a useful thing to have time varying. Moreover, this PR has a different approach by refactoring things a bit. In particular, the decoupling operation is moved into the coupled_ode_observer function object which now handles all relevant cases using operands_and_partials.

A reason to pull out the decoupling operation out of the coupled_ode_system is to prepare for a future integration of a coupled_ode_system for the case whenever the user provides analytic Jacobians. In this case it make sense to have the decoupling operation as a separate module. This isn't the motivation anymore... the code makes more sense to be organized this way from my perspective. The thing is that the sensitivites wrt to time always work the same and the specialization's by the coupled_ode_system of the decoupling operation can easily be integrated in a single function which uses operands_and_partials.

Finally, this refactor makes the code use less temporaries since the coupled states are directly written into the AD tree (instead of all being stored and then moved to the AD tree).

This PR is somewhat larger than #857 one as this one tests the decoupling of the ode observer specifically which was not really tested for as I can see it (the respective decoupling operation).

Tests

The additional tests introduced in PR #857 have been put into this PR as well and have been extended even further to also cover the case for the varying initial time. These additional tests are in

  • test/unit/math/rev/mat/functor/integrate_ode_adams_rev_test.cpp
  • test/unit/math/rev/mat/functor/integrate_ode_bdf_rev_test.cpp
  • test/unit/math/rev/arr/functor/integrate_ode_rk45_test.cpp

The tests of the integrate functions now use a forced harmonic oscillator which involves explicitly time on the ODE RHS. This is to ensure that we get the right time-points aligned (which we do now).

Moreover, the decoupling tests which were part of coupled_ode_systemhave been moved to respective tests for the coupled_ode_observer:

  • test/unit/math/rev/arr/functor/coupled_ode_observer_test.cpp
  • test/unit/math/prim/arr/functor/coupled_ode_observer_test.cpp

Side Effects

I had to pull in the specialization for ops_partial_edge for std::vector<var> from the mat branch in the arr folder.

Decoupling operation is now moved from the coupled_ode_system to the coupled_ode_observer.

The rough flow of the integrate_ode calls changes now slightly:

old behavior:

  1. check user inputs
  2. initialize coupled_ode_system depending on whats needed
  3. initialize an empty double only y_coupled buffer for the coupled outputs
  4. integrate and store things in y_coupled
  5. decouple things and put them on the AD stack

new:

  1. check user inputs
  2. initialize coupled_ode_system depending on whats needed
  3. initialize an empty output vector (which has the type just like the output)
  4. integrate and call for each output directly coupled_ode_observer which decouples the state on the spot and adds it to the AD stack
  5. nothing to be done...just return the final output

Checklist

class ops_partials_edge<double, std::vector<var> > {
public:
typedef std::vector<var> Op;
typedef std::vector<double> partials_t;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This used to be a Eigen type (a vector_d) and I changed it to be a std::vector to make things work with in the arr area of this specialization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just moved it back as otherwise tests fail big time.

@syclik syclik mentioned this pull request Dec 4, 2018
3 tasks
@syclik
Copy link
Member

syclik commented Dec 6, 2018

@wds15, can you describe the overall strategy before I dig in? (I took a quick scan, but it's shifting bits of code to different files in something where I'll need a little more time to review properly.)

Is there a good description / doc of the architecture of the internals of integrate_ode? If not, I think it's about time we put one together because it'll save us time in the future as we continue to work on this.

@wds15
Copy link
Contributor Author

wds15 commented Dec 8, 2018

The main shift in this PR is moving the decoupling operation out of coupled_ode_system as described above. I added a few more notes what that means for the integrate functions themselves.

What is always useful for me to have around is the document here:

ode_system.pdf

The math there is how we coded things (as documented in the coupled_ode_system files)... but the writeup uses matrix notation which makes the index war a little easier to digest (to me).

@wds15 wds15 changed the title Feature/issue 1071 ode time varying [WIP] Feature/issue 1071 ode time varying Dec 11, 2018
@wds15 wds15 changed the title [WIP] Feature/issue 1071 ode time varying Feature/issue 1071 ode time varying Dec 17, 2018
@wds15
Copy link
Contributor Author

wds15 commented Jan 7, 2019

This PR is ready for review. Should it help to have a video chat on this piece then I can jump on a hangout call. I know that this is dense material, but the ODE system is tested decently which should ease the review, hopefully. These changes should be helpful for inference in pharmacokinetic ODE systems with delays (and I am sure there are more applications).

@syclik
Copy link
Member

syclik commented Jan 7, 2019

@wds15, thanks for putting up the link to the ode_system.pdf. That's not exactly what I'm looking for.

It would really help if you described how the different things interacted together. What does coupled_ode_observer do? (There is doc there, but it's really lacking in describing what the thing does.) How does that interact with coupled_ode_system? How do these things work with integrate_ode_*()? That's sort of the high-level description we need somewhere.

@syclik
Copy link
Member

syclik commented Jan 7, 2019

Once I have some sense of what's going on, I'll review it.

@syclik syclik self-requested a review January 7, 2019 16:59
@wds15
Copy link
Contributor Author

wds15 commented Jan 7, 2019

Ah... that's what you are asking for.

  • integrate_ode_*

    1. check user input
    2. initialize coupled_ode_system which represents the ODE RHS for the base system and the ODE RHS for the sensitivity system. The functor gets as input time and the state of the coupled ODE RHS system which is essentially a double only array of the system.
    3. Initialize the specific integrator
    4. integrate and record states at requested solution times with the observer
  • coupled_ode_observer is used to record every state of the coupled ODE system at the time-points at which the solution is requested by the user. For every call of operator() of this functor the double only input array of the coupled ODE system (base ODE solution & sensitivities at time-point t) is translated with the operands_and_partials approach into the stan system of representing function values and their respective gradient wrt to the input operands.

I hope this makes sense now. Let me know if something is unclear or if we should shortly have a hangout.

@syclik
Copy link
Member

syclik commented Jan 22, 2019

@wds15, I think it'd help if we chatted about your changes.

First off, this shouldn't have passed tests -- you're bringing in rev into prim. And mat into arr. We might be able to let that second one slide, but not the first.

Let me know when you have some time.

@charlesm93 charlesm93 self-assigned this Mar 28, 2019
@charlesm93
Copy link
Contributor

Hi, I have time this week to check the doc and do one more pass of the PR. Hopefully we can then merge the PR if everything goes well.

@wds15
Copy link
Contributor Author

wds15 commented May 8, 2019

@charlesm93, @syclik Do you think to have time to finalize this PR at some point? It would be good to have a timeline for this. Otherwise we can either look for other reviewers or we just delete this PR. It's too much resource keep this hanging around.

Tagging @seantalts or @bob-carpenter who may have ideas as well.

@syclik
Copy link
Member

syclik commented May 8, 2019 via email

@wds15 wds15 mentioned this pull request May 8, 2019
5 tasks
@charlesm93
Copy link
Contributor

I'll go through this today and do a full review of the PR. I'll rely on the info in the issue and the summary of the PR (first comment on this thread) to gage if the documentation is sufficient. I'll also look whether the comments in @syclik 's review were addressed.

@syclik
Copy link
Member

syclik commented May 9, 2019 via email

@wds15
Copy link
Contributor Author

wds15 commented May 9, 2019

@charlesm93 great...thanks. I slightly updated the PR description with things I saw being relevant to change. If more should go to the PR description, let me know, before you waste time.

Copy link
Contributor

@charlesm93 charlesm93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good to me. I think the documentation is good, and the code reads well. There are some minor details and questions that should be addressed before merging. I have considered @syclik 's comment while going through the code, so addressing my comments should resolve both reviews.

*
* @param coupled_state solution at the specified time.
* @param t time of solution.
* @param t time of solution. The time must correspond to the ts
* vector, respectively.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite sure what this means. Should it be the elements in time correspond to elements in the ts vector?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, correct. You must call the operator only for those times which are in the vector passed in at construction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I think this would be worth rephrasing a little then.

const std::size_t M_;
const std::size_t index_offset_theta_;
// counter pointing to the element in the ts vector which is
// currently being referred to. Initialized to -1.
int n_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this came up in the PR comments, but it should be justified here too. Why do we initialize n_ at -1? To be used, n_ should at least be 0. Would it be possible to initialize it at 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. It points to the current element which is one before 0 at construction. The last time this came up things were more complicated as i needed to do branching due to different ode solver conventions. This branching was moved into the ode solver functions which is where they belong, i think. Right now i increase n when you enter the function for recording a state. If you like 0 for the init i would need to move that increment to the end of the function. I donˋt mind either convention.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't mind either convention, but I slightly prefer the second option. First, initializing n_ at 0 seems more natural and makes the code clearer. Secondly, for future development, we may add other functions that use n_, and we don't want to impose the requirement that n_ needs to be incremented.

So either move the increment function or add one line of comment in the construct explaining the choice of starting at -1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have a look at what is there now. Hopefully easier to digest than before.

*/
template <typename F, typename T1, typename T2, typename T_t0, typename T_ts>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a suggestion: I like the type names T_t0 and T_ts, more than T1 and T2. I realize numbering types is consistent with what's done in Stan, but in the future, it would be nice to have something like T_init and T_parm, or T_y0 and T_theta.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree...and we both flavors floating around. Older code is usually numbered an newer is not...this also depends on who is writing the code. I am not sure if we have a convetion. I tried to stay consistent with the rest of the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. This doesn't need to be addressed in this PR. I'll bring it up on the discussion forum, and maybe raise an issue on the subject. This applies to a lot of functions in Stan.

check_finite("integrate_ode_rk45", "initial state", y0);
check_finite("integrate_ode_rk45", "initial time", t0);
check_finite("integrate_ode_rk45", "times", ts);
check_finite("integrate_ode_rk45", "initial time", t0_dbl);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me. And yes, this should be faster because we're bringing a smaller object in memory.

@@ -31,7 +32,7 @@ class ops_partials_edge<double, std::vector<var> > {

void dump_partials(double* partials) {
for (int i = 0; i < this->partials_.size(); ++i) {
partials[i] = this->partials_[i];
partials[i] = this->partials_(i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe don't use [] on the L.H.S and () on the R.H.S ^^

I think [] is fine for Eigen structures. Since we know i will not be out of bound, we can use the coeff method, which doesn't check whether i is in bound and is therefore slightly faster.


ASSERT_EQ(T, y.size());
for (int t = 0; t < T; t++)
ASSERT_EQ(2, y[t].size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more consistent with other tests in math, shouldn't we use EXPECT_EQ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apply this comment to all the unit tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this, but GitHub still shows this outdated thing. I just double checked and this is aligned to EXPECT_EQ .

std::stringstream msgs;
std::vector<double> x;
std::vector<int> x_int;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this structure for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's being used in a number of tests. We need that as dummy argument to all the observer instances created.

std::vector<int> x_int;
};

TEST_F(StanMathOde, observe_states_dd) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change the name of the test? There are now 4 input that can be passed as dbl or var. So here, a title such as observe_states_dddd may be more appropriate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me have a look...but i would like to avoid having to change all our test names....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest either using the dddd suffix, or something more specific like all_double. I get what you're doing, but some fresh eyes may not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure... let's do this convention in the observer tests.


// here we only test first & last steps, and rely on the
// fact that results in-between affect the initial
// condition of the last step to check their validity.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's fine, though it shouldn't be too difficult to write the test and loop through all the states to make sure the correct answer is always attained.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our grad tests for the ode solvers essentially work the same way as this one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, just a suggestion. Though I wonder if there was an initial motivation for doing this. Maybe cutting down the time of the unit test.

@charlesm93
Copy link
Contributor

Other note: this is an issue that needs to be addressed separately, not in this PR, but the code for operands_and_partials.hpp should be better documented. Do we have spec docs out there that discuss this new functionality?

@wds15
Copy link
Contributor Author

wds15 commented May 9, 2019

The change to operands and partials from me in this pr is really cosmetic. I am using () to index Eigen structures and [] to index std vectors. This is how it should be, i think. If you think there is more doc needed on that, then that should be a new issue.

Thanks for your eyes on it... i will try to address the rest soon.

@charlesm93
Copy link
Contributor

I would leave operands and partials as it is. We can raise another issue and fix the problem. I really think coeff is the way to go here, and I'm happy to write the PR.

In general, I think it might help to make more incremental changes, though, of course, some balance must be found.

@wds15
Copy link
Contributor Author

wds15 commented May 10, 2019

The comment

This is point is not clear to me. What are the -2 / -1 offsetting? Could you point me to documentation on that?

refers to an outdated piece of code. Maybe GitHub does not propagate updates correctly through? In any case... this -2 / -1 thing is gone now.

@wds15
Copy link
Contributor Author

wds15 commented May 10, 2019

@charlesm93 I hope that I have caught all your comments and addressed them. Please be careful when using the GitHub review feature - it seems that it shows you outdated code.

Maybe we can finish this really soon...it looks like we are close.

@charlesm93
Copy link
Contributor

Great. All the comments in my review have been addressed. We're good to merge.

@charlesm93 charlesm93 dismissed syclik’s stale review May 10, 2019 12:56

The review is out-of-date. The comments in this first review have been addressed when @wds15 went through my new review.

@wds15
Copy link
Contributor Author

wds15 commented May 10, 2019

Awesome! Can't wait to hit the "merge" once Jenkins comes back (hopefully without hickups).

Thanks.

(next we need to make the parser aware of this change)

@wds15 wds15 merged commit ec52b8b into develop May 10, 2019
@wds15 wds15 deleted the feature/issue-1071-ode-time-varying branch June 30, 2019 17:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants