Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deepcopy prob.noise #496

Merged
merged 18 commits into from
Sep 17, 2022
Merged

deepcopy prob.noise #496

merged 18 commits into from
Sep 17, 2022

Conversation

rmsrosa
Copy link
Contributor

@rmsrosa rmsrosa commented Aug 18, 2022

This fixes #412 .

I noticed deepcopy(prob.noise) is also used at https://github.com/SciML/DiffEqNoiseProcess.jl/blob/master/src/solve.jl#L7

@ChrisRackauckas
Copy link
Member

If you do this, then you cannot solve with the same noise. That would be a bug. It's documented behavior that if you choose a noise and you solve with it, you solve with that noise. If you choose a noise and solve with it twice, you get the solution with that noise twice.

@rmsrosa
Copy link
Contributor Author

rmsrosa commented Aug 20, 2022

  1. Currently, solve mutates a noise process Z given in prob = SDEProblem(..., noise = Z). I wouldn't expect that (solve does not have an exclamation point at the end).

  2. Even so, it does not generate the same path. Internally, the noise is set to W = prob.noise, which is set to Z, and then W gets emptied before being filled with a new run. Only if the same seed and the same time step are used is that the same noise path is (re)generated. Otherwise, we will get a new noise path W and hence a different Z. Even if we use the same seed but with a different time step, the path won't be the same, because bridge is not used.

  3. As I see it currently, the right way to get the same noise path is to use NoiseWrapper, as documented. If we don't use that, I assume we use the same noise process, but not necessarily the same sample path.

Either way, I see a bug to fix. But my opinion is that it is reasonable not to mutate Z given in noise = Z (hence use W = deepcopy(prob.noise) internally) and assume that in this case we use the same noise process but a different sample path. If one wants to use the same sample path, then one should use NoiseWrapper. If, however, you really want the same noise path from noise = Z, then internally one should not empty W and check whether W has more than the starting value and, in that case, use bridge to generate the noise at any new time instant. In that case, I don't know what is the point of NoiseWrapper.

@rmsrosa
Copy link
Contributor Author

rmsrosa commented Aug 23, 2022

I don't see any error in CI / test (Interface, 1). Maybe it timed out?

As for Interface2 failing in 1 and 1.6, yeah, with deepcopy(prob.noise), reset=false has no effect and https://github.com/SciML/StochasticDiffEq.jl/blob/master/test/scalar_noise.jl#L20 fails, with sol being equal to solve(prob,SRI()).

@rmsrosa rmsrosa marked this pull request as draft August 24, 2022 13:17
src/solve.jl Outdated Show resolved Hide resolved
@rmsrosa
Copy link
Contributor Author

rmsrosa commented Aug 28, 2022

Ok, this is the best I could think of. I hope deepcopying just integrator.sol.W, instead of the whole integrator or integrator.sol is a reasonable solution. I will leave it like that.

@rmsrosa rmsrosa marked this pull request as ready for review August 28, 2022 18:55
@ChrisRackauckas
Copy link
Member

We should add a keyword argument to the interface for alias_noise to turn this off and return to the previous behavior. Also, the noise tutorials which rely on this need to get updated.

It would be good to instead define copy overloads for the NoiseProcess types though, and directly use those. Deepcopy is usually very heavyhanded.

When true, the default, it deepcopies the noise from and back to prob
@rmsrosa
Copy link
Contributor Author

rmsrosa commented Sep 13, 2022

Good idea about the keyword argument alias_noise. I just implemented it, but I still plan to add some tests.

As for overloading copy for the noise types, it is not clear to me yet how to do it properly, because the types may be nested structs.

@ChrisRackauckas
Copy link
Member

Then let's go with deepcopy for now, and open an issue in DiffEqNoiseProcess about implementing a copy method.

@rmsrosa
Copy link
Contributor Author

rmsrosa commented Sep 13, 2022

Oh, I need to add alias_noise to the list of allowedkeywords over at DiffEqBase.jl, right? I'll do that.

@ChrisRackauckas
Copy link
Member

Yes

src/solve.jl Outdated Show resolved Hide resolved
@rmsrosa rmsrosa marked this pull request as ready for review September 15, 2022 20:54
@ChrisRackauckas
Copy link
Member

Not sure about that test failure.

@rmsrosa
Copy link
Contributor Author

rmsrosa commented Sep 16, 2022

The error in the Interface 2 test might have been "bad luck". The test file tau_leaping.jl doesn't reset the seed so the error Evaluated: 729.43435 ≈ 721.9787 (rtol=0.01) was probably just a matter of chance.

As for the errors in Interface 1 (1.0 and 1.6), I believe they have been happening since before I first started to contribute. The error report doesn't say much. In my machine, those tests just hang.

@ChrisRackauckas
Copy link
Member

Interface 2 was just bad luck: there are a few random seeds in there that can cause that.

Interface 1, I don't know of that issue from before? Spawn null PR and see if master fails with that. If it does, we need to track that down.

@rmsrosa
Copy link
Contributor Author

rmsrosa commented Sep 16, 2022

Indeed, it was not there. I probably got confused. Looking at the recent commit history, the only other place I saw the same error was on an innocent Project update commit on Aug 24 and the error was only on Interface 1, not 1.6. Could have been just bad luck too?

@ChrisRackauckas
Copy link
Member

This kind of kill seems scary. Retry? One way it could happen is if the CI machine goes OOM. Since the CI machines are shared with many other projects (Github actions just gives you a few cores), another project can cause the machine to OOM, so it can randomly happen (with a higher likelihood the more memory you take).

@rmsrosa
Copy link
Contributor Author

rmsrosa commented Sep 16, 2022

Yeah, I didn't know the details, but that is what I imagined, that the system would hang for some reason. Retrying is a good idea. And I imagined the codecov decline is due to some tests failing and not covering what it used to?

@rmsrosa rmsrosa marked this pull request as draft September 16, 2022 12:49
@rmsrosa rmsrosa marked this pull request as ready for review September 16, 2022 12:49
@rmsrosa
Copy link
Contributor Author

rmsrosa commented Sep 16, 2022

Do I need to make a silly commit to retrigger it or do you have other means?

@ChrisRackauckas
Copy link
Member

Fix something in the README 😅

@rmsrosa
Copy link
Contributor Author

rmsrosa commented Sep 16, 2022

No broken link in README so I edited some random things 😅

That same error always takes forever on my machine and I just usually kill it. Probably the CI choke at it this time as well. Hopefully it goes thru this time.

@rmsrosa
Copy link
Contributor Author

rmsrosa commented Sep 16, 2022

Hmm, that is suspicious. I ran master locally and it all went fine. But it is hanging on this branch. Maybe GC is not collecting all the copies in that sequence of solves.

@rmsrosa
Copy link
Contributor Author

rmsrosa commented Sep 16, 2022

Yup, I used alias_noise=false in the reversal_test.jl solves and the tests were successful. Edit: local tests on my computer, I mean.

@ChrisRackauckas
Copy link
Member

Force a GC call somewhere in the tests?

@rmsrosa
Copy link
Contributor Author

rmsrosa commented Sep 17, 2022

GC did it! Tests passed locally. Let's wait for the CI.

@rmsrosa
Copy link
Contributor Author

rmsrosa commented Sep 17, 2022

Ok, I had to move the GC inside the inner loop, but that worked, at least.

@ChrisRackauckas ChrisRackauckas merged commit 801861d into SciML:master Sep 17, 2022
@rmsrosa rmsrosa deleted the noise_hygiene branch September 17, 2022 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Same driving Wiener noise is displayed/stored but different trajectories are simulated
3 participants