start trying to precompile ModelingToolkit better #1215

Merged
ChrisRackauckas merged 4 commits into master from precompile on Feb 1, 2022

Conversation

@ChrisRackauckas (Member) commented Aug 18, 2021:

using ModelingToolkit, OrdinaryDiffEq

function f()
      @parameters t σ ρ β
      @variables x(t) y(t) z(t)
      D = Differential(t)

      eqs = [D(D(x)) ~ σ*(y-x),
             D(y) ~ x*(ρ-z)-y,
             D(z) ~ x*y - β*z]

      @named sys = ODESystem(eqs)
      sys = structural_simplify(sys)

      u0 = [D(x) => 2.0,
            x => 1.0,
            y => 0.0,
            z => 0.0]

      p  = [σ => 28.0,
            ρ => 10.0,
            β => 8/3]

      tspan = (0.0,100.0)
      prob = ODEProblem(sys,u0,tspan,p,jac=true)
end

using SnoopCompile

tinf = @snoopi_deep f()

Before:
InferenceTimingNode: 8.138765/20.550152 on Core.Compiler.Timings.ROOT() with 821 direct children
InferenceTimingNode: 8.152606/20.643050 on Core.Compiler.Timings.ROOT() with 821 direct children

After:
InferenceTimingNode: 8.216759/17.715817 on Core.Compiler.Timings.ROOT() with 839 direct children
InferenceTimingNode: 8.272943/17.854555 on Core.Compiler.Timings.ROOT() with 840 direct children

only 2 seconds for now, but it's a start.

@ChrisRackauckas (Member, Author):

The symbolic-based libraries are going to be more like the plotting libraries in terms of inference. We'll probably need to dig around for things to @nospecialize.
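
A sketch of what digging for @nospecialize targets might look like; count_symbols below is a hypothetical helper, not an actual ModelingToolkit method:

# `@nospecialize` asks the compiler not to specialize this method on the concrete type
# of `ex`, so a single compiled instance is shared across all the symbolic expression
# types that reach it, rather than compiling a fresh specialization per type.
function count_symbols(@nospecialize(ex))
    ex isa Expr   && return sum(count_symbols, ex.args; init = 0)
    ex isa Symbol && return 1
    return 0
end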

Though curiously, one of the issues is that the profile doesn't always seem to be rooted in the library.

[Screenshot: inference profile capture]

@timholy do you know why that call stack doesn't show what in ModelingToolkit/Symbolics is making the call to broadcast that is then being compiled? A lot of the examples seem "unrooted" here, so it's hard to really pin down the sources right now.

@variables x(t) y(t) z(t)
D = Differential(t)

eqs = [D(D(x)) ~ σ*(y-x) + 0.000000000000135,
@ChrisRackauckas (Member, Author):

Note the trick here: precompiling the runtime-generated function and then calling it leads to an error because it's not in the cache, so I just chose an ODE no one would ever solve, so that we never hit the same hash. Too hacky? 😅

Member:

Oof. Can we add a lot of underscores and a more "random" number?
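
(For reference, Julia ignores underscores in numeric literals, so the constant could be written more conspicuously without changing its value; a quick check:)

julia> 0.000_000_000_000_135 == 0.000000000000135   # underscores are purely visual separators
true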

@timholy commented Aug 19, 2021:

Clever! Like all great hacks. I think it gets the job done.

@timholy commented Aug 19, 2021:

Yeah, this is more like what I'm accustomed to seeing from packages with significant inference issues. It's almost a Makie-like trace 🙂.

@timholy do you know why that call stack doesn't show what in ModelingToolkit/Symbolics is making the call to broadcast that is then being compiled? A lot of the examples seem "unrooted" here, so it's hard to really pin down the sources right now.

First, apologies if I'm repeating stuff you already know.

If a call is dispatched at runtime, it serves as a new root. When that happens, we've set it up so that inference grabs a backtrace() to make it possible to identify the caller (this was a huge step forward, we didn't have that before). The inference_triggers essentially package the caller and the callee together for investigation; ascend(itrig) will give the callee as the first line, and the stacktrace of callers as the remaining lines.
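
For concreteness, a minimal sketch of that workflow (the variable names are just illustrative):

using SnoopCompile

tinf = @snoopi_deep f()            # collect inference timings, as in the OP
itrigs = inference_triggers(tinf)  # runtime-dispatch roots, each paired with its caller
itrig = first(itrigs)              # pick one trigger to investigate

using Cthulhu
ascend(itrig)                      # interactive: callee on the first line, caller stacktrace below it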

That new root might live in a different module than the caller. And you may have an "ownership" problem (the red bars), where a method is acting on types that it doesn't know about (defined in another package that it doesn't load). The solution is to improve inferrability so that you effectively pick up the flames with red bases and place them atop something you do own---then there's a backedge linking them together, and the whole thing will get put in the precompile cache of a package that knows about both the methods and the types.
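
A hedged illustration of what "picking up the flame and placing it atop something you own" can look like (process, get_equations, and simplify_expr are made-up names, not ModelingToolkit functions):

# Made-up stand-ins for the real symbolic machinery:
get_equations(sys) = sys.eqs
simplify_expr(eq) = eq

function process(sys)
    # Without the assertion, an `eqs` field typed as plain `Vector` infers as an abstract
    # container, and the `map` call below gets resolved by runtime dispatch (an unrooted flame).
    eqs = get_equations(sys)::Vector{Any}
    # With a concrete container type, inference resolves this call from inside `process`,
    # creating a backedge so the compiled work can be cached alongside this package's methods.
    return map(simplify_expr, eqs)
end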

The alternative approach is to walk your way up a red-based flame until you get to something that isn't red, and then put a suitable precompile exercise in that package. The one good thing about SnoopCompile.parcel is that it can do that for you (when it's possible at all, which isn't always). But there are many times where that doesn't work very well, or gets the Int size wrong for 32/64 bit, or makes it fragile across Julia versions. So this is where my advice to just fix inference problems comes to the fore.
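
For reference, a minimal sketch of that parcel-based route (the output directory is just a placeholder):

using SnoopCompile

tinf = @snoopi_deep f()
ttot, pcs = SnoopCompile.parcel(tinf)            # group precompile directives by the module that can own them
SnoopCompile.write("/tmp/precompiles_MTK", pcs)  # write precompile(...) statements, one file per module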

There's a demo you can walk through yourself at https://timholy.github.io/SnoopCompile.jl/stable/snoopi_deep_analysis/, essentially a guided-tour exercise in ameliorating inference problems.

Again, this might not really have answered your question, so feel free to keep asking.

@timholy commented Aug 19, 2021:

One thing that might be an easy win: for container types like the one referenced above, can you use Vector{T} instead of just Vector? Even Vector{Any} may be better than Vector:

julia> isconcretetype(Vector)
false

julia> isconcretetype(Vector{Any})
true

Items taken out of the vector will not be inferrable, but at least the container itself is inferrable.
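
A tiny sketch of the difference for a struct field, with made-up type names (not ModelingToolkit's actual types):

julia> struct LooseSystem
           eqs::Vector       # abstract container type: accesses of `sys.eqs` don't infer concretely
       end

julia> struct ConcreteSystem
           eqs::Vector{Any}  # concrete container type: `sys.eqs` infers as Vector{Any}
       end

julia> isconcretetype(fieldtype(LooseSystem, :eqs)), isconcretetype(fieldtype(ConcreteSystem, :eqs))
(false, true)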

@timholy commented Feb 1, 2022:

I tried this precompile load against JuliaLang/julia#43990 but got what looks like a compiler error:

ERROR: LoadError: MethodError: Cannot `convert` an object of type Core.GotoIfNot to an object of type Expr
Closest candidates are:
  convert(::Type{T}, ::T) where T at ~/bin/julia-1.7.0/share/julia/base/essentials.jl:218
  Expr(::Any...) at ~/bin/julia-1.7.0/share/julia/base/boot.jl:263
Stacktrace:
  [1] RuntimeGeneratedFunctions.RuntimeGeneratedFunction(cache_tag::Type, context_tag::Type, ex::Expr; opaque_closures::Bool)
    @ RuntimeGeneratedFunctions ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:63
  [2] RuntimeGeneratedFunctions.RuntimeGeneratedFunction(cache_module::Module, context_module::Module, code::Expr; opaque_closures::Bool)
    @ RuntimeGeneratedFunctions ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:85
  [3] #301
    @ ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:100 [inlined]
  [4] iterate
    @ ./generator.jl:47 [inlined]
  [5] indexed_iterate(I::Base.Generator{Tuple{Expr, Expr}, ModelingToolkit.var"#301#306"{Module}}, i::Int64)
    @ Base ./tuple.jl:92
  [6] (ODEFunction{true})(sys::ODESystem, dvs::Vector{Any}, ps::Vector{Sym{Real, Base.ImmutableDict{DataType, Any}}}, u0::Vector{Float64}; version::Nothing, tgrad::Bool, jac::Bool, eval_expression::Bool, sparse::Bool, simplify::Bool, eval_module::Module, steady_state::Bool, checkbounds::Bool, sparsity::Bool, kwargs::Base.Pairs{Symbol, Any, NTuple{4, Symbol}, NamedTuple{(:ddvs, :linenumbers, :parallel, :has_difference), Tuple{Nothing, Bool, Symbolics.SerialForm, Bool}}})
    @ ModelingToolkit ~/.julia/dev/ModelingToolkit/src/systems/diffeqs/abstractodesystem.jl:331
  [7] process_DEProblem(constructor::Type, sys::ODESystem, u0map::Vector{Pair{Num, Float64}}, parammap::Vector{Pair{Num, Float64}}; implicit_dae::Bool, du0map::Nothing, version::Nothing, tgrad::Bool, jac::Bool, checkbounds::Bool, sparse::Bool, simplify::Bool, linenumbers::Bool, parallel::Symbolics.SerialForm, eval_expression::Bool, kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:has_difference,), Tuple{Bool}}})
    @ ModelingToolkit ~/.julia/dev/ModelingToolkit/src/systems/diffeqs/abstractodesystem.jl:593
  [8] (ODEProblem{true})(sys::ODESystem, u0map::Vector{Pair{Num, Float64}}, tspan::Tuple{Float64, Float64}, parammap::Vector{Pair{Num, Float64}}; callback::Nothing, kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:jac,), Tuple{Bool}}})
    @ ModelingToolkit ~/.julia/dev/ModelingToolkit/src/systems/diffeqs/abstractodesystem.jl:671
  [9] #ODEProblem#325
    @ ~/.julia/dev/ModelingToolkit/src/systems/diffeqs/abstractodesystem.jl:649 [inlined]
 [10] top-level scope
    @ ~/src/pctime/MT/tasks_MT.jl:23
 [11] include(fname::String)
    @ Base.MainInclude ./client.jl:451
 [12] top-level scope
    @ REPL[1]:1
in expression starting at /home/tim/src/pctime/MT/tasks_MT.jl:23

This happens even on 1.7. Is there something that needs updating, or is this a Julia bug?

@ChrisRackauckas (Member, Author):

@timholy this branch is updated now. And oof, we see some regressions haha...

using ModelingToolkit, OrdinaryDiffEq

function f()
      @parameters t σ ρ β
      @variables x(t) y(t) z(t)
      D = Differential(t)

      eqs = [D(D(x)) ~ σ*(y-x),
             D(y) ~ x*(ρ-z)-y,
             D(z) ~ x*y - β*z]

      @named sys = ODESystem(eqs)
      sys = structural_simplify(sys)

      u0 = [D(x) => 2.0,
            x => 1.0,
            y => 0.0,
            z => 0.0]

      p  = [σ => 28.0,
            ρ => 10.0,
            β => 8/3]

      tspan = (0.0,100.0)
      prob = ODEProblem(sys,u0,tspan,p,jac=true)
end

using SnoopCompile

tinf = @snoopi_deep f()

# InferenceTimingNode: 14.376150/29.364275 on Core.Compiler.Timings.ROOT() with 1263 direct children

@timholy commented Feb 1, 2022:

Thanks! I posted some numbers in the edited OP of JuliaLang/julia#43990. On that Julia build, it's almost all non-inference time (18s out of 20s). Obviously that would be the next thing to tackle.

Pretty significant increase in load time, unfortunately. I haven't dug into the specifics; it's possible some of it may be fixable.

@ChrisRackauckas (Member, Author):

That PR is still net beneficial here though, so that's good! There's a lot we can do in this library. We're changing the internals to be type-stable etc., so it should see a rather drastic change anyway. With our powers combined it should get there.

ChrisRackauckas merged commit 31e131c into master on Feb 1, 2022
ChrisRackauckas deleted the precompile branch on February 1, 2022, 18:40
@timholy commented Feb 1, 2022:

If that PR gets merged it may be important to develop some tooling that can help optimize placement of precompiles across packages. If you have a "base" package A, and B, C, and D each use A and end up recompiling some of the same machinery, that costs you. (That was the origin of JuliaLang/julia#43990 (comment).) If possible it would be better to ensure that A does the precompilation. But aside from seeing the moment of inference itself, we can't easily figure that stuff out now. I may take a second stab at a package I started, which could be subtitled "what's in my *.ji file anyway?" That would allow us to analyze what should go where.
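
A rough sketch of the placement idea (the package and function names here are made up):

# In the "base" package A: do the shared precompile work once, so B, C, and D reuse
# A's cache instead of each re-inferring the same machinery into their own *.ji files.
module A

shared_kernel(xs::Vector{Float64}) = sum(abs2, xs)

# Executed while A itself is being precompiled; the inference results land in A's cache.
precompile(shared_kernel, (Vector{Float64},))

end # module A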

Obviously, native code is another frontier, but that's for another day (well, month).

@ChrisRackauckas (Member, Author):

Obviously, native code is another frontier, but that's for another day (well, month).

I can't wait.

But yeah, at least that PR makes precompilation in these kinds of packages work, so we can add precompilation forcing throughout the symbolics packages, which would naturally leave less to force here, etc. And now, even before we add it to A, we can get usable downstream results, which makes it easier to start getting precompiles strewn around.

sharanry added a commit to sharanry/ModelingToolkit.jl that referenced this pull request Feb 14, 2022
ChrisRackauckas added a commit that referenced this pull request Feb 14, 2022
Rollback #1215 as it causes package compilation to fail