Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FileFormats.MPS] improve performance of MPS writer #2418

Merged
merged 12 commits into from
Feb 6, 2024

Conversation

odow
Copy link
Member

@odow odow commented Feb 4, 2024

Using the Optimizer from #2417

using JuMP
function main(N)
    model = direct_model(Optimizer(MOI.FileFormats.MPS.Model()))
    @variable(model, 0 <= x[i in 1:N] <= 1; integer = isodd(i))
    @constraint(model, [i in 2:N], x[i] + x[i-1] <= 1)
    @objective(model, Max, sum(i * x[i] for i in 1:N))
    MOI.write_to_file(backend(model).inner, "file.mps")
    return
end

Before:

julia> GC.gc(); @time main(100_000);
  2.005077 seconds (18.45 M allocations: 812.105 MiB, 9.98% gc time, 24.17% compilation time: 44% of which was recompilation)

julia> GC.gc(); @time main(100_000);
  1.495755 seconds (17.55 M allocations: 751.348 MiB, 12.09% gc time)

julia> GC.gc(); @time main(100_000);
  1.529073 seconds (17.55 M allocations: 751.348 MiB, 12.28% gc time)

After:

julia> GC.gc(); @time main(100_000);
  1.873707 seconds (11.57 M allocations: 602.868 MiB, 8.18% gc time, 29.04% compilation time: 45% of which was recompilation)

julia> GC.gc(); @time main(100_000);
  1.341565 seconds (10.55 M allocations: 533.148 MiB, 10.52% gc time)

julia> GC.gc(); @time main(100_000);
  1.336950 seconds (10.55 M allocations: 533.148 MiB, 10.17% gc time)

Small improvement in time, but big improvement in the number of allocations.

@odow
Copy link
Member Author

odow commented Feb 4, 2024

Now to

julia> GC.gc(); @time main(100_000);
  1.362330 seconds (10.25 M allocations: 526.213 MiB, 9.37% gc time)

@odow
Copy link
Member Author

odow commented Feb 4, 2024

Now to

julia> GC.gc(); @time main(100_000);
  1.184845 seconds (8.85 M allocations: 475.188 MiB, 11.24% gc time)

@odow
Copy link
Member Author

odow commented Feb 4, 2024

now to

julia> GC.gc(); @time main(100_000);
  1.199502 seconds (7.50 M allocations: 448.509 MiB, 10.95% gc time)

@odow
Copy link
Member Author

odow commented Feb 4, 2024

Now to

julia> GC.gc(); @time main(100_000);
  1.116765 seconds (6.90 M allocations: 436.318 MiB, 11.98% gc time)

@odow
Copy link
Member Author

odow commented Feb 5, 2024

Apart from the unique name stuff, the body of write is now completely type-stable (no red bars):

image

Almost all the time now is turning Int and Float64 to string, variable name lookups, and I/O locks.

@odow odow changed the base branch from master to od/optimizer February 5, 2024 03:19
@odow
Copy link
Member Author

odow commented Feb 5, 2024

Miles' example is now:

(base) oscar@Oscars-MBP /tmp % julia --project=opt ~/Downloads/generate-flow-oscar.jl --output_file mcflow.mps.gz --num_commodities 1000 --num_warehouses 20 --num_stores 200 --seed 1
Parsed args:
  output_file  =>  mcflow.mps.gz
  num_warehouses  =>  20
  num_factories_per_commodity  =>  5
  seed  =>  1
  additional_overtime_cost  =>  0.3
  num_stores  =>  200
  num_commodities  =>  1000
building model ...
model built
 56.593121 seconds (141.45 M allocations: 11.922 GiB, 20.53% gc time, 13.50% compilation time)

Original was

(base) oscar@Oscars-MBP /tmp % julia --project=moi-release ~/Downloads/generate-flow-oscar.jl --output_file mcflow.mps.gz --num_commodities 1000 --num_warehouses 20 --num_stores 200 --seed 1
Parsed args:
  output_file  =>  mcflow.mps.gz
  num_warehouses  =>  20
  num_factories_per_commodity  =>  5
  seed  =>  1
  additional_overtime_cost  =>  0.3
  num_stores  =>  200
  num_commodities  =>  1000
building model ...
model built
 68.109835 seconds (383.89 M allocations: 19.984 GiB, 20.36% gc time, 6.59% compilation time)

So nearly 1/3 the allocations and 60% of the total allocated.

I've been mucking around with storing the indices instead of names, but yet to find something that makes a difference.

@odow
Copy link
Member Author

odow commented Feb 5, 2024

Now

(base) oscar@Oscars-MBP /tmp % julia --project=opt ~/Downloads/generate-flow-oscar.jl --output_file mcflow.mps.gz --num_commodities 1000 --num_warehouses 20 --num_stores 200 --seed 1
Parsed args:
  output_file  =>  mcflow.mps.gz
  num_warehouses  =>  20
  num_factories_per_commodity  =>  5
  seed  =>  1
  additional_overtime_cost  =>  0.3
  num_stores  =>  200
  num_commodities  =>  1000
building model ...
model built
 48.025473 seconds (141.35 M allocations: 11.378 GiB, 15.25% gc time, 15.94% compilation time)

@odow
Copy link
Member Author

odow commented Feb 5, 2024

now

(base) oscar@Oscars-MBP /tmp % julia --project=opt ~/Downloads/generate-flow-oscar.jl --output_file mcflow.mps.gz --num_commodities 1000 --num_warehouses 20 --num_stores 200 --seed 1
Parsed args:
  output_file  =>  mcflow.mps.gz
  num_warehouses  =>  20
  num_factories_per_commodity  =>  5
  seed  =>  1
  additional_overtime_cost  =>  0.3
  num_stores  =>  200
  num_commodities  =>  1000
building model ...
model built
 41.327799 seconds (138.67 M allocations: 11.201 GiB, 15.64% gc time, 10.55% compilation time)

@odow
Copy link
Member Author

odow commented Feb 5, 2024

I tried replacing the row_name with an index, but it didn't help things. Only one copy of the name is stored in coefficients because the strings must be stored by reference.

@mlubin
Copy link
Member

mlubin commented Feb 6, 2024

Nice improvements!

@odow
Copy link
Member Author

odow commented Feb 6, 2024 via email

@odow
Copy link
Member Author

odow commented Feb 6, 2024 via email

Copy link
Member

@mlubin mlubin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (didn't review the code in detail)

@odow
Copy link
Member Author

odow commented Feb 6, 2024

Merging this as an improvement for now. But I have a plan for a larger change.

@odow odow merged commit 959f9b5 into od/optimizer Feb 6, 2024
14 of 17 checks passed
@odow odow deleted the od/mps-improvements branch February 6, 2024 22:09
@blegat
Copy link
Member

blegat commented Feb 7, 2024

To do better we'd need a much larger rewrite that stored the problem in
column order to start with instead of the current constraint based
approach. You probably still have a 2X memory penalty as we transpose the
matrix.

This sounds related to what Utilities.MatrixOfConstraints and Utilities.MutableSparseMatrixCSC are trying to resolve. See my talk at the 2021 INFORMS Annual Meeting (the slides are on my website).

@odow
Copy link
Member Author

odow commented Feb 7, 2024

MatrixOfConstraints can't incrementally build a CSC though right? It just does it in a copy_to.

The main issue with the current design is that when writing, you can the CSC, and when reading, you want the CSR. At the moment, we use a single data structure, but it'd be nice to switch between them depending on whether you're reading or writing.

@blegat
Copy link
Member

blegat commented Feb 8, 2024

No, it's done incrementally. Utilities.MutableSparseMatrixCSC receives the CSR form (because you receive constraint by constraint so row by row) and but it's actually building a CSC form.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants