Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor performance improvements for close!(::MixedDofHandler) #637

Merged
merged 1 commit into from
Mar 24, 2023

Conversation

fredrikekre
Copy link
Member

This patch tweaks the dof distribution for MixedDofHandler slightly:

  • push! cell dofs directly into dh.cell_dofs instead of first pushing to an element celldofs vector and then pushing to the global one. This removes one append! per cell, but does not affect the benchmark that much (dict lookup is still dominating).
  • Use the computed dict token for extraction in get_or_create_dofs!
  • Change keyword arguments in get_or_create_dofs! to positional. This does not matter for performance, but gives nicer profile stackframes.

I used this benchmark code:

using Ferrite
using BenchmarkTools
function dofhandler(::Type{DHT}, grid) where DHT
    dh = DHT(grid)
    add!(dh, :v, 2, Lagrange{2,RefCube,2}())
    add!(dh, :s, 1, Lagrange{2,RefCube,1}())
    close!(dh)
    return dh
end
grid = generate_grid(Quadrilateral, (1000, 1000))
@btime dofhandler(MixedDofHandler, $grid)

with the following (disappointing) results:

1.605 s (319 allocations: 640.42 MiB) # master
1.573 s (316 allocations: 639.34 MiB) # patch

For reference, DofHandler gives 1.3s, 405MiB for this benchmark.

This patch tweaks the dof distribution for `MixedDofHandler` slightly:
 - `push!` cell dofs directly into `dh.cell_dofs` instead of first
   pushing to an element celldofs vector and then pushing to the global
   one. This removes one `append!` per cell, but does not affect the
   benchmark that much (dict lookup is still dominating).
 - Use the computed dict token for extraction in `get_or_create_dofs!`
 - Change keyword arguments in `get_or_create_dofs!` to positional. This
   does not matter for performance, but gives nicer profile stackframes.

I used this benchmark code:

```julia
using Ferrite
using BenchmarkTools
function dofhandler(::Type{DHT}, grid) where DHT
    dh = DHT(grid)
    add!(dh, :v, 2, Lagrange{2,RefCube,2}())
    add!(dh, :s, 1, Lagrange{2,RefCube,1}())
    close!(dh)
    return dh
end
grid = generate_grid(Quadrilateral, (1000, 1000))
@Btime dofhandler(MixedDofHandler, $grid)
```

with the following (disappointing) results:

```
1.605 s (319 allocations: 640.42 MiB) # master
1.573 s (316 allocations: 639.34 MiB) # patch
```

For reference, `DofHandler` gives 1.3s, 405MiB for this benchmark.
@codecov-commenter
Copy link

codecov-commenter commented Mar 24, 2023

Codecov Report

Patch coverage: 86.66% and project coverage change: -0.01 ⚠️

Comparison is base (4b330ce) 92.41% compared to head (b4ef9f6) 92.41%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #637      +/-   ##
==========================================
- Coverage   92.41%   92.41%   -0.01%     
==========================================
  Files          29       29              
  Lines        4432     4429       -3     
==========================================
- Hits         4096     4093       -3     
  Misses        336      336              
Impacted Files Coverage Δ
src/Dofs/MixedDofHandler.jl 79.18% <86.66%> (-0.28%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@fredrikekre fredrikekre merged commit 43a1c19 into master Mar 24, 2023
@fredrikekre fredrikekre deleted the fe/mdh-close branch March 24, 2023 14:41
fredrikekre added a commit that referenced this pull request Mar 24, 2023
This patch removes the tracking of (internal) cell dofs when
distributing dofs for MixedDofHandler. There is no need to track these
since they are defined as dofs which are never shared with other
elements.

This almost closes the performance gap between the DofHandlers
(benchmark code from #637):

```
1.573 s (316 allocations: 639.34 MiB) # MixedDofHandler master
1.397 s (257 allocations: 543.50 MiB) # MixedDofHandler patch
1.267 s (234 allocations: 405.12 MiB) # DofHandler master/patch
```
fredrikekre added a commit that referenced this pull request Mar 24, 2023
#639)

This patch removes the tracking of (internal) cell dofs when
distributing dofs for MixedDofHandler. There is no need to track these
since they are defined as dofs which are never shared with other
elements.

This almost closes the performance gap between the DofHandlers
(benchmark code from #637):

```
1.573 s (316 allocations: 639.34 MiB) # MixedDofHandler master
1.397 s (257 allocations: 543.50 MiB) # MixedDofHandler patch
1.267 s (234 allocations: 405.12 MiB) # DofHandler master/patch
```
fredrikekre added a commit that referenced this pull request Mar 24, 2023
This reduces memory used in dof distribution for MixedDofHandler by only
storing the first dof that are added for every entity, instead of the
range of dofs. This simply copies the logic from DofHandler, which also
means that MixedDofHandler now also support multiple dofs per face in
2D, but not in 3D, just like DofHandler.

This closes the performance gap between the DofHandlers (benchmark code
from #637):

```
1.397 s (257 allocations: 543.50 MiB) # MixedDofHandler master
1.220 s (249 allocations: 456.05 MiB) # MixedDofHandler patch
1.267 s (234 allocations: 405.12 MiB) # DofHandler master/patch
```
fredrikekre added a commit that referenced this pull request Mar 25, 2023
This reduces memory used in dof distribution for MixedDofHandler by only
storing the first dof that are added for every entity, instead of the
range of dofs. This simply copies the logic from DofHandler, which also
means that MixedDofHandler now also support multiple dofs per face in
2D, but not in 3D, just like DofHandler.

This closes the performance gap between the DofHandlers (benchmark code
from #637):

```
1.397 s (257 allocations: 543.50 MiB) # MixedDofHandler master
1.220 s (249 allocations: 456.05 MiB) # MixedDofHandler patch
1.267 s (234 allocations: 405.12 MiB) # DofHandler master/patch
```
fredrikekre added a commit that referenced this pull request Mar 25, 2023
This reduces memory used in dof distribution for MixedDofHandler by only
storing the first dof that are added for every entity, instead of the
range of dofs. This simply copies the logic from DofHandler, which also
means that MixedDofHandler now also support multiple dofs per face in
2D, but not in 3D, just like DofHandler.

This closes the performance gap between the DofHandlers (benchmark code
from #637):

```
1.397 s (257 allocations: 543.50 MiB) # MixedDofHandler master
1.220 s (249 allocations: 456.05 MiB) # MixedDofHandler patch
1.267 s (234 allocations: 405.12 MiB) # DofHandler master/patch
```
fredrikekre added a commit that referenced this pull request Mar 25, 2023
This reduces memory used in dof distribution for MixedDofHandler by only
storing the first dof that are added for every entity, instead of the
range of dofs. This simply copies the logic from DofHandler, which also
means that MixedDofHandler now also support multiple dofs per face in
2D, but not in 3D, just like DofHandler.

This closes the performance gap between the DofHandlers (benchmark code
from #637):

```
1.397 s (257 allocations: 543.50 MiB) # MixedDofHandler master
1.220 s (249 allocations: 456.05 MiB) # MixedDofHandler patch
1.267 s (234 allocations: 405.12 MiB) # DofHandler master/patch
```
fredrikekre added a commit that referenced this pull request Mar 25, 2023
This patch changes the storage for vertex dof tracking from dictionaries
to vectors. This is possible since we know already from the start the
number of vertices and their global numbers. Since dof distribution is
almost 100% dict hashing this give a pretty big performance boost.

Running the benchmark code from #637:

```
1.267 s (234 allocations: 405.12 MiB)    # DofHandler master
812.388 ms (130 allocations: 290.07 MiB) # DofHandler patch
1.220 s (249 allocations: 456.05 MiB)    # MixedDofHandler master
838.601 ms (145 allocations: 341.00 MiB) # MixedDofHandler patch
```

Note that the same optimization can be done for edges/faces too if those
were globally enumerated. Fortunately vertex dofs are the most common
case so implementing this is not a high priority.
fredrikekre added a commit that referenced this pull request Mar 25, 2023
This patch changes the storage for vertex dof tracking from dictionaries
to vectors. This is possible since we know already from the start the
number of vertices and their global numbers. Since dof distribution is
almost 100% dict hashing this give a pretty big performance boost.

Running the benchmark code from #637:

```
1.267 s (234 allocations: 405.12 MiB)    # DofHandler master
812.388 ms (130 allocations: 290.07 MiB) # DofHandler patch
1.220 s (249 allocations: 456.05 MiB)    # MixedDofHandler master
838.601 ms (145 allocations: 341.00 MiB) # MixedDofHandler patch
```

Note that the same optimization can be done for edges/faces too if those
were globally enumerated. Fortunately vertex dofs are the most common
case so implementing this is not a high priority.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants