Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate perf/memory regression with components bag #603

Closed
Tracked by #645
gregtatum opened this issue Apr 2, 2021 · 11 comments
Closed
Tracked by #645

Investigate perf/memory regression with components bag #603

gregtatum opened this issue Apr 2, 2021 · 11 comments
Assignees
Labels
A-performance Area: Performance (CPU, Memory) blocked A dependency must be resolved before this is actionable C-datetime Component: datetime, calendars, time zones S-medium Size: Less than a week (larger bug fix or enhancement) T-techdebt Type: ICU4X code health and tech debt
Milestone

Comments

@gregtatum
Copy link
Member

Perf: https://unicode-org.github.io/icu4x-docs/benchmarks/perf/components/datetime/

image

Memory: https://unicode-org.github.io/icu4x-docs/benchmarks/memory/macos-latest/

image

My initial theories are that these examples are using JSON, which incurs a runtime penalty for loading the data providers compared to bincode. I believe this will mostly disappear with bincode support. One possible mitigation would be to move skeletons into their own key.

Here is the breakdown of the memory for the work_log: https://deploy-preview-3128--perf-html.netlify.app/public/37dy2yqktxz7jw6q8zx8eswg63zwy7jkg498kw0/flame-graph/?globalTrackOrder=0&localTrackOrderByPid=1511-0-1-2-3~&thread=0&transforms=mf-1~mf-2&v=5

icu_datetime::provider::gregory::patterns::_::::deserialize 42% 11.2KB

@gregtatum gregtatum added A-performance Area: Performance (CPU, Memory) C-datetime Component: datetime, calendars, time zones T-techdebt Type: ICU4X code health and tech debt labels Apr 2, 2021
@gregtatum gregtatum self-assigned this Apr 2, 2021
@zbraniecki
Copy link
Member

One possible mitigation would be to move skeletons into their own key.

I'd support that. Seems like constructor knows whether we want skeletons or not, so being able to not load them should be a clean architectural division.

@gregtatum
Copy link
Member Author

Looks like the difference in memory is nominal between bincode and json. This supports the case for separate keys.

JSON

    bytes % Difference
main Total 27,494 172%
  Global max 17,348 181%
  End 1,112 100%
pre-skeletons Total 16,011  
  Global max 9,583  
  End 1,112  

Bincode

    bytes % Difference
main Total 22,853 174%
  Global max 17,137 184%
  End 1,112 100%
pre-skeletons Total 13,101  
  Global max 9,305  
  End 1,112  

@gregtatum
Copy link
Member Author

Looks like my theory on bincode is wrong here. I need to figure out how to collect a performance profile to see where time is being spent.

JSON

datetime/overview       time:   [1.4384 ms 1.4497 ms 1.4636 ms]
                        change: [+35.337% +38.009% +40.616%] (p = 0.00 < 0.05)
                        Performance has regressed.

Bincode

datetime/overview       time:   [963.59 us 968.30 us 973.42 us]
                        change: [+49.622% +51.380% +53.124%] (p = 0.00 < 0.05)
                        Performance has regressed.

@gregtatum
Copy link
Member Author

gregtatum commented Apr 6, 2021

I'm experimenting a bit with cargo instruments.

main/bincode

image

main/json

image

@gregtatum
Copy link
Member Author

In the above profile, 60ms is spent in deserializing fields, which is probably where the performance regression is coming in from. This shows that splitting the keys is the fix here. Interestingly icu_datetime::provider::gregory::months is showing as being somewhat expensive to deserialize. Not sure what's going on there. The implementation is a bit obscured through macro magic that I did not untangle.

@gregtatum gregtatum added the S-medium Size: Less than a week (larger bug fix or enhancement) label Apr 7, 2021
@sffc sffc modified the milestones: ICU4X 0.2, 2021-Q2-m1 Apr 9, 2021
@sffc sffc added the blocked A dependency must be resolved before this is actionable label Apr 29, 2021
@sffc
Copy link
Member

sffc commented Apr 29, 2021

Blocked by #257 and #380

@sffc sffc modified the milestones: 2021-Q2-m1, ICU4X 0.3 Apr 29, 2021
@dminor
Copy link
Contributor

dminor commented May 7, 2021

@gregtatum We plan to release 0.3 in about 3 weeks. Do you expect this to be unblocked in time for the 0.3 release, or should we punt to 0.4?

@dminor
Copy link
Contributor

dminor commented May 7, 2021

It looks like #380 is now planned for 0.4, can we remove that as a blocker?

@sffc
Copy link
Member

sffc commented May 7, 2021

Once I finish #257, @gregtatum should investigate if the issues are solved. If not, then we will need to wait for #380.

@gregtatum
Copy link
Member Author

Resolved with #1139.

@gregtatum
Copy link
Member Author

Latest benchmarks
image

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-performance Area: Performance (CPU, Memory) blocked A dependency must be resolved before this is actionable C-datetime Component: datetime, calendars, time zones S-medium Size: Less than a week (larger bug fix or enhancement) T-techdebt Type: ICU4X code health and tech debt
Projects
None yet
Development

No branches or pull requests

4 participants