Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable manual shrinkage of Dict #45004

Merged
merged 3 commits into from
May 3, 2022
Merged

Conversation

petvana
Copy link
Member

@petvana petvana commented Apr 16, 2022

Based on the discussion in #44778, there seems to be no reason to disable manual shrinkage of Dict by calling sizehint! any more. The only counterintuitive side effect (I'm aware of) is that merging two sparse Dicts may lead to a single denser one because of

julia/base/abstractdict.jl

Lines 218 to 222 in 6106e6c

function merge!(d::AbstractDict, others::AbstractDict...)
for other in others
if haslength(d) && haslength(other)
sizehint!(d, length(d) + length(other))
end

@petvana petvana changed the title Enable shrinkage of Dict Enable manual shrinkage of Dict Apr 16, 2022
return d
end
rehash!(d, newsz)
newsz = _tablesz(cld(3 * newsz, 2))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_tablesz seems necessary not to rehash for newsz leading the to same 2^x size.

@petvana petvana marked this pull request as ready for review April 17, 2022 07:27
@petvana petvana added collections Data structures holding multiple items, e.g. sets labels Apr 18, 2022
@petvana petvana requested a review from vtjnash April 23, 2022 17:26
end
rehash!(d, newsz)
newsz = _tablesz(cld(3 * newsz, 2))
return newsz == oldsz ? d : rehash!(d, newsz)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worthwhile to make this a ranged check for newsz > oldsz * 2 || newsz < oldsz \div 2, So we only resize if it is significantly different from the size it already is? Otherwise, LGTM.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has to be significantly different since it goes through _tablesz now (rounded to 2^x), as I highlighted in the comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right, now I understand the comment better. The _tablesz means it is a power-of-2, so it is either exact, or it is different by a least a power of 2.

@vtjnash vtjnash added the merge me PR is reviewed. Merge when all tests are passing label Apr 29, 2022
@DilumAluthge DilumAluthge merged commit 8c61f40 into JuliaLang:master May 3, 2022
@DilumAluthge DilumAluthge removed the merge me PR is reviewed. Merge when all tests are passing label May 3, 2022
@aviatesk
Copy link
Member

aviatesk commented May 9, 2022

Bisected this commit seems to have introduced the regression within collection benchmark suite. See 8c61f40#commitcomment-73182869 for the regressed examples.

DilumAluthge added a commit that referenced this pull request May 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
collections Data structures holding multiple items, e.g. sets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants