
refine auto_growth allocator #35732

Merged
merged 3 commits into from
Oct 11, 2021

Contributor

@zhiqiu zhiqiu commented Sep 14, 2021

PR types

Performance optimization

PR changes

Others

Describe

refine auto_growth allocator

CUDA aligns allocated addresses itself (device memory returned by its allocation routines is aligned to at least 256 bytes), so in most cases we do not need to align addresses manually.
Reference: https://stackoverflow.com/questions/14082964/cuda-alignment-256bytes-seriously

The problem is that the original implementation may allocate an extra 256 bytes because of AlignedAllocator, and those 256 bytes may end up in the free blocks. This can increase memory fragmentation and the time needed to find a fitting block when making a new allocation.
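To make the effect concrete, here is a minimal sketch of a toy model, not Paddle's actual allocator code. It assumes that each request is served from a fresh chunk and that any bytes left over after carving out the request are stored as a free block. The names `AlignUp`, `ToyFreeList`, and `CountFreeBlocks` are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <map>

// Round `size` up to the next multiple of `alignment`.
// `alignment` must be a power of two.
constexpr std::size_t AlignUp(std::size_t size, std::size_t alignment) {
  return (size + alignment - 1) & ~(alignment - 1);
}

static_assert(AlignUp(1000, 256) == 1024, "1000 rounds up to 1024");

// Hypothetical free list for illustration only: maps block size to offset.
struct ToyFreeList {
  std::multimap<std::size_t, std::size_t> blocks;

  // Carve `request` bytes out of a fresh chunk of `chunk_size` bytes that
  // starts at `offset`. Any leftover bytes become a new free block.
  void AllocateFromNewChunk(std::size_t request, std::size_t chunk_size,
                            std::size_t offset) {
    if (chunk_size > request) {
      blocks.emplace(chunk_size - request, offset + request);
    }
  }
};

// Serve three requests and return how many free blocks remain.
// With `pad_for_alignment` set, each chunk carries an extra 256 bytes,
// mimicking a wrapper that over-allocates so it can align the pointer itself.
std::size_t CountFreeBlocks(bool pad_for_alignment) {
  constexpr std::size_t kAlignment = 256;
  const std::size_t requests[] = {1000, 4096, 300};

  ToyFreeList free_list;
  std::size_t offset = 0;
  for (std::size_t request : requests) {
    const std::size_t aligned = AlignUp(request, kAlignment);
    const std::size_t chunk =
        pad_for_alignment ? aligned + kAlignment : aligned;
    free_list.AllocateFromNewChunk(aligned, chunk, offset);
    offset += chunk;
  }
  return free_list.blocks.size();
}
```

In this toy model, `CountFreeBlocks(true)` leaves one 256-byte free block per request, while `CountFreeBlocks(false)` leaves none. A real allocator reuses and merges blocks, so the actual difference depends on the workload.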

  • before
    image

  • after
    image

The maximum number of free blocks in standalone_executor_test (single thread, no-GC version, for consistency) decreases from 200 to 185.

@paddle-bot-old
Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

Collaborator

@sneaxiy sneaxiy left a comment

LGTM

Contributor

@wanghuancoder wanghuancoder left a comment

LGTM

@zhiqiu zhiqiu merged commit 6d353aa into PaddlePaddle:develop Oct 11, 2021