Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pin Futures in #[tokio::test] to stack #5205

Merged
merged 3 commits into from
Nov 29, 2022
Merged

Pin Futures in #[tokio::test] to stack #5205

merged 3 commits into from
Nov 29, 2022

Conversation

Thomasdezeeuw
Copy link
Contributor

Motivation

This reduces the amount of copies of the Runtime::block_on and related functions the compiler has to generate and LLVM process. We've seen it reduce the compilation time of our tests (some 1900 of them) from 40s down to 12s, with no impact on the runtime of tests.

Below is an output of llvm-lines for our tests.

Before:

  Lines                  Copies                Function name
  -----                  ------                -------------
  8954414                156577                (TOTAL)
   984626 (11.0%, 11.0%)   9289 (5.9%,  5.9%)  std::thread::local::LocalKey<T>::try_with
   648093 (7.2%, 18.2%)    1857 (1.2%,  7.1%)  tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}
   557100 (6.2%, 24.5%)    3714 (2.4%,  9.5%)  tokio::park::thread::CachedParkThread::block_on
   551679 (6.2%, 30.6%)    7430 (4.7%, 14.2%)  tokio::coop::with_budget::{{closure}}
   514389 (5.7%, 36.4%)    3714 (2.4%, 16.6%)  tokio::runtime::scheduler::current_thread::Context::enter
   326832 (3.6%, 40.0%)    1857 (1.2%, 17.8%)  tokio::runtime::scheduler::current_thread::CurrentThread::block_on
   291549 (3.3%, 43.3%)    1857 (1.2%, 19.0%)  tokio::runtime::scheduler::current_thread::CoreGuard::enter
   261907 (2.9%, 46.2%)    7430 (4.7%, 23.7%)  tokio::coop::budget
   189468 (2.1%, 48.3%)    7430 (4.7%, 28.5%)  tokio::coop::with_budget
   137418 (1.5%, 49.8%)    3714 (2.4%, 30.8%)  tokio::runtime::enter::Enter::block_on
   126276 (1.4%, 51.3%)    1857 (1.2%, 32.0%)  tokio::runtime::Runtime::block_on
   124419 (1.4%, 52.6%)    1857 (1.2%, 33.2%)  tokio::macros::scoped_tls::ScopedKey<T>::set
   118897 (1.3%, 54.0%)    3715 (2.4%, 35.6%)  core::option::Option<T>::or_else
   111420 (1.2%, 55.2%)    1857 (1.2%, 36.8%)  tokio::runtime::scheduler::current_thread::CurrentThread::block_on::{{closure}}
   109408 (1.2%, 56.4%)    2105 (1.3%, 38.1%)  <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   105893 (1.2%, 57.6%)    9289 (5.9%, 44.0%)  std::thread::local::LocalKey<T>::with
    96564 (1.1%, 58.7%)    1857 (1.2%, 45.2%)  tokio::runtime::scheduler::current_thread::Context::run_task
    90993 (1.0%, 59.7%)    7428 (4.7%, 50.0%)  tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}
    90515 (1.0%, 60.7%)    2105 (1.3%, 51.3%)  core::pin::Pin<&mut T>::map_unchecked_mut
    89136 (1.0%, 61.7%)    1857 (1.2%, 52.5%)  tokio::runtime::scheduler::multi_thread::MultiThread::block_on

After:

  Lines                  Copies               Function name
  -----                  ------               -------------
  3188618                41634                (TOTAL)
   109408 (3.4%,  3.4%)   2105 (5.1%,  5.1%)  <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
    90515 (2.8%,  6.3%)   2105 (5.1%, 10.1%)  core::pin::Pin<&mut T>::map_unchecked_mut
    56220 (1.8%,  8.0%)   1874 (4.5%, 14.6%)  alloc::boxed::Box<T>::pin
    48333 (1.5%,  9.5%)   2179 (5.2%, 19.8%)  core::ops::function::FnOnce::call_once
    28587 (0.9%, 10.4%)      1 (0.0%, 19.8%)  XXXXXXXXXXXXXXXXXXX
    18730 (0.6%, 11.0%)   1873 (4.5%, 24.3%)  alloc::boxed::Box<T,A>::into_pin
    16190 (0.5%, 11.5%)      2 (0.0%, 24.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    15870 (0.5%, 12.0%)      2 (0.0%, 24.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    15250 (0.5%, 12.5%)      1 (0.0%, 24.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXX
    12801 (0.4%, 12.9%)      2 (0.0%, 24.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    12801 (0.4%, 13.3%)      2 (0.0%, 24.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    12630 (0.4%, 13.7%)   2105 (5.1%, 29.4%)  <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::{{closure}}
    12613 (0.4%, 14.1%)      2 (0.0%, 29.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    12613 (0.4%, 14.5%)      2 (0.0%, 29.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    12613 (0.4%, 14.9%)      2 (0.0%, 29.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    12613 (0.4%, 15.3%)      2 (0.0%, 29.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    11395 (0.4%, 15.7%)     96 (0.2%, 29.7%)  alloc::alloc::box_free
    11364 (0.4%, 16.0%)   1891 (4.5%, 34.2%)  <T as core::convert::Into<U>>::into
    11238 (0.4%, 16.4%)   1873 (4.5%, 38.7%)  alloc::boxed::<impl core::convert::From<alloc::boxed::Box<T,A>> for core::pin::Pin<alloc::boxed::Box<T,A>>>::from
    10735 (0.3%, 16.7%)      2 (0.0%, 38.7%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Note that I have replaced our test functions with XXX. As you can clearly see they're not in the top 20 in the before output, while they're in the after oput.

Further note that the amount of copies have been reduced from 156577 to 41634.

@Thomasdezeeuw

This comment was marked as resolved.

@Thomasdezeeuw
Copy link
Contributor Author

Per @Darksonn suggestion I also tried pinning the future to the stack in Runtime::block_on, but that didn't reduce the amount of llvm lines generated. Relevant commit is https://github.com/Thomasdezeeuw/tokio/commit/1aa04f449ada5f0cbc39bbaedfb4fd4f97cd7b95.

@Thomasdezeeuw
Copy link
Contributor Author

We could also try pinning the future to the stack and using Pin<&mut dyn Future>, but that might make it easier to blow the stack.

@Darksonn
Copy link
Contributor

It wouldn't make it any easier to blow the stack than it is today.

@Thomasdezeeuw
Copy link
Contributor Author

It wouldn't make it any easier to blow the stack than it is today.

I though that Tokio boxed futures before running them, or am I mistaken in that?

@Darksonn
Copy link
Contributor

It doesn't for block_on.

Thomas de Zeeuw added 2 commits November 23, 2022 16:51
This reduces the amount of copies of the Runtime::block_on and related
functions the compiler has to generate and LLVM process. We've seen it
reduce the compilation time of our tests (some 1900 of them) from 40s
down to 12s, with no impact on the runtime of tests.

Below is an output of llvm-lines for our tests.

Before:

  Lines                  Copies                Function name
  -----                  ------                -------------
  8954414                156577                (TOTAL)
   984626 (11.0%, 11.0%)   9289 (5.9%,  5.9%)  std::thread::local::LocalKey<T>::try_with
   648093 (7.2%, 18.2%)    1857 (1.2%,  7.1%)  tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}
   557100 (6.2%, 24.5%)    3714 (2.4%,  9.5%)  tokio::park::thread::CachedParkThread::block_on
   551679 (6.2%, 30.6%)    7430 (4.7%, 14.2%)  tokio::coop::with_budget::{{closure}}
   514389 (5.7%, 36.4%)    3714 (2.4%, 16.6%)  tokio::runtime::scheduler::current_thread::Context::enter
   326832 (3.6%, 40.0%)    1857 (1.2%, 17.8%)  tokio::runtime::scheduler::current_thread::CurrentThread::block_on
   291549 (3.3%, 43.3%)    1857 (1.2%, 19.0%)  tokio::runtime::scheduler::current_thread::CoreGuard::enter
   261907 (2.9%, 46.2%)    7430 (4.7%, 23.7%)  tokio::coop::budget
   189468 (2.1%, 48.3%)    7430 (4.7%, 28.5%)  tokio::coop::with_budget
   137418 (1.5%, 49.8%)    3714 (2.4%, 30.8%)  tokio::runtime::enter::Enter::block_on
   126276 (1.4%, 51.3%)    1857 (1.2%, 32.0%)  tokio::runtime::Runtime::block_on
   124419 (1.4%, 52.6%)    1857 (1.2%, 33.2%)  tokio::macros::scoped_tls::ScopedKey<T>::set
   118897 (1.3%, 54.0%)    3715 (2.4%, 35.6%)  core::option::Option<T>::or_else
   111420 (1.2%, 55.2%)    1857 (1.2%, 36.8%)  tokio::runtime::scheduler::current_thread::CurrentThread::block_on::{{closure}}
   109408 (1.2%, 56.4%)    2105 (1.3%, 38.1%)  <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   105893 (1.2%, 57.6%)    9289 (5.9%, 44.0%)  std::thread::local::LocalKey<T>::with
    96564 (1.1%, 58.7%)    1857 (1.2%, 45.2%)  tokio::runtime::scheduler::current_thread::Context::run_task
    90993 (1.0%, 59.7%)    7428 (4.7%, 50.0%)  tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}
    90515 (1.0%, 60.7%)    2105 (1.3%, 51.3%)  core::pin::Pin<&mut T>::map_unchecked_mut
    89136 (1.0%, 61.7%)    1857 (1.2%, 52.5%)  tokio::runtime::scheduler::multi_thread::MultiThread::block_on

After:

  Lines                  Copies               Function name
  -----                  ------               -------------
  3188618                41634                (TOTAL)
   109408 (3.4%,  3.4%)   2105 (5.1%,  5.1%)  <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
    90515 (2.8%,  6.3%)   2105 (5.1%, 10.1%)  core::pin::Pin<&mut T>::map_unchecked_mut
    56220 (1.8%,  8.0%)   1874 (4.5%, 14.6%)  alloc::boxed::Box<T>::pin
    48333 (1.5%,  9.5%)   2179 (5.2%, 19.8%)  core::ops::function::FnOnce::call_once
    28587 (0.9%, 10.4%)      1 (0.0%, 19.8%)  XXXXXXXXXXXXXXXXXXX
    18730 (0.6%, 11.0%)   1873 (4.5%, 24.3%)  alloc::boxed::Box<T,A>::into_pin
    16190 (0.5%, 11.5%)      2 (0.0%, 24.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    15870 (0.5%, 12.0%)      2 (0.0%, 24.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    15250 (0.5%, 12.5%)      1 (0.0%, 24.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXX
    12801 (0.4%, 12.9%)      2 (0.0%, 24.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    12801 (0.4%, 13.3%)      2 (0.0%, 24.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    12630 (0.4%, 13.7%)   2105 (5.1%, 29.4%)  <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::{{closure}}
    12613 (0.4%, 14.1%)      2 (0.0%, 29.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    12613 (0.4%, 14.5%)      2 (0.0%, 29.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    12613 (0.4%, 14.9%)      2 (0.0%, 29.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    12613 (0.4%, 15.3%)      2 (0.0%, 29.4%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    11395 (0.4%, 15.7%)     96 (0.2%, 29.7%)  alloc::alloc::box_free
    11364 (0.4%, 16.0%)   1891 (4.5%, 34.2%)  <T as core::convert::Into<U>>::into
    11238 (0.4%, 16.4%)   1873 (4.5%, 38.7%)  alloc::boxed::<impl core::convert::From<alloc::boxed::Box<T,A>> for core::pin::Pin<alloc::boxed::Box<T,A>>>::from
    10735 (0.3%, 16.7%)      2 (0.0%, 38.7%)  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Note that I have replaced our test functions with XXX. As you can
clearly see they're not in the top 20 in the before output, while
they're in the after oput.

Further note that the amount of copies have been reduced from 156577 to
41634.
@Thomasdezeeuw
Copy link
Contributor Author

@Darksonn I've rebased on master and changed to pin the future to stack, it has the same results.

@Thomasdezeeuw Thomasdezeeuw changed the title Box Futures in #[tokio::test] Pin Futures in #[tokio::test] to stack Nov 23, 2022
@Thomasdezeeuw
Copy link
Contributor Author

@Darksonn do you have time to review this or should I ask some one else?

@Darksonn
Copy link
Contributor

Please ask someone else. All of the things I have to do has its deadline this or next week.

@Thomasdezeeuw Thomasdezeeuw requested review from a team and removed request for Darksonn November 29, 2022 09:47
tokio-macros/src/entry.rs Outdated Show resolved Hide resolved
tokio-macros/src/entry.rs Outdated Show resolved Hide resolved
tokio-macros/src/entry.rs Show resolved Hide resolved
Instead of ::tokio.
@Thomasdezeeuw Thomasdezeeuw enabled auto-merge (rebase) November 29, 2022 14:11
@Thomasdezeeuw Thomasdezeeuw merged commit f5686f6 into tokio-rs:master Nov 29, 2022
@Thomasdezeeuw Thomasdezeeuw deleted the box-test-future branch November 29, 2022 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants