Improve _check_key() and _store_cmd() performance #183

Merged: 2 commits into pinterest:master on Aug 31, 2018

@shargan shargan commented Aug 30, 2018

While making the batching changes in #182 I noticed a few performance hot-spots.

  1. _check_key() is called at least once on every single operation and was doing a lot of repeated or simply inefficient work.
  2. _store_cmd() was calculating and encoding extra and expire for every key/value pair even though they don't change. (I didn't move this in the previous PR because I didn't want to muddy my intentions.)

The first change accounts for most of the performance increase: it affects every method, and the difference scales with key length. The second only affects set_many(), but in my 100-key tests it was responsible for a decent 7.5% gain.
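
For readers who want a concrete picture of the two ideas, here is a minimal Python 3 sketch. It is not the diff from this PR and not pymemcache's actual code: check_key() and build_set_commands() are hypothetical names, the flags field is assumed to be a fixed 0, and ValueError stands in for the library's own exception type. The first function validates a key with a single C-level bytes.split() call instead of a per-byte Python loop; the second encodes the loop-invariant expire and noreply pieces once before iterating.

```python
MAX_KEY_LENGTH = 250  # memcached's documented key length limit


def check_key(key, key_prefix=b""):
    """Hypothetical validator; NOT pymemcache's actual _check_key()."""
    if isinstance(key, str):
        try:
            key = key.encode("ascii")
        except UnicodeEncodeError:
            raise ValueError("Non-ASCII key: %r" % (key,))
    key = key_prefix + key
    if not key:
        raise ValueError("Key is empty")
    if len(key) > MAX_KEY_LENGTH:
        raise ValueError("Key is too long: %r" % (key,))
    # bytes.split() with no argument splits on ASCII whitespace in C.
    # A key without any whitespace comes back as the single item [key].
    if key.split() != [key]:
        raise ValueError("Key contains whitespace: %r" % (key,))
    return key


def build_set_commands(values, expire=0, noreply=True):
    """Hypothetical helper; NOT pymemcache's actual _store_cmd()."""
    # Loop-invariant pieces: compute and encode them once, not per pair.
    expire_bytes = str(expire).encode("ascii")
    extra = b" noreply" if noreply else b""
    commands = []
    for key, data in values.items():
        key = check_key(key)
        size = str(len(data)).encode("ascii")
        # Text protocol: set <key> <flags> <exptime> <bytes> [noreply]
        header = b"set " + key + b" 0 " + expire_bytes + b" " + size + extra
        commands.append(header + b"\r\n" + data + b"\r\n")
    return commands
```

With 100 keys per call, anything left inside the loop runs 100 times, which is why hoisting even cheap encoding work can show up in a set_many() benchmark.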

Benchmarks from master with irrelevant lines trimmed:

# python2.7
$ pytest --verbose --capture=no --no-cov -m benchmark pymemcache/test --keys 100 --count 10000
================================================ test session starts =================================================
platform darwin -- Python 2.7.15, pytest-3.7.4, py-1.6.0, pluggy-0.7.1 -- /Users/shargan/repos/pymemcache/.venv/bin/python

pymemcache/test/test_benchmark.py::test_bench_get[pymemcache] 2.37730312347
pymemcache/test/test_benchmark.py::test_bench_set[pymemcache] 0.353810071945
pymemcache/test/test_benchmark.py::test_bench_get_multi[pymemcache] 16.3643598557
pymemcache/test/test_benchmark.py::test_bench_set_multi[pymemcache] 10.8592970371


# python3.6
$ pytest --verbose --capture=no --no-cov -m benchmark pymemcache/test --keys 100 --count 10000
================================================ test session starts =================================================
platform darwin -- Python 3.6.6, pytest-3.7.4, py-1.6.0, pluggy-0.7.1 -- /Users/shargan/repos/pymemcache/.venv36/bin/python3.6

pymemcache/test/test_benchmark.py::test_bench_get[pymemcache] 3.1345767974853516
pymemcache/test/test_benchmark.py::test_bench_set[pymemcache] 0.3885829448699951
pymemcache/test/test_benchmark.py::test_bench_get_multi[pymemcache] 18.19443702697754
pymemcache/test/test_benchmark.py::test_bench_set_multi[pymemcache] 11.366436004638672

With this diff:

# python2.7
$ pytest --verbose --capture=no --no-cov -m benchmark pymemcache/test --keys 100 --count 10000
================================================ test session starts =================================================
platform darwin -- Python 2.7.15, pytest-3.7.4, py-1.6.0, pluggy-0.7.1 -- /Users/shargan/repos/pymemcache/.venv/bin/python

pymemcache/test/test_benchmark.py::test_bench_get[pymemcache] 2.65929102898
pymemcache/test/test_benchmark.py::test_bench_set[pymemcache] 0.332684993744
pymemcache/test/test_benchmark.py::test_bench_get_multi[pymemcache] 12.0730199814
pymemcache/test/test_benchmark.py::test_bench_set_multi[pymemcache] 5.42873597145


# python3.6
$ pytest --verbose --capture=no --no-cov -m benchmark pymemcache/test --keys 100 --count 10000
================================================ test session starts =================================================
platform darwin -- Python 3.6.6, pytest-3.7.4, py-1.6.0, pluggy-0.7.1 -- /Users/shargan/repos/pymemcache/.venv36/bin/python3.6

pymemcache/test/test_benchmark.py::test_bench_get[pymemcache] 2.7322170734405518
pymemcache/test/test_benchmark.py::test_bench_set[pymemcache] 0.32326221466064453
pymemcache/test/test_benchmark.py::test_bench_get_multi[pymemcache] 11.463497877120972
pymemcache/test/test_benchmark.py::test_bench_set_multi[pymemcache] 6.002143859863281
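
To make the before/after comparison easier to read, the timings above can be turned into percent changes. The sketch below only restates the numbers from the benchmark output; percent_change() and summarize() are hypothetical helper names, and the timings are assumed to be in seconds, as the output does not state a unit.

```python
# Timings copied from the benchmark output above (assumed to be seconds).
BEFORE = {
    "py2.7": {"get": 2.37730312347, "set": 0.353810071945,
              "get_multi": 16.3643598557, "set_multi": 10.8592970371},
    "py3.6": {"get": 3.1345767974853516, "set": 0.3885829448699951,
              "get_multi": 18.19443702697754, "set_multi": 11.366436004638672},
}
AFTER = {
    "py2.7": {"get": 2.65929102898, "set": 0.332684993744,
              "get_multi": 12.0730199814, "set_multi": 5.42873597145},
    "py3.6": {"get": 2.7322170734405518, "set": 0.32326221466064453,
              "get_multi": 11.463497877120972, "set_multi": 6.002143859863281},
}


def percent_change(before, after):
    """Relative change in percent; negative means the run got faster."""
    return (after - before) / before * 100.0


def summarize(before, after):
    """Percent change per interpreter and per benchmark."""
    return {
        interp: {name: percent_change(t, after[interp][name])
                 for name, t in runs.items()}
        for interp, runs in before.items()
    }


if __name__ == "__main__":
    for interp, changes in summarize(BEFORE, AFTER).items():
        for name, pct in changes.items():
            print("%s %-10s %+6.1f%%" % (interp, name, pct))
```

On these numbers the multi-key benchmarks improve the most: set_multi takes roughly half as long on both interpreters. The single-key get on Python 2.7 is slightly slower after the change, which may be run-to-run noise given that the output shows one timing per test.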

_check_key() is called at least once on every single operation, so its
performance is vital. This diff reduces the time spent in it, the number of
function calls during a set_many() cycle, and the overall time spent during
that cycle, each by ~65%.

cgordon commented Aug 31, 2018

Nice improvement, thank you!

@cgordon cgordon merged commit 20af7f2 into pinterest:master Aug 31, 2018
@shargan shargan deleted the shargan/performance branch August 31, 2018 18:21