Improves the performance of getHandle() when routing requests. #484
- Replaces the use of stri_match() with stri_match_first_regex() directly, removing the expensive and unnecessary calls to match.arg().
- Imports stri_match_first_regex() into the environment to remove the need to call `::`, which was taking up about 25% of the execution time.
- Takes advantage of the fact that `seq` can be NULL in for loops to simplify the getHandle() function.

This reduces the overhead of getHandle() by about 50%.
Not that my opinion on this matters much, but this is definitely NEWS-worthy. This is what the "improvements" section is for! Like you, I work on production APIs and a 10% speed-up for simple endpoints is great. Also I would say the change is public-facing insofar as there will be a measurable effect on all existing plumber APIs.

Do you mind sharing how you load test or determine the performance increase? Thank you!
Sure, although I think how I actually did it might be a bit more involved than actually necessary to show the difference:

Thank you. I should be able to repeat something similar on my end.

A much simpler way, now that I think about it, is just to test the throughput of a sample API before and after the change.
Correct. The before-and-after check works for validating the speed-up. Sounds like you use something similar to profvis for inspecting slow code sections.

@blairj09 and I have been looking into future integrations for plumber to make sure the codebase doesn't get slower with a particular PR. Currently we have been looking at

I'm (obviously) pretty interested in performance regressions as well, and would be happy to see that kind of work.
I would still keep the check for NULL handlers in the route function. Checking for NULL handlers costs almost nothing; looping over NULL costs more. Performance increases on valid calls, but it would decrease for NULL handlers. Still, great catch.

    > a <- expression(for (h in NULL) {}); b <- expression(if (!is.null(NULL)) {})
    > microbenchmark::microbenchmark(a, b)
    # Unit: nanoseconds
    #  expr min lq  mean median uq  max neval
    #     a   0  0 26.11      1  1 2553   100
    #     b   0  0  0.47      0  1    1   100
The |
It means it is valid to loop over a NULL value in R. TL;DR: getHandle() does not need to be changed.
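For readers who want to check the claim about looping over NULL, here is a minimal base-R sketch. The helper names `count_iterations` and `count_guarded` are made up for this illustration and are not part of plumber. It only shows that both forms run the loop body zero times for NULL; it says nothing about their relative speed.

```r
# Count how many times a for-loop body runs for a given sequence.
# (Hypothetical helper, for illustration only.)
count_iterations <- function(seq) {
  n <- 0L
  for (item in seq) {
    n <- n + 1L
  }
  n
}

# Same loop, but with an explicit is.null() guard in front of it.
count_guarded <- function(seq) {
  n <- 0L
  if (!is.null(seq)) {
    for (item in seq) {
      n <- n + 1L
    }
  }
  n
}

null_count <- count_iterations(NULL)           # NULL has length 0: body never runs
list_count <- count_iterations(list(1, 2, 3))  # body runs once per element
guarded_null_count <- count_guarded(NULL)      # the guard skips the loop entirely
```

Both variants return the same result for NULL input, so the guard is purely a question of cost, not correctness.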
This PR contains performance tweaks to the internal `getHandle()` function used inside of the router, which in turn calls the `canServe()` method for each handler until it finds one that can take the request. Since this function is essentially "overhead" on every request, it is ideal if it is fast.

The PR makes the following changes:

- Replaces the use of `stri_match()` with `stri_match_first_regex()` directly, removing the expensive and unnecessary calls to `match.arg()`.
- Imports `stri_match_first_regex()` into the environment to remove the need to call `::`, which was taking up about 25% of the execution time.
- Takes advantage of the fact that `seq` can be `NULL` in for loops to simplify the `getHandle()` function.

This reduces the overhead of `getHandle()` by about 50%. From my own testing, this function moved from about 20% of program execution time down to 12% for simple endpoints in our production APIs; for more complex endpoints it moved down from 4% to 2%.

PR task list:

- `devtools::document()`

This is not a public-facing change, so I don't think it needs an entry in the `NEWS.md` file, but I can add one if you disagree. Nor does it really require new tests -- existing integration tests should cover these changes.
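The lookup pattern described in this PR can be sketched in base R as follows. This is an illustrative approximation, not plumber's actual code: `make_handler` and `get_handle_sketch` are hypothetical names, handlers are modelled as plain lists, and `grepl()` stands in for the stringi matching the real router uses.

```r
# Hypothetical handler: a list holding a name and a canServe() predicate.
make_handler <- function(name, pattern) {
  force(pattern)  # evaluate the argument now so the closure captures its value
  list(
    name = name,
    canServe = function(path) grepl(pattern, path)
  )
}

# Return the first handler whose canServe() accepts the path, or NULL.
# There is no is.null() guard: a for loop over NULL runs zero times.
get_handle_sketch <- function(handlers, path) {
  for (h in handlers) {
    if (h$canServe(path)) {
      return(h)
    }
  }
  NULL
}

handlers <- list(
  make_handler("users", "^/users/[0-9]+$"),
  make_handler("health", "^/health$")
)

found <- get_handle_sketch(handlers, "/health")    # second handler matches
not_found <- get_handle_sketch(handlers, "/nope")  # no handler matches
empty <- get_handle_sketch(NULL, "/health")        # no handlers registered
```

Because the lookup runs on every request, any per-call cost inside the loop (argument matching, namespace lookups via `::`) is multiplied by the number of handlers tried, which is why the PR targets those calls.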