
Arrival-rate based VU executor #550

Closed
robingustafsson opened this issue Mar 21, 2018 · 12 comments · Fixed by #1007

@robingustafsson (Member) commented Mar 21, 2018

Like many load testing tools before it, k6 is modelled around the concept of Virtual Users or VUs, the idea being that each VU independently executes the test script specified by the user in a concurrent manner.

There are two common ways to simulate this concurrency of clients hitting the system under test (SUT). Let’s call the first one “looping VUs” (aka the "closed system model" [1]) and the second one, the subject of this issue, “arrival-rate VUs” (aka the "open system model" [1]).

The VU executor currently supported by k6 is of the looping type, meaning its request pacing can be influenced by the SUT.

Looping VUs

You usually control the number of looping VUs by specifying a ramping profile (or stages, as they’re known in k6): a specification of how to gradually increase or decrease the number of concurrently active VUs throughout the duration of the test.

For as long as it’s active, each VU executes the test script in the equivalent of a while-true loop, running each line of code in sequence and synchronously: a mix of logic, requests and sleep time (aka think time).
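
For illustration, a minimal looping-VU script might look like this (the URL is just a placeholder):

import http from "k6/http";
import { sleep } from "k6";

export default function() {
    // Each active VU runs this function over and over, like a while-true loop.
    http.get("https://test.example.com/"); // blocks until the SUT responds
    sleep(3); // think time before the next iteration
}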

The problem

This means the overall request pacing of the test will ultimately be determined by how well the SUT is able to handle the traffic; in other words, if the SUT starts responding more slowly, the VUs sending requests will block for longer, so the SUT is effectively throttling/controlling the request pacing.

Arrival-rate VUs

Arrival-rate based VU execution (the "open system model") solves the entangled relationship between the looping VU and the SUT by decoupling the request pacing from the SUT's ability to cope with the traffic.

RPS

One of the practical use cases for arrival-rate based VU execution is that it makes it easier to model tests in terms of requests/second (RPS, or QPS as some call it). If you write a script that generates a single request (without sleeps) and then specify your ramping profile in terms of VU arrivals per second, that will be equivalent to requests per second.
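
For example, with a one-request script like the following (placeholder URL), an arrival rate of 100 VUs/second would translate to ~100 requests/second:

import http from "k6/http";

export default function() {
    // Exactly one request per iteration and no sleep,
    // so arrivals per second equal requests per second.
    http.get("https://test.example.com/");
}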

Proposed API spec

export let options = {
    // Allow mixing looping VUs and arrival-rate VUs in the same test
    // ("looping" being the default type)
    stages: [
        { target: "100", duration: "5m", type: "arrival-rate" },
        { target: "100", duration: "5m", type: "looping" },
        { target: "100", duration: "5m" } // same as previous step
    ]
};

export default function() {
    ...
}

If the mixing of VU execution types proves to be too big of a project, I propose we split it into two steps: first introduce the arrival-rate based VU executor and let the user choose which type of executor to use per test, in the options or on the command line. (This is probably needed either way, as we'd need a mechanism for users to specify that they want arrival-rate based execution when only specifying vus and duration in the options.)

Command-line

On the command line we need to allow the user to specify which type of VU executor they want, both when specifying the combination of -u/--vus NUM and -d/--duration, and when specifying the stages with one or more -s/--stage.

For -u/--vus and -d/--duration I propose we add a new flag called -x/--executor with the allowed values "looping" and "arrival-rate". For -s/--stage we can extend the value format from [duration]:[target] to [duration]:[target]:[executor-type].
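
Under this proposal, invocations would look something like the following (proposed syntax only, not an existing k6 interface):

k6 run --vus 100 --duration 5m --executor arrival-rate script.js
k6 run --stage 5m:100:arrival-rate --stage 5m:100:looping script.js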

[1] - http://users.cms.caltech.edu/~adamw/publications-openclosed.html

@antekresic (Contributor)

I did some research on this feature and took a look at how we could implement it in k6 using the current architecture.

I believe I have found an issue with this approach. Right now, the current model (looping VUs) pre-instantiates the maximum number of VUs needed for a test run, which requires knowing that maximum in advance. With arrival-rate VUs that number is not so easy to determine, since it depends on VU execution (whose behaviour is set by the user and can take very long). The number of active VUs depends on the time it takes for a VU to finish, which, in turn, depends on its behaviour and the responsiveness of the system under test.

We cannot instantiate VUs on demand, since that can mess with the real-time metrics and performance. By running some basic profiling on k6, I have found that a single VU instance takes up ~1.2 MB of memory and ~17 ms on average to instantiate (on my test computer). If we are talking about, let's say, 100 req/s and an optimistic 30 s for each VU to finish, that's 3000 concurrently active VUs, i.e. ~3.5 GB of memory just for the VUs!

I have also compared the impact of preallocating vs. dynamically allocating VUs at test runtime by making some minor tweaks to the k6 core. The results show that performance is severely impacted, depending on the number of instances that need to be created. Here are the test details: link

One solution to this would be to limit the time these arrival-rate VUs have to finish, which would reduce the total number of instances needed for the test, and to make that limit configurable so that users with a lot of memory can allow bigger timeouts.

@na-- (Member) commented Apr 17, 2018

Hmm, since arrival-rate based executors won't loop like the current ones, can't we just use sync.Pool or something like it, so we don't allocate new VUs every time but instead reuse existing ones when they are available?
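
A minimal sketch of that idea in Go, with a hypothetical VU type standing in for k6's real (expensive-to-create) one:

package main

import "sync"

// VU is a hypothetical stand-in for k6's real VU type, which is
// expensive to create (~1.2 MB and ~17 ms per the numbers above).
type VU struct{ /* goja runtime, script state, etc. */ }

func newVU() *VU { return &VU{} }

var vuPool = sync.Pool{
    New: func() interface{} { return newVU() },
}

// runIteration takes an idle VU from the pool (or creates one if none is
// available), runs a single arrival-rate iteration on it, and returns it
// to the pool for reuse by a later arrival.
func runIteration(script func(*VU)) {
    vu := vuPool.Get().(*VU)
    defer vuPool.Put(vu)
    script(vu)
}

func main() {
    for i := 0; i < 5; i++ {
        runIteration(func(vu *VU) { /* run the test script with vu */ })
    }
}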

@antekresic (Contributor) commented Apr 17, 2018

Not sure if my comment was unclear, but I did take into account that we would reuse VUs. The number that need to be instantiated at any single moment would still be large if they take a long time to execute.

100 requests per second * 30 seconds (assuming they need that long to finish) = 3000 VUs running at the same time.

@na-- (Member) commented Apr 17, 2018

Maybe we can expose this implementation detail (that VUs take time and CPU to instantiate) and add a parameter that allows users to specify how many VUs they want instantiated before the test starts? Some users may be fine with a quicker startup time and decreased performance; others may want to pre-instantiate some or all of the VUs for optimized execution. (See the sketch below.)
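
Concretely, that could be a knob along these lines (the option names here are hypothetical, purely to illustrate the tradeoff):

export let options = {
    duration: "5m",
    arrivalRate: 100,     // hypothetical: 100 new VU arrivals per second
    preAllocatedVUs: 500, // instantiated up front, before the test starts
    maxVUs: 3000,         // hard cap; anything above 500 is created mid-test
};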

Also, @robingustafsson, reading the original issue again, I'm not sure the proposed API spec is clear and flexible enough:

  • target means 2 different things depending on the type
  • it doesn't allow us to mix the two executors in a single stage, which is something that may be useful - "have X constantly looping VUs as a baseline and launch Y new ones per second"
  • I don't like the magic "same as previous step" configuration
  • it ties our hands a bit when we want to implement different ramp up and down algorithms

Something like this seems more intuitive, flexible and extensible:

export let options = {
    // Allow mixing looping VUs and arrival-rate VUs in the same test
    // ("looping" being the default type)
    stages: [
        // Gradually ramp up from 0 to 50 looping VUs over 5 minutes
        { duration: "5m", looping: 50, rampUp: "linear" }, 
        // Instantly start running 10 new arrivals every second
        { duration: "5m", looping: 50, arrivals: 10, rampUp: "instant" },
    // Double the looping VUs with something like this: https://github.com/d3/d3-ease#easePolyOut
        { duration: "5m", looping: 100, arrivals: 10, rampUp: "polyout" },
    ]
};

I don't like the option names, but I can't think of better ones at the moment. Not sure how to translate something like this to CLI flags either...

@luizbafilho (Contributor)

Just to give us something to compare with, here is how tsung exposes the arrival config: http://tsung.erlang-projects.org/user_manual/conf-load.html

@antekresic (Contributor)

I agree with @na--; having the ability to combine the two types of rate systems would be useful for simulating more realistic loads, and thus generating more accurate metrics about what the system under test can do. I didn't mention it only because of the technical complexity it brings to the table, and because it might be better to leave it out of scope for this initial take on adding arrival-rate test execution.

Then again, if we are thinking about doing it later, we should think about the API spec now, just to avoid changing it too much and too often.

@robingustafsson (Member, Author)

First, nice job @antekresic on the initial investigation of pre-allocated vs. dynamic allocation of VUs. The 1.2 MB per VU number sounds very high; I suppose most of that is goja stuff, but it's something we should investigate separately.

When it comes to VU allocation, I agree with @na-- that we should expose the config knobs to the user, so that they can make the right tradeoffs for their use case. IMO the default should be to pre-allocate up to a calculated max VUs, with a warning + confirmation in the terminal if that would mean claiming more than, say, 85% of the available physical RAM (given the rough estimate of 1.2 MB per VU).

Combining the two execution types might make sense, but I think the most common case will be an either-or situation, so I think we should focus on that in a first iteration. If you really want a mix, you could run two parallel k6 instances with different types as a workaround. But yes, I agree, we should think of an API that could support a mix from the start.

One scenario I reckon will be common is being able to say "ramp from arrival rate A to arrival rate B in X time units", i.e. being able to ramp up requests/second. That's what { target: "100", duration: "5m", type: "arrival-rate" } meant in my head: start at an arrival rate of 0 and ramp up to an arrival rate of 100 VUs/second over a 5 minute period.

Good point @luizbafilho also, that we can look at other tools to see if there's something we can take inspiration from when designing the API. I'll add Gatling to that list: https://gatling.io/docs/2.3/general/simulation_setup/

@antekresic (Contributor)

@robingustafsson yep, it's mostly the goja runtime that takes up the memory for the VUs.

Well, we already expose the setting for max VUs; maybe we could add the memory warning in a separate issue, since that can be done right now, independently of this feature.

What do you guys think about adding a separate warning when we go over max VUs? Just warning the user that performance is impacted because we had to instantiate VUs mid-test, and suggesting they increase the value (with much better wording, obviously).

I'll start messing around with the arrival-rate executor and make a WIP PR once I have something I can share with you guys.

@na-- (Member) commented Apr 26, 2018

Sounds good, and I like the idea that we just warn the user if they overshoot the pre-allocated amount of VUs.

@robingustafsson (Member, Author)

@antekresic Yup, sounds good to me to warn when going above the pre-allocated VUs.

@joseray commented Jan 17, 2020

@antekresic how did you measure this:

By running some basic profiling on k6, I have found that a single VU instance takes up ~1.2 MB of memory and ~17 ms on average to instantiate

I would like to be able to measure the performance of a test on one machine.

Thank you.

@antekresic (Contributor)

@joseray I just used the built-in Go runtime profiler: https://golang.org/pkg/runtime/pprof/
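
For a rough back-of-the-envelope version of that measurement, here is a sketch using the standard runtime package rather than the exact profiling setup above (VU and newVU are hypothetical stand-ins for k6's actual types):

package main

import (
    "fmt"
    "runtime"
    "time"
)

// VU is a hypothetical stand-in for k6's real VU type.
type VU struct{ /* goja runtime, script state, etc. */ }

func newVU() *VU { return &VU{} }

func main() {
    const n = 1000
    var before, after runtime.MemStats

    runtime.GC()
    runtime.ReadMemStats(&before)

    start := time.Now()
    vus := make([]*VU, n) // hold references so the instances aren't collected
    for i := range vus {
        vus[i] = newVU()
    }
    elapsed := time.Since(start)

    runtime.GC()
    runtime.ReadMemStats(&after)

    fmt.Printf("avg instantiation time: %v\n", elapsed/n)
    fmt.Printf("avg heap per VU: %.2f KB\n",
        float64(after.HeapAlloc-before.HeapAlloc)/float64(n)/1024)
    runtime.KeepAlive(vus)
}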

na-- modified the milestones: v1.0.0, v0.27.0 on May 21, 2020
na-- closed this as completed in #1007 on Jul 6, 2020