Add support to ship results to k6 cloud #9

Closed
dgzlopes opened this issue Feb 19, 2021 · 5 comments · Fixed by #86
Labels
enhancement New feature or request

Comments

@dgzlopes
Member

Here we're running a lot of k6 instances in parallel, but all of them are part of the same test.

I would like to ship the results from all of these instances as a single test run in k6 Cloud.

@dgzlopes dgzlopes changed the title Add support to ship results to the cloud Add support to ship results to k6 cloud Feb 19, 2021
@simskij
Contributor

simskij commented Feb 20, 2021

Discussed this briefly with @sniku at some point, and we still lack part of what would be needed to make this happen, or at least to make it count as one test run.

We can already ship results to the cloud using the good ol' -o cloud, but what we'd really need is a public API endpoint for shoveling in data with a custom job id, or, which I'd much prefer, a way to let the cloud know the aggregate root and boundary (borrowing terms from event sourcing here that I think make sense in this context).

I don't know if that was ever added to any roadmap, however. Let's talk a bit more internally and see if we'd be able to make this happen. 👍🏼

@yorugac
Collaborator

yorugac commented Nov 29, 2021

This issue can be seen as a two-layered problem. One layer is what must be done at the level of k6 OSS and/or k6 Cloud to support this feature. The other layer is how best to adapt the k6-operator for the new feature within a Kubernetes cluster. Here I'll briefly describe the considerations made for the latter.

In order to have cloud output, we must acquire a new test run ID from k6 Cloud prior to starting any test jobs. Firstly, this implies adding two new stages to the controller, currently called "initialization" and "initialized". The "initialization" stage indicates that the operator is in the middle of performing a check for cloud output and may be retrieving info from k6 Cloud. The "initialized" stage indicates that the checks are finished and it's fine to proceed with the creation of runners for the test.
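To make the two stages concrete, here is a minimal Go sketch, assuming the stage is tracked as a plain string on the K6 resource status; the constant names and the helper are hypothetical and may differ from the final implementation.

```go
package controllers

// Hypothetical stage values for the controller; the real CRD may track this
// differently (these names are assumptions, not the final API).
const (
	StageInitialization = "initialization" // cloud-output check in progress, possibly talking to k6 Cloud
	StageInitialized    = "initialized"    // checks finished, safe to create runner jobs
)

// readyForRunners reports whether the controller may proceed with creating
// the runner jobs for the test.
func readyForRunners(stage string) bool {
	return stage == StageInitialized
}
```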

Retrieval of information from k6 Cloud depends on an additional invocation of the k6 command itself. (To be fully precise, there is another way to do this: importing internal k6 libs into the k6-operator. But that would have resulted in much more complicated operator code and in code duplication that would have needed additional maintenance, which altogether smells of bad design. So that approach was set aside in favor of invoking k6.)
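As a rough illustration of what "invoking k6" could look like from Go, here is a sketch that shells out to the k6 binary via os/exec and captures its output; the choice of k6 inspect as the subcommand is an assumption for illustration, and the exact command and flags the operator ends up using may differ.

```go
package initialize

import (
	"bytes"
	"fmt"
	"os/exec"
)

// inspectScript shells out to the k6 binary instead of importing internal
// k6 packages, and returns whatever k6 prints on stdout for further parsing.
// The subcommand here is illustrative only.
func inspectScript(scriptPath string) ([]byte, error) {
	var stdout, stderr bytes.Buffer
	cmd := exec.Command("k6", "inspect", scriptPath)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		return nil, fmt.Errorf("k6 inspect failed: %w: %s", err, stderr.String())
	}
	return stdout.Bytes(), nil
}
```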

In other words, there must be a container with the k6 binary that is called once during the "initialization" stage. The straightforward solution is to start a separate pod with k6. Since runners in the k6-operator already use the loadimpact/k6 image, it could be the same Dockerfile as for the runner. The problem here is how best to extract output from that container. What I've looked into:

  1. Start a separate pod and create a persistent volume for the output. Cons: we cannot make assumptions about the cluster, and someone needs to ensure that the volume exists. Given that cloud output is an optional feature, making the creation of an additional volume mandatory on each run is not a good solution.
  2. Start a separate pod and use a configmap to store the output. But sadly, configmaps are not meant to be writable from inside the pod (related k8s issue).
  3. Store the data in a volume shared between containers in the same pod. Options considered here:
    a) We need the data in the k6-operator itself, so it makes sense to start an additional container alongside the manager. By default, the manager's pod already runs with two containers (at the moment of writing). Since a pod spec is as good as immutable by design, a container cannot be added at runtime, only at deployment. But since we're already relying on kustomize, that should be relatively safe to do as an optional configuration. However, this is still a bit harder to do, as the loadimpact/k6 image is not meant to be used as a daemon, while here k6 must be called only when the k6-operator requires it. This can be circumvented, but it will result either in an additional image or a rather unwieldy hack. Another drawback is that, for this particular problem, the current design of the k6 CLI requires preprocessing of arguments before invocation (there are differences between the options allowed for k6 inspect, k6 archive and k6 run calls). This can also be circumvented, but with another hack in the setup.
    b) Similarly, an init container that writes to the shared volume can be added to the manager. But it has the same drawbacks as the previous option, and in addition, an init container starts before the main one, so it will not know where the configmap with the test is located and will very likely run before that configmap is even created.
    c) Another interesting option might be an ephemeral container, but that feature is still in a very alpha stage, so it's probably best to look into it separately.
  4. Start a separate container and make the operator read its logs directly. Logically, this should have been the second point in these considerations, but the funny thing is that currently the controller-runtime client for k8s does not support that: it is even officially recommended to create a separate REST client to retrieve logs in such cases (see the sketch further below). Still, initial support for cloud output will be done with this solution, at least until a k6 CLI change, which might happen in one of the future releases.
  5. Change the operator's image to contain the k6 command inside. Image sizes for comparison (at the time of writing):
ghcr.io/grafana/k6            latest        a4fb893ad7d1   8 days ago     31.8MB
ghcr.io/grafana/operator      latest        ee25d88318bc   5 weeks ago    47.2MB

Such a switch seems almost feasible at first glance, but the operator's code would need to call k6 via os/exec, and this change of image seems like something that should go through additional evaluation.
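For option 4, the sketch below shows what reading the initializer pod's logs through a separate client-go clientset could look like, since the controller-runtime client does not expose pod logs; the namespace and pod name are placeholders, and the wiring into the operator would of course differ.

```go
package initialize

import (
	"context"
	"io"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// podLogs fetches the full logs of a pod via a plain client-go clientset,
// since the controller-runtime client has no support for the log subresource.
func podLogs(ctx context.Context, cfg *rest.Config, namespace, podName string) ([]byte, error) {
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return nil, err
	}
	req := clientset.CoreV1().Pods(namespace).GetLogs(podName, &corev1.PodLogOptions{})
	stream, err := req.Stream(ctx)
	if err != nil {
		return nil, err
	}
	defer stream.Close()
	return io.ReadAll(stream)
}
```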

@knechtionscoding
Contributor

Does it make sense to make it part of the operator container/controller? Wouldn't it make sense to have that be another container controlled by the operator, similar to the starter, that runs prior to the test actually kicking off?

Then you circumvent the order-of-operations problems, you avoid issues around permissions (the k6 operator may not have access to read configmaps in the namespaces it is creating runners/starters in), and you spread out the load.

@yorugac
Collaborator

yorugac commented Nov 29, 2021

@knechtionscoding thanks for adding your input!
The point about permissions in the "test namespace" is a very good one 👍 Well, as mentioned, I'm currently going with the 4th option: using an "initializing container", which would be similar to the starter container but run as the first stage.

Please share if you think there are additional concerns that should be addressed 🙏
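To illustrate the "initializing container" idea, here is a rough sketch of building a one-shot Job with the k6 image, similar in spirit to the existing starter job; the image, command, and field values are placeholders rather than the operator's actual resource spec.

```go
package jobs

import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// NewInitializerJob builds a one-shot Job that runs a single k6 command in
// the test namespace, so the operator itself never needs the k6 binary.
// All values below are illustrative placeholders.
func NewInitializerJob(name, namespace, scriptPath string) *batchv1.Job {
	return &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{
			Name:      name + "-initializer",
			Namespace: namespace,
		},
		Spec: batchv1.JobSpec{
			Template: corev1.PodTemplateSpec{
				Spec: corev1.PodSpec{
					RestartPolicy: corev1.RestartPolicyNever,
					Containers: []corev1.Container{{
						Name:    "k6",
						Image:   "loadimpact/k6:latest", // placeholder; same image the runners use
						Command: []string{"k6", "inspect", scriptPath},
					}},
				},
			},
		},
	}
}
```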
