Extracting Kubernetes runtime out of Vector code base #2963

MOZGIII · 2020-07-06T19:27:10Z

A lot of code to interact with Kubernetes API was implemented as part of #2653. We'd like to extract this code into a shared crate of some kind.

Motivation

We want to contribute to the community and let other projects use this work.
We'd like to get rid of the burden of maintaining all this code as part of the Vector code base.
We'd like part of the community that's better involved with k8s to do an audit and review of this code.

Context

There were a few reasons why we implemented the code ourselves in the first place, and didn't use some existing crate:

Actually, we are using k8s-openapi - it provides some machine-generated rust types and serialization/deserialization based on the OpenAPI spec for the Kubernetes API.
As part of the plan at chore: Kubernetes Integration RFC #2222, we wanted to build upon the existing code.
The new code is very heavily inspired by the WatchClient (it too doesn't rely on anything higher level than k8s-openapi, and uses evmap). WatchClient had very good design decisions, and I attempted to use higher-level crates, but quickly got back to the same core design for various reasons - more on this later.
Existing crates at the time were (and still are) very limiting. One of the requirements to the client crate was the ability to use our http clients facilities. None of the crates had this flexibility.
Additional layers of crates were an unwanted growth in complexity and a lack of flexibility.
Halfway through the implementation, it became evident that this factor is very important, as I had to debug and fix issues at the bottom of the library stack. It would've been way more difficult to do if there were additional layers.
It wasn't clear what architecture to use upfront, and I had to try and swap a few before I ended up with what we got. Using external crates limited the design space significantly, and not in our favor.
The most promising crate kube - the only one actively maintained. It's problematics though - lot of things are hard-coded to a particular implementation, rather than being generic around a trait. It also provides a lot of very basic custom functionality that's manually implemented, rather than being based on k8s-openapi (which it also depends on). The test coverage is also lacking. All those factors repelled me from relying on that crate so far.
The implementation we ended up with is well tested, modular and usable on it's own, with the only dependency being k8s-openapi. Technically, we can just extract into it's own crate, and invite users to depend on directly. We can implement adapters for compatibility with kube crate too.
Another option is we can work with people maintaining kube crate to improve the modularity of their crate, probably port our code there, and then we could switch to using kube crate. Although it's a lot more work than just extracting the code to a new crate, we'll get the community more involved in the process, and, eventually, transition the maintenance to the community completely.
We probably can start with extracting the crate (7), and then talking about sending patches to kube crate (8) - that way we'll start having the benefits of a cleaner code separation earlier - and there'd earlier be fewer things to think about for the Vector team.

Plan

Extract the implementation into a subcrate at lib/.
Alter the API, prepare for adoption into kube. Discuss things with them.
Prepare patches to kube.
Merge and support kube code changes.
Update our implementation with the upstreamed kube variant.

I'm very looking forward to sharing what I implemented for Vector with the community because there's nothing quite like it available out there yet.

The text was updated successfully, but these errors were encountered:

MOZGIII · 2020-07-08T17:25:43Z

So, how about I start with extracting the parts into a crate? Then we can see if one of the crates our there accepts our code.

@lukesteensen, @Hoverbear, what do you think?

Hoverbear · 2020-07-08T17:33:49Z

I'd definitely love to see a k8s-openapi crate we publish and work with folks to maintain, over one we keep locked inside Vector. :)

lukesteensen · 2020-07-10T19:46:25Z

Thanks for writing this up @MOZGIII!

Motivation

The big motivation is less about code review and auditing, and more about having the code used as widely as possible in a number of different contexts. This will give us much more confidence in its correct behavior than just reading it carefully and trying to find bugs ourselves.

One of the requirements to the client crate was the ability to use our http clients facilities. None of the crates had this flexibility.

This is one of the primary reasons for #2635. We do not want to have to use custom HTTP clients for everything, because we know they're not terribly common in libraries like this. Using one of the standard HTTP client crates should be just fine for our dependencies.

lot of things are hard-coded to a particular implementation, rather than being generic around a trait. It also provides a lot of very basic custom functionality that's manually implemented, rather than being based on k8s-openapi (which it also depends on). The test coverage is also lacking.

These are all reasonable concerns, but I do want to be careful about differentiating between how we think the code "should" be written and factors that actually affect the user-facing behavior. If the crate functions reliably and offers the APIs we need, that should be enough (especially for an initial implementation).

We probably can start with extracting the crate

I think this is a great first step! The big benefit here will be to draw a clear line between the k8s implementation and the API that Vector depends on. Once that API is clear, we can use it to have more concrete conversations about changes we'd like to see in community crates.

As a second step, I'd suggest that we write up a short, concise proposal to share the the kube maintainers to gauge their in accepting the various changes. Again, I want to be careful here to differentiate between matters of behavior and reliabiltiy, and matters of opinion on code structure. The goal should be to contribute a minimal set of changes to kube such that it fulfills Vector's needs.

If the kube maintainers seem willing, we can then move forward with those PRs and adjust our internal dependencies as progress is made.

MOZGIII · 2020-07-10T19:55:42Z

This is one of the primary reasons for #2635. We do not want to have to use custom HTTP clients for everything, because we know they're not terribly common in libraries like this. Using one of the standard HTTP client crates should be just fine for our dependencies.

Unfortunately, this issue is fairly recent, compared to when the implementation was written. It's going to make the transition easier.

If the crate functions reliably and offers the APIs we need, that should be enough (especially for an initial implementation).

Absolutely! Unfortunately, Arnavion/k8s-openapi#70 was only found by us, despite kube-rs using the same k8s-openapi dependency for ages. This kind of speaks for itself. ☹️ I'm not happy about it, but it's just how it is.

I think this is a great first step! The big benefit here will be to draw a clear line between the k8s implementation and the API that Vector depends on. Once that API is clear, we can use it to have more concrete conversations about changes we'd like to see in community crates.

As a second step, I'd suggest that we write up a short, concise proposal to share the the kube maintainers to gauge their in accepting the various changes. Again, I want to be careful here to differentiate between matters of behavior and reliabiltiy, and matters of opinion on code structure. The goal should be to contribute a minimal set of changes to kube such that it fulfills Vector's needs.

If the kube maintainers seem willing, we can then move forward with those PRs and adjust our internal dependencies as progress is made.

Yep, sounds good!

MOZGIII · 2020-07-10T20:26:04Z

I'll begin the process of extracting the implementation into a crate then.

lukesteensen · 2020-07-10T21:05:09Z

Sounds good! Pulling out into lib/ should be an easy way to get started.

MOZGIII · 2020-07-13T21:17:25Z

I support I'd have to unwrap the emit! macros for the code that's ported into a crate?

lukesteensen · 2020-07-14T20:43:59Z

I don't think we need to rely on emitting events internal to the k8s client. We can instrument in Vector itself around use of the API.

MOZGIII · 2020-07-14T21:14:34Z

A lot of the discussion at the original PR was on switching from logs to emits. Some pieces are elegantly aligned to allow instrumenting around, but others - more "client-internal" things - aren't that easy to add instrumentation. For those, I think the best course of action is to switch them back to events. I'll be easier to discuss this in better detail when I submit the PR.

lukesteensen · 2020-07-14T21:18:34Z

Yes, it's likely some bits will need to change now that they're in a different context than when they were originally reviewed (internal to Vector vs part of an external crate).

We should align the new crate's API with that of kube (since the eventual goal is to use it), and anything behind that API should do normal library-style logging.

MOZGIII · 2020-07-19T08:41:55Z

I updated the plan in the OP.

jszwedko · 2022-12-30T00:27:18Z

Closing since we ended up using kube-rs instead.

binarylogic added the type: tech debt A code change that does not add user value. label Jul 7, 2020

binarylogic added domain: deps Anything related to Vector's dependencies domain: platforms Anything related to Vector's supported platforms labels Jul 10, 2020

binarylogic added this to the 2020.07.19 - The Great Orientation milestone Jul 20, 2020

binarylogic modified the milestones: 2020.07.20 - The Great Orientation, 2020.08.03 - Observability Bonanza Aug 2, 2020

binarylogic assigned MOZGIII Aug 6, 2020

binarylogic removed this from the 2020.08.03 - Observability Bonanza milestone Aug 17, 2020

jamtur01 unassigned MOZGIII Sep 11, 2020

binarylogic added the platform: kubernetes Anything `kubernetes` platform related label Jan 6, 2021

spencergilbert mentioned this issue Dec 30, 2021

chore: kubernetes_logs rewrite RFC #10301

Merged

jszwedko closed this as completed Dec 30, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extracting Kubernetes runtime out of Vector code base #2963

Extracting Kubernetes runtime out of Vector code base #2963

MOZGIII commented Jul 6, 2020 •

edited

Loading

MOZGIII commented Jul 8, 2020

Hoverbear commented Jul 8, 2020

lukesteensen commented Jul 10, 2020

MOZGIII commented Jul 10, 2020

MOZGIII commented Jul 10, 2020 •

edited

Loading

lukesteensen commented Jul 10, 2020

MOZGIII commented Jul 13, 2020 •

edited

Loading

lukesteensen commented Jul 14, 2020

MOZGIII commented Jul 14, 2020

lukesteensen commented Jul 14, 2020

MOZGIII commented Jul 19, 2020

jszwedko commented Dec 30, 2022

Extracting Kubernetes runtime out of Vector code base #2963

Extracting Kubernetes runtime out of Vector code base #2963

Comments

MOZGIII commented Jul 6, 2020 • edited Loading

Motivation

Context

Plan

MOZGIII commented Jul 8, 2020

Hoverbear commented Jul 8, 2020

lukesteensen commented Jul 10, 2020

MOZGIII commented Jul 10, 2020

MOZGIII commented Jul 10, 2020 • edited Loading

lukesteensen commented Jul 10, 2020

MOZGIII commented Jul 13, 2020 • edited Loading

lukesteensen commented Jul 14, 2020

MOZGIII commented Jul 14, 2020

lukesteensen commented Jul 14, 2020

MOZGIII commented Jul 19, 2020

jszwedko commented Dec 30, 2022

MOZGIII commented Jul 6, 2020 •

edited

Loading

MOZGIII commented Jul 10, 2020 •

edited

Loading

MOZGIII commented Jul 13, 2020 •

edited

Loading