[YUNIKORN-2416] Cleanup replace directives #794

Closed
wants to merge 3 commits into from

chenyulin0719
Contributor

@chenyulin0719 chenyulin0719 commented Feb 22, 2024

What is this PR for?

  • Clean up the unnecessary golang.org/x/lint module in the replace directives.
  • Upgrade to the latest yunikorn-core/yunikorn-scheduler-interface.
  • Replace github.com/opencontainers/runc v1.1.10 with v1.1.12 to fix CVE-2024-21626.

Note: Two vulnerable dependencies already existed before this PR, and they can't be fixed with replace directives.

  • go:gopkg.in/square/go-jose.v2:v2.6.0 is vulnerable (Cxb6dee8d5-b814)
    (From k8s.io/kubernetes@v1.29.2) (The package's repository is a public archive.)
  • go:k8s.io/kubernetes:v1.29.2 is vulnerable (CVE-2020-8562)

These dependencies require a fix from K8s.

What type of PR is it?

  • - Bug Fix
  • - Improvement
  • - Feature
  • - Documentation
  • - Hot Fix
  • - Refactoring

Todos

NA

What is the Jira issue?

https://issues.apache.org/jira/browse/YUNIKORN-2416

How should this be tested?

  1. Run CVE check (or use your IDE's vulnerability analysis tool).
  2. Run the E2E tests with the new dependencies.

E2E passed on my side:
https://github.com/chenyulin0719/yunikorn-k8shim/actions/runs/8004883059

Screenshots (if appropriate)

There are 2 existing CVEs that can't be fixed by replace directives.
image

CVE check pass with govulncheck:
image

Questions:

NA

@chia7712
Member

Have you run go mod tidy to update the indirect entries?

codecov bot commented Feb 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.55%. Comparing base (766afd0) to head (443a9c1).
Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #794   +/-   ##
=======================================
  Coverage   71.55%   71.55%           
=======================================
  Files          43       43           
  Lines        6332     6332           
=======================================
  Hits         4531     4531           
  Misses       1599     1599           
  Partials      202      202           


Contributor

@wilfred-s wilfred-s left a comment

-1 for this approach.
We must not touch the indirect require entries. Those are maintained by the code. We leave them as is and use a replace to override with what we want/need.
When updating the mod file, and the indirect dependency is updated, we need to check if the replace value can be cleaned up or not.
The last couple of changes made to go.mod stepped away from this but we need to move back to that as we have no guarantee that the indirect references are stable and we could regress.

Contributor

@craigcondit craigcondit left a comment

I'm also -1 on this. As I've previously indicated, I'd rather "over-specify" our replaces and ensure that we don't trigger regressions by omission.

@chia7712
Member

@wilfred-s @craigcondit I have a question which is unrelated to this PR for "indirect".

It seems to me that the whole "indirect" block should be generated by go mod tidy, but the "indirect" block in k8shim is not. Am I misunderstanding anything?

@craigcondit
Contributor

craigcondit commented Feb 22, 2024

It seems to me that the whole "indirect" block should be generated by go mod tidy, but the "indirect" block in k8shim is not. Am I misunderstanding anything?

What makes you think it doesn't work that way in k8shim?

@chia7712
Member

What makes you think it doesn't work that way in k8shim?

I removed all the indirect entries and then ran "go mod tidy". Some indirect deps got changed.

@craigcondit
Contributor

What makes you think it doesn't work that way in k8shim?

I removed all the indirect entries and then ran "go mod tidy". Some indirect deps got changed.

That's expected - there were probably some indirect references that got updated via core changes. That's one of the reasons we try to be explicit in the replace stanza so that we don't accidentally regress due to a seemingly unrelated change.

@chia7712
Member

That's expected - there were probably some indirect references that got updated via core changes. That's one of the reasons we try to be explicit in the replace stanza so that we don't accidentally regress due to a seemingly unrelated change.

Just curious: how do we maintain/update the versions of indirect dependencies for k8shim if we don't rely on go mod tidy (to auto-generate the indirect dependencies)?

@craigcondit
Contributor

Just curious: how do we maintain/update the versions of indirect dependencies for k8shim if we don't rely on go mod tidy (to auto-generate the indirect dependencies)?

We absolutely do rely on go mod tidy. However, if we don't have a replace, go mod tidy is likely to revert the indirect dependencies to versions that are older than we'd like. For common dependencies that get frequent updates for CVEs, it's far simpler to just rev the replace stanza and let go mod tidy run without fear of regressing.

@chia7712
Member

We absolutely do rely on go mod tidy. However, if we don't have a replace, go mod tidy is likely to revert the indirect dependencies to versions that are older than we'd like. For common dependencies that get frequent updates for CVEs, it's far simpler to just rev the replace stanza and let go mod tidy run without fear of regressing.

@craigcondit thanks for the nice explanation. I have one more question. The versions of some indirect dependencies get older after I re-create the indirect stanza. For example, the version of github.com/imdario/mergo changes from v0.3.7 to v0.3.6. It seems to me that v0.3.6 is the version resolved by Go and we should use it unless there is a known CVE or regression, right?

@craigcondit
Contributor

The versions of some indirect dependencies get older after I re-create the indirect stanza. For example, the version of github.com/imdario/mergo changes from v0.3.7 to v0.3.6. It seems to me that v0.3.6 is the version resolved by Go and we should use it unless there is a known CVE or regression, right?

No, we should keep the newer version unless there is something that breaks. It's not just CVE updates to consider.

@chia7712
Member

No, we should keep the newer version unless there is something that breaks. It's not just CVE updates to consider.

I agree that it's not just CVE updates to consider. However, how do we choose a suitable newer version? For instance, why use v0.3.7 rather than v0.3.10? Sorry for raising a bunch of questions; I'm trying to learn version management through this case.

@craigcondit
Contributor

I agree that it's not just CVE updates to consider. However, how do we choose a suitable newer version? For instance, why use v0.3.7 rather than v0.3.10? Sorry for raising a bunch of questions; I'm trying to learn version management through this case.

We generally update as needed, either for compatibility or CVE issues. Once we've moved forward though, we'd like to avoid moving back again.

@chenyulin0719
Contributor Author

chenyulin0719 commented Feb 22, 2024

Based on the above discussion, may I ask two questions to confirm the best practice for maintaining replace directives?

Q1: When should we upgrade a version in the replace directives?
A1:

  • When a new CVE is detected.
  • When there is a compatibility issue.

Q2: When should we remove replace directives?
A2:

  • When every indirect module version is higher than the replace directive's version. (No rollback to an older version.)

@craigcondit
Contributor

craigcondit commented Feb 22, 2024

Q1: When should we upgrade a version in the replace directives? A1:

  • When a new CVE is detected.
  • When there is a compatibility issue.

CVEs: Definitely, and this is the most common reason to update.

Compatibility depends on if there are issues encountered (sometimes modules of one version won't compile / run properly with dependencies of different versions). This is rare, but does happen. Case-by-case basis. Usually, semantic versioning applies and it's safe to update to x.y.[latest]. However, most of our indirect dependencies come from Kubernetes itself, and we'd like to be as compatible with upstream as possible (even bug for bug if need be).

Q2: When should we remove replace directives? A2:

  • When every indirect module version is higher than the replace directive's version. (No rollback to an older version.)

IMO, we should not remove the replacements. It just becomes a maintenance nightmare. Any indirect dependency update anywhere in the chain may pull in a later (or earlier) release; it's a lot of work to identify these and simpler to just update to the latest versions. Since we stay on top of CVEs, we're almost always going to be higher than the versions requested by indirect dependencies, so it's mostly a moot issue and therefore not worth expending dev effort on. It's harmless to have a few extra replaces here.

Compiling against newer K8s releases is where we end up doing most of the replacements. I've typically done this by going through the versions pulled in by the new K8s release and ensuring we replace to >= those versions if we're not already.
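
The ">= those versions" check described above can be sketched roughly as follows. This is only an illustration, not the project's tooling: the module paths, the versions, and the helper names `less` and `stalePins` are hypothetical, and only plain vX.Y.Z tags are compared.

```go
package main

import (
	"fmt"
	"sort"
)

// less reports whether version a sorts strictly before version b.
// Assumption: both are plain vMAJOR.MINOR.PATCH tags; anything after the
// patch number (pre-release tags, pseudo-version suffixes) is ignored.
func less(a, b string) bool {
	var x, y [3]int
	fmt.Sscanf(a, "v%d.%d.%d", &x[0], &x[1], &x[2])
	fmt.Sscanf(b, "v%d.%d.%d", &y[0], &y[1], &y[2])
	for i := range x {
		if x[i] != y[i] {
			return x[i] < y[i]
		}
	}
	return false
}

// stalePins returns the modules whose replace pin is older than the version
// requested upstream, sorted by module path.
func stalePins(pins, upstream map[string]string) []string {
	var stale []string
	for mod, want := range upstream {
		pin, ok := pins[mod]
		if !ok {
			continue // no replace for this module, nothing to compare
		}
		if less(pin, want) {
			stale = append(stale, mod)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	// Hypothetical replace pins currently in go.mod.
	pins := map[string]string{
		"example.com/hypothetical/a": "v1.1.12",
		"example.com/hypothetical/b": "v0.4.0",
	}
	// Hypothetical versions pulled in by a newer upstream release.
	upstream := map[string]string{
		"example.com/hypothetical/a": "v1.1.10",
		"example.com/hypothetical/b": "v0.5.1",
	}
	// Only module b needs its replace bumped; a is already ahead.
	fmt.Println(stalePins(pins, upstream))
}
```

Pins that are already ahead of upstream are left alone, which matches the "harmless to have a few extra replaces" position.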

@chenyulin0719
Contributor Author

Thanks for the prompt reply.

The versions of some indirect dependencies get older after I re-create the indirect stanza. For example, the version of github.com/imdario/mergo changes from v0.3.7 to v0.3.6. It seems to me that v0.3.6 is the version resolved by Go and we should use it unless there is a known CVE or regression, right?

I just did an investigation to understand why we manually set github.com/imdario/mergo to v0.3.7 in the indirect require entry instead of using the version Go resolved.

The indirect require entry github.com/imdario/mergo v0.3.7 // indirect was added in 2203ad8 and came from spark-on-k8s-operator.

image

But after we removed spark-on-k8s-operator, we didn't regenerate the indirect require entries.
go mod tidy just keeps the newer of the versions from before and after.
This is the root cause of the inconsistency.

image

In this case, I think the indirect require can be changed to v0.3.6.
If for any reason we must keep v0.3.7, we should put it in a replace directive with a comment.

There are some other inconsistencies. The review might be exhausting, but it should be a one-time job.
After that, we can ensure the indirect requires equal what Go resolved.
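
The behaviour described above is consistent with Go's minimal version selection, which picks the highest of the minimum versions requested anywhere in the build graph, including the main module's own go.mod. A toy model, assuming plain vX.Y.Z tags (the `requirement` type, the helper names, and the unnamed "other dependency" are hypothetical; only the mergo versions and the spark-on-k8s-operator origin come from this thread):

```go
package main

import "fmt"

// requirement records that one module asks for at least a given version.
type requirement struct {
	from    string // who declares the requirement
	version string // minimum version requested, e.g. "v0.3.6"
}

// less reports whether a sorts before b; only plain vX.Y.Z tags are handled.
func less(a, b string) bool {
	var x, y [3]int
	fmt.Sscanf(a, "v%d.%d.%d", &x[0], &x[1], &x[2])
	fmt.Sscanf(b, "v%d.%d.%d", &y[0], &y[1], &y[2])
	for i := range x {
		if x[i] != y[i] {
			return x[i] < y[i]
		}
	}
	return false
}

// selected returns the highest requested minimum, which is what a toy
// version of minimal version selection would choose for one module.
func selected(reqs []requirement) string {
	best := ""
	for _, r := range reqs {
		if best == "" || less(best, r.version) {
			best = r.version
		}
	}
	return best
}

func main() {
	other := requirement{"some other dependency (hypothetical)", "v0.3.6"}

	// While spark-on-k8s-operator was a dependency, it raised the floor.
	withSpark := []requirement{{"spark-on-k8s-operator", "v0.3.7"}, other}

	// After its removal, the existing go.mod line still acts as a requirement.
	entryKept := []requirement{{"go.mod // indirect entry", "v0.3.7"}, other}

	// Deleting the indirect entry first leaves only the remaining requirers.
	entryDeleted := []requirement{other}

	fmt.Println(selected(withSpark), selected(entryKept), selected(entryDeleted))
}
```

Under these assumptions the three cases resolve to v0.3.7, v0.3.7 and v0.3.6, matching the versions reported in this thread.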

@craigcondit
Contributor

I just did an investigation to understand why we manually set github.com/imdario/mergo to v0.3.7 in the indirect require entry instead of using the version Go resolved.

So you're recommending that we downgrade to v0.3.6 of mergo when we've successfully tested against a newer version? There's likely to be some fixes in v0.3.7 that we have gotten the benefit of, even if everything still compiles and runs.

The point is, this requires a lot of effort, that effort is not a one-time cost (any time dependencies are added or removed this may need to be done), and it's not obvious that it provides any net benefit. Let's just leave well enough alone.

@chenyulin0719
Contributor Author

So you're recommending that we downgrade to v0.3.6 of mergo when we've successfully tested against a newer version? There's likely to be some fixes in v0.3.7 that we have gotten the benefit of, even if everything still compiles and runs.

Yes, that's what I suggested. Every time we make a change to go.mod, we should:

  1. Manually remove the indirect require entries
  2. Run go mod tidy
  3. Run tests
  4. If anything goes wrong, put the fixed version in a replace entry and add a comment

There is a tradeoff to be made here.

Pros:

  • We know which module versions come from the parent modules' requires.
  • We document which compatibility/security issues have occurred in this project.

Cons:

  • It requires a lot of effort.
  • We might lose the benefits of newer versions.

@craigcondit
Contributor

  1. Manually remove the indirect require entries
  2. Run go mod tidy
  3. Run tests
  4. If anything goes wrong, put the fixed version in a replace entry and add a comment

Sorry, I have to respectfully disagree here. This is not only time consuming, but error prone, because every time this is done we need to audit for CVEs that might have been inadvertently reintroduced. This is not a hypothetical issue - it has happened on several occasions.

Pros:
We know which module versions come from the parent modules' requires.

Why is this important? We override these to pull newer versions, and if those newer versions cause no issues, downgrading is pointless.

We document which compatibility/security issues have occurred in this project.

This isn't something we do, nor is there a central place to track this within the project.

The very real cons of this approach significantly dwarf any of the perceived benefits.
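
The audit burden mentioned above (spotting versions that moved backwards after go.mod is regenerated) could in principle be partly mechanised. A rough sketch, assuming two go.mod snapshots are available as text; `requires` and `downgrades` are hypothetical helpers, the parser is deliberately naive (no exclude blocks), and pseudo-versions that differ only in their suffix compare as equal:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// less reports whether a sorts before b; only plain vX.Y.Z tags are handled.
func less(a, b string) bool {
	var x, y [3]int
	fmt.Sscanf(a, "v%d.%d.%d", &x[0], &x[1], &x[2])
	fmt.Sscanf(b, "v%d.%d.%d", &y[0], &y[1], &y[2])
	for i := range x {
		if x[i] != y[i] {
			return x[i] < y[i]
		}
	}
	return false
}

// requires collects "path version" pairs from require lines and skips
// anything containing "=>" (replace directives).
func requires(gomod string) map[string]string {
	out := map[string]string{}
	for _, line := range strings.Split(gomod, "\n") {
		f := strings.Fields(line)
		if len(f) > 0 && f[0] == "require" {
			f = f[1:] // single-line form: require path version
		}
		if len(f) >= 2 && strings.HasPrefix(f[1], "v") && !strings.Contains(line, "=>") {
			out[f[0]] = f[1]
		}
	}
	return out
}

// downgrades lists modules whose version in after is older than in before.
func downgrades(before, after string) []string {
	prev, cur := requires(before), requires(after)
	var out []string
	for mod, v := range cur {
		if old, ok := prev[mod]; ok && less(v, old) {
			out = append(out, fmt.Sprintf("%s %s -> %s", mod, old, v))
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	before := `require (
	github.com/imdario/mergo v0.3.7 // indirect
)`
	after := `require (
	github.com/imdario/mergo v0.3.6 // indirect
)`
	for _, d := range downgrades(before, after) {
		fmt.Println(d)
	}
}
```

A check like this only flags version regressions; it says nothing about whether a flagged version carries a known CVE, so it would complement a vulnerability scan rather than replace one.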

@chenyulin0719
Contributor Author

chenyulin0719 commented Feb 22, 2024

Understood, I can't find any other reason to support it.
Thanks for sharing your viewpoint. :)

For this PR, I'm not sure what should be fixed. I didn't manually change the indirect require entries. (They were changed by go mod tidy with the newer core version.)
Should I move golang.org/x/lint back?

For core and scheduler-interface, I've removed the golang.org/x modules from the replace entries.

Should I add a commit to move them back?

@craigcondit
Contributor

For this PR, I'm not sure what should be fixed. I didn't manually change the indirect require entries. (They were changed by go mod tidy with the newer core version.) Should I move golang.org/x/lint back?

Put the x/lint replace back; also remove the comment from the other added replace.

For core and scheduler-interface, I've removed the golang.org/x modules from the replace entries.

Should I add a commit to move them back?

Yes, please open JIRAs and PRs to replace the overrides that were already in place.

@craigcondit
Contributor

craigcondit commented Feb 22, 2024

@chenyulin0719 - The core / shim PRs have been merged. In addition to the other changes requested, can you please update the scheduler-interface and core references to the latest? These are:

core: v0.0.0-20240222210045-b926dce1f914
si: v0.0.0-20240222205935-94c25b6d2579

Contributor

@craigcondit craigcondit left a comment

+1 pending pre-commit checks.

Member

@chia7712 chia7712 left a comment

LGTM

Contributor

@craigcondit craigcondit left a comment

+1 pending e2e tests.

Member

@chia7712 chia7712 left a comment

LGTM
