-
Notifications
You must be signed in to change notification settings - Fork 136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[YUNIKORN-2416] Cleanup replace directives #794
Conversation
Had you run |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #794 +/- ##
=======================================
Coverage 71.55% 71.55%
=======================================
Files 43 43
Lines 6332 6332
=======================================
Hits 4531 4531
Misses 1599 1599
Partials 202 202 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-1 for this approach.
We must not touch the indirect require entries. Those are maintained by the code. We leave them as is and use a replace to override with what we want/need.
When updating the mod file, and the indirect dependency is updated, we need to check if the replace value can be cleaned up or not.
The last couple of changes made to go.mod stepped away from this but we need to move back to that as we have no guarantee that the indirect references are stable and we could regress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also -1 on this. As I've previously indicated, I'd rather "over-specify" our replaces and ensure that we don't trigger regressions by omission.
@wilfred-s @craigcondit I have a question which is unrelated to this PR for "indirect". It seems to me whole "indirect" should be generated by |
What makes you think it doesn't work that way in k8shim? |
I remove all indicates and then run "go mod tidy". Some indirect deps get changed. |
That's expected - there were probably some indirect references that got updated via core changes. That's one of the reasons we try to be explicit in the replace stanza so that we don't accidentally regress due to a seemingly unrelated change. |
just curious. How to maintain/update the version of indirect dependency for k8shim if we don't rely on the |
We absolutely do rely on |
@craigcondit thanks for nice explanation, and I have one more question. The version of some |
No, we should keep the newer version unless there is something that breaks. It's not just CVE updates to consider. |
agree to that is not just CVE updates to consider. However, how to choose the suitable and newer version for it? for instance, why using |
We generally update as needed, either for compatibility or CVE issues. Once we've moved forward though, we'd like to avoid moving back again. |
Base on above discussion, may I ask two questions to confirm the best practice for maintaining replace directives? Q1: When should we upgrade version in replace directives
Q2: When should we remove replace directives:
|
CVEs: Definitely, and this is the most common reason to update. Compatibility depends on if there are issues encountered (sometimes modules of one version won't compile / run properly with dependencies of different versions). This is rare, but does happen. Case-by-case basis. Usually, semantic versioning applies and it's safe to update to x.y.[latest]. However, most of our indirect dependencies come from Kubernetes itself, and we'd like to be as compatible with upstream as possible (even bug for bug if need be).
IMO, we should not remove the replacements. It just becomes a maintenance nightmare. Any indirect dependency update anywhere in the in chain may pull in a later (or earlier) release; it's a lot of work to identify these and simpler to just update to the latest versions. Since we stay on top of CVEs, we're almost always going to be higher than the versions requested by indirect dependencies, so it's mostly a moot issue and therefore not worth expending dev effort on. It's harmless to have a few extra replaces here. Compiling against newer K8s releases is where we end up doing most of the replacements. I've typically done this by going through the versions pulled in by the new K8s release and ensuring we replace to >= those versions if we're not already. |
Thanks for the prompt reply.
I just did an investigation to understand why we manually set github.com/imdario/mergo to v0.3.7 in indirect require entry instead of the version Go resolved. The indirect require entry But after we remove spark-on-k8s-operator, we didn't regenerate indirect require entry. In this case, I think the indirect require can be change to the v0.3.6. There are some other inconsistency. The review might be exhausting, but it should be an one-time job. |
So you're recommending that we downgrade to v0.3.6 of mergo when we've successfully tested against a newer version? There's likely to be some fixes in v0.3.7 that we have gotten the benefit of, even if everything still compiles and runs. The point is, this requires a lot of effort, that effort is not a one-time cost (any time dependencies are added or removed this may need to be done), and it's not obvious that it provides any net benefit. Let's just leave well enough alone. |
Yes, that's what I suggested. Everytime we made change on go.mod, we should:
Here is a tradeoff decision to be made here. Pros:
Cons:
|
Sorry, I have to respectfully disagree here. This is not only time consuming, but error prone, because every time this is done we need to audit for CVEs that might have been inadvertently reintroduced. This is not a hypothetical issue - it has happened on several occasions.
Why is this important? We override these to pull newer versions, and if those newer versions cause no issues, downgrading is pointless.
This isn't something we do, nor is there a central place to track this within the project. The very real cons of this approach significantly dwarf any of the perceived benefits. |
Understood, I can't find other reason to support it. For this PR, I'm not sure what should be fixed. I didn't manually change the indirect require entry. (It was change by For core and sheduler-interface, I've removed the golang.org/x modules in replace entry.
Should I add a commit to move it back? |
Put the x/lint replace back; also remove the comment from the other added replace.
Yes, please open JIRAs and PRs to replace the overrides that were already in place. |
@chenyulin0719 - The core / shim PRs have been. merged. In addition to the other changes requested, can you please update the scheduler-interface and core references to the latest? These are: core: v0.0.0-20240222210045-b926dce1f914 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 pending pre-commit checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 pending e2e tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What is this PR for?
Note: There are 2 vulnerable dependencies existing before this PR, and it can't be fixed by replace directives.
(From k8s.io/kubernetes@v1.29.2) (The packege is under public archive.)
This dependencies required fix from K8S.
What type of PR is it?
Todos
NA
What is the Jira issue?
https://issues.apache.org/jira/browse/YUNIKORN-2416
How should this be tested?
E2E passed in my side:
https://github.com/chenyulin0719/yunikorn-k8shim/actions/runs/8004883059
Screenshots (if appropriate)
There are 2 existing CVE and can't be fixed by replace.
CVE check pass with govulncheck:
Questions:
NA