
Trident should create a unique SCC when deploying itself to OpenShift #374

Closed

nccurry opened this issue Apr 8, 2020 · 16 comments

@nccurry commented Apr 8, 2020

Describe the bug
As of OpenShift 4.3.8 modifying default SCC objects (including adding arbitrary users and service accounts) will block cluster upgrades.

Instead of assigning itself to the default 'privileged' SCC, the Trident installer should create a separate SCC that contains just the permissions Trident needs to function.

https://bugzilla.redhat.com/show_bug.cgi?id=1821905
https://bugzilla.redhat.com/show_bug.cgi?id=1818893
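As a sketch of the requested behavior: rather than appending its service account to the stock privileged SCC, the installer could ship a dedicated SCC of its own. A rough, hypothetical manifest follows; the name, service account, and exact permission set are assumptions for illustration, not what the Trident installer actually generates:

```yaml
# Hypothetical dedicated SCC for Trident -- field values are assumptions.
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: trident
allowPrivilegedContainer: true
allowHostDirVolumePlugin: true
allowHostIPC: true
allowHostNetwork: true
allowHostPID: true
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
users:
- system:serviceaccount:trident:trident-csi
volumes:
- '*'
```

Because this object is owned by the installer rather than the platform, the cluster-version operator's checks on the default SCCs would no longer flag it during upgrades.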

Environment
Provide accurate information about the environment to help us reproduce the issue.

  • Trident version: 20.01.1
  • Trident installation flags used: default
  • Container runtime: crio
  • Kubernetes version: 1.16.2
  • Kubernetes orchestrator: OpenShift 4.3.8
  • Kubernetes enabled feature gates: Default
  • OS: Red Hat CoreOS 43
  • NetApp backend types: Azure File, ONTAP Nas
  • Other:

To Reproduce
Deploy Trident into OpenShift 4.3.8 cluster
Attempt to upgrade OpenShift 4.3.8 -> 4.3.9

Expected behavior
OpenShift cluster upgrades

Additional context

@nccurry nccurry added the bug label Apr 8, 2020
@gnarl gnarl added the tracked label Apr 8, 2020
@markandrewj

Hello, I just wanted to say I am currently using OpenShift in a corporate environment, and we are affected by this at the moment.

@nccurry commented Apr 9, 2020

@markandrewj You can work around the issue by temporarily removing the Trident service account from the SCC. Once the upgrade has started, you can add it back.
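The temporary removal described above can be done with oc; a sketch, assuming the service account is trident-csi in the trident namespace (adjust both names to match your install):

```shell
# Remove the Trident service account from the privileged SCC's users list
# (names are assumptions; check with: oc get scc privileged -o yaml)
oc adm policy remove-scc-from-user privileged -z trident-csi -n trident

# ...trigger the upgrade, then restore access once it is underway:
oc adm policy add-scc-to-user privileged -z trident-csi -n trident
```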

@markandrewj

@nccurry We gave what you suggested a try, and the upgrade still wouldn't progress for us unfortunately.

@nccurry commented Apr 9, 2020

Try reinitiating the upgrade through either the web console or oc adm upgrade --to="4.3.9"

@markandrewj

We tried this too, unfortunately.

$ oc adm upgrade --allow-upgrade-with-warnings --to 4.3.9
Updating to 4.3.9

$ oc get clusterversion -o json|jq ".items[0].status.history"
[
  {
    "completionTime": null,
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:f0fada3c8216dc17affdd3375ff845b838ef9f3d67787d3d42a88dcd0f328eea",
    "startedTime": "2020-04-09T20:24:11Z",
    "state": "Partial",
    "verified": false,
    "version": "4.3.9"
  }
]

@gnarl (Contributor) commented Apr 9, 2020

We will have a fix out for this with the Trident 20.04 release due at the end of the month.

@megabreit

@markandrewj When the upgrade does not continue, you may need to wait a little longer and possibly restart it as suggested. Put the Trident user back into the SCC once all three kube-* operators are in the "Updating" state. I did this twice, with 4.3.8->4.3.9 and with 4.3.9->4.3.10; it took about 2-3 minutes after the update was triggered.
Thinking about it: maybe a better workaround would be to create a clone of the privileged SCC, add the Trident user to it, and remove it from the privileged SCC.
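The clone-the-SCC idea could be sketched roughly as follows; the file name, SCC name, service account, and namespace are all assumptions for illustration:

```shell
# Dump the stock SCC as a starting point (hypothetical file name)
oc get scc privileged -o yaml > trident-scc.yaml

# Edit trident-scc.yaml by hand:
#   - rename it (e.g. metadata.name: trident-privileged)
#   - drop server-set fields: resourceVersion, uid, creationTimestamp
#   - reduce the users list to just the Trident service account

oc apply -f trident-scc.yaml

# Finally, take the Trident account out of the default SCC so it is
# unmodified again (names are assumptions; adjust to your install):
oc adm policy remove-scc-from-user privileged -z trident-csi -n trident
```

This leaves the default privileged SCC pristine, which is what the upgrade check cares about.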

@markandrewj

I just wanted to say thanks to everyone for replying so quickly. My colleagues and I will be able to look into this again next week. If what was suggested doesn't work, waiting for the new release would be OK too.

@eparis commented Apr 14, 2020

We (OpenShift) will be working to fix this in 4.3.13 and GREATLY apologize for our screw up. It may still require --force to get to 4.3.13. This is being very actively investigated on our side.

We do suggest that Trident move to using RBAC to access SCCs, but we should not have broken what was working. We greatly appreciate your work to help address our (joint) customers' issues.

@markandrewj

@eparis Although it is unfortunate we hit this issue, I am happy to see active development around Trident. We started using it in OpenShift 3, and I had some concern that it was going to fall by the wayside in OpenShift 4. Thanks for the help, and keep up the good work.

@eparis commented Apr 14, 2020

FYI, you should be able to run:

oc adm upgrade --force --to=4.3.9

If it complains that it's already at 4.3.9, you might need to run:

oc adm upgrade --clear

Then try again with --force.

@markandrewj

Thanks for the information, we will give it a try. Out of curiosity, are there any plans to turn Trident into an Operator? It works pretty well as-is, so I don't know how much value there would be, but Operators seem to be the trend at the moment.

@markandrewj commented Apr 14, 2020

We tried what was suggested this afternoon, and our cluster is upgraded now. Thank you to everyone for the help.

@gnarl (Contributor) commented Apr 14, 2020

@markandrewj We are planning to have an Operator in the 20.04 release. We will also keep the existing installer until we are sure the Operator has all the needed functionality.

@gnarl (Contributor) commented Apr 28, 2020

This was fixed in the Trident 20.04 release.

@gnarl gnarl closed this as completed Apr 28, 2020
@markandrewj

Awesome guys! Thanks.
