Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Most public images used in devfile registry do not work as the base for Che 7 workspaces on OpenShift #13454

Closed
amisevsk opened this issue May 30, 2019 · 15 comments
Assignees
Labels
kind/bug Outline of a bug - must adhere to the bug report template. severity/blocker Causes system to crash and be non-recoverable or prevents Che developers from working on Che code. status/code-review This issue has a pull request posted for it and is awaiting code review completion by the community. status/in-progress This issue has been taken by an engineer and is under active development.
Milestone

Comments

@amisevsk
Copy link
Contributor

Description

OpenShift by default starts containers using a random UID for security purposes [1]. Since the UID to be used is not known at container creation time, there is no entry for the running user in /etc/passwd.

This can cause a multitude of problems:

  • The terminal is buggy: tab completion does not work, pressing up arrow does not work
    • Default terminal is always /bin/sh; starting bash results in the prompt
      I have no name!@workspaceid $
    • User does not have a home directory, and terminal does not open to /projects
  • Any program that depends on user id or home directory can have issues
  • Some configuration that relies on having a home directory is impossible, and behaviour in these cases is generally unpredictable.

The suggestions in [1] are

  • Any directory that needs to be accessed must have read/write permissions for the root group. This is not a problem for directories mounted as volumes
  • The /etc/passwd file should be writable by the root group. Then, an entrypoint script can execute something like
    if ! whoami &> /dev/null; then
      if [ -w /etc/passwd ]; then
        echo "${USER_NAME:-default}:x:$(id -u):0:${USER_NAME:-default} user:${HOME}:/sbin/nologin" >> /etc/passwd
      fi
    fi
    to set the current user correctly.

The second point above is the issue -- images not created to run on OpenShift do not have a writeable /etc/passwd; further, since we normally overwrite container commands with sleep infinity or tail -f /dev/null, even images with an entrypoint that does what we need won't execute the script by default. Note that this is what's done when starting the default "Che 7" stack, which explains why it works.

At this time, I don't see many good potential solutions; when running on OpenShift, the Che server could attempt to rewrite the recipe's container commands with the script above followed by whichever non-terminating command we like, but this would still require using compatible images in the devfile registry. It could also result in confusing errors if we're automatically rewriting entrypoints, and it's difficult to specify a script like above in the yaml list format used for container commands.

I'm currently working with trying to implement the above, but if anyone has better ideas, it'd be great to hear them.

[1]- https://docs.okd.io/3.11/creating_images/guidelines.html#openshift-specific-guidelines

Reproduction Steps

Start a Che 7 workspace other than the default dev image (e.g. java-maven), open a terminal, and type whoami

OS and version:
Che 7 on any OpenShift

Diagnostics:
In most images, we have e.g.:

$ id
uid=1075080000 gid=0(root) groups=0(root),1075080000
@amisevsk amisevsk added the kind/bug Outline of a bug - must adhere to the bug report template. label May 30, 2019
@amisevsk
Copy link
Contributor Author

cc: @l0rd I'd like your thoughts since you're familiar with these issues.

@l0rd
Copy link
Contributor

l0rd commented May 31, 2019

@amisevsk great analysis! And that's super important for the UX hence I am happy that you have dug the problem. Here are my comments:

We should use openshift-compatible images only (group root can write /etc/passwd)
In the last month or so we have started trying to move to openshift compatible images only, even upstream (c.f. my comment here). This has ended looking at stacks in https://github.com/sclorg but it doesn't look that these will satisfy all our use cases. The other promising image catalog is the UBI images one. Anyway the images we use upstream should all be OpenShift compatible: that's the primary requirement. Other requirements are that they should be community supported and actively maintained. There may be cases where we don't find an image that satisfy all those requirements. In this case we can still open a PR to make the community supported image compatible with OpenShift or, if we really have no alternative, build it an maintain it ourselves in eclipse/che-dockerfiles. But ideally I would like eclipse/che-dockerfiles to become empty 😃

Overriding the entrypoint to set patch /etc/passwd
I think it's great if you we can add that in the wsmaster logic. But that should be something optional. In other words we should be able to specify in the devfile if we are going to ovewrite /etc/passwd or not. Like having a new arbitraryUserSupport attributes for dockerimage and kubernetes components:

  - type: dockerimage
    image: centos:centos7
    command: ["fail"]
    args: ["-f", "/dev/null"]
    arbitraryUserSupport: true
    memoryLimit: 50Mi

@l0rd l0rd mentioned this issue Jun 4, 2019
@tsmaeder
Copy link
Contributor

tsmaeder commented Jun 6, 2019

Maybe I'm naive here, but to me, this sounds like a problem in OpenShift. The problem is not that they run the container as a random user, but that this user does not conform to the expectations unix programs make of a user.

In an ideal world, OpenShift would fix this. Do we have an issue open with them and is there a response?

@ibuziuk
Copy link
Member

ibuziuk commented Jun 6, 2019

@tsmaeder the thing is that these problems are fixed on S2I images like openjdk-8-rhel8 [1]. But those are located in the private registry.redhat.io

image

[1] https://access.redhat.com/containers/?tab=overview&get-method=unauthenticated#/registry.access.redhat.com/openjdk/openjdk-8-rhel8

@l0rd l0rd mentioned this issue Jun 27, 2019
85 tasks
@l0rd l0rd added severity/blocker Causes system to crash and be non-recoverable or prevents Che developers from working on Che code. target/che7GA and removed target/che7GA labels Jun 27, 2019
@l0rd l0rd added this to the 7.0.0 milestone Jun 27, 2019
@ibuziuk
Copy link
Member

ibuziuk commented Jul 10, 2019

Initial PR with patched Dockerfile has been merged to devfile registry eclipse-che/che-devfile-registry#24

@amisevsk
Copy link
Contributor Author

PR patching all currently used images (except for one based off alpine): eclipse-che/che-devfile-registry#38

@ibuziuk
Copy link
Member

ibuziuk commented Jul 22, 2019

@amisevsk is there some issue with patching 'alpine' based community images or it is just not yet done?

@amisevsk
Copy link
Contributor Author

@ibuziuk There is, but sadly the discussion is split over 4-5 PRs/issues.

The alpine image does not ship with bash installed, so the current patch breaks the image (sets default login shell for user to /bin/bash. The three options for fixing this are

  • Treat the angular devfile as a special case in patching (use /bin/sh instead of bash)
  • Switch the angular devfile to use the debian-based variant instead of alpine. Works but the image ends up significantly larger
    • This may not actually be an issue since most debian layers should be cached from other images
  • Create a new base image that installs bash on top of the default alpine-based image and use that as our base image in patching.

I think the second option (move to debian) is cleanest, but I haven't tested or checked if there are further issues. It shouldn't cause problems since we already use debian-based devfiles elsewhere.

@l0rd l0rd added status/in-progress This issue has been taken by an engineer and is under active development. severity/blocker Causes system to crash and be non-recoverable or prevents Che developers from working on Che code. and removed severity/blocker Causes system to crash and be non-recoverable or prevents Che developers from working on Che code. target/che7GA labels Jul 22, 2019
@ibuziuk
Copy link
Member

ibuziuk commented Jul 22, 2019

@l0rd are you, in general, ok to switch the alpine images to debian in the devfile registry for Che 7 GA and do the arbitary user patch the same way as it is currently done for all the other images?

@amisevsk
Copy link
Contributor Author

@ibuziuk, speaking of split discussion: eclipse-che/che-devfile-registry#38 (comment)

I'll open a PR to use a more convenient base image.

@l0rd
Copy link
Contributor

l0rd commented Jul 22, 2019 via email

@l0rd l0rd added the status/code-review This issue has a pull request posted for it and is awaiting code review completion by the community. label Jul 23, 2019
@ibuziuk
Copy link
Member

ibuziuk commented Jul 24, 2019

@amisevsk @l0rd I believe all the images required for GA have been patched already. Can we close this issue now ?

@amisevsk
Copy link
Contributor Author

As a blocker for 7.0.0 I believe this issue can be closed, but in opening it I initially intended it to be solved by a more general solution (public images still do not work, and users will still run into problems trying to create their own devfiles -- we just have a workaround that fixes our use-case).

@ibuziuk ibuziuk changed the title Most public images do not work as the base for Che 7 workspaces on OpenShift Most public images used in devfile registry do not work as the base for Che 7 workspaces on OpenShift Jul 24, 2019
@ibuziuk
Copy link
Member

ibuziuk commented Jul 24, 2019

@amisevsk I reworded this issue a bit to make it Che 7 specific. WDYT about closing this one and creating a separate one for the general use-case of public images as a language runtime on OpenShift?
I believe we need to cross-link this new general issue with openshift/origin#23369

@ibuziuk
Copy link
Member

ibuziuk commented Jul 26, 2019

a separate issue for general use of language runtime images without patching on OpenShift v4 has been created - #14042

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Outline of a bug - must adhere to the bug report template. severity/blocker Causes system to crash and be non-recoverable or prevents Che developers from working on Che code. status/code-review This issue has a pull request posted for it and is awaiting code review completion by the community. status/in-progress This issue has been taken by an engineer and is under active development.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants