Installation error when installing Trident plug-in in Private registry environment #215

Closed
zhenyongnetapp opened this issue Feb 22, 2019 · 15 comments

Comments

@zhenyongnetapp

[Problem]

  • Installing the Trident plug-in fails in a private registry environment.
  • The customer's environment is private and has no Internet access.
  • We therefore suggested the offline installation procedure below.
  • The private registry in the Trident installation video does not require authentication, but the customer's private registry does.

[Providing guides]
• Follow the YouTube video below for an offline Trident install: pull the two Docker images (netapp/trident:18.10.0, quay.io/coreos/etcd:v3.2.19) into the local repository, then install with the command below.
• e.g. # ./tridentctl install -n trident --trident-image tacorepo:5000/netapp/trident:18.10.0 --etcd-image tacorepo:5000/quay.io/coreos/etcd:v3.2.19 -d
https://www.youtube.com/watch?v=_T3_JntptKA

[Customer Action #1]
• The customer followed the installation guide below but hit an error.
o https://netapp.io/2018/12/19/installing-trident-from-a-private-registry/
• Following the guide above, the install failed during pod creation.

• 1) After running the install command, it stalls at "Waiting for Trident installer pod to start." and then reports an error.
• [root@k8s-master trident-installer]# ./tridentctl install -n trident --trident-image sds.redii.net/jamessc-cho/trident:18.10.0 --etcd-image sds.redii.net/jamessc-cho/etcd:v3.3.9
INFO Created namespace. namespace=trident
INFO Created installer service account. serviceaccount=trident-installer
INFO Created installer cluster role. clusterrole=trident-installer
INFO Created installer cluster role binding. clusterrolebinding=trident-installer
INFO Created installer configmap. configmap=trident-installer
INFO Created installer pod. pod=trident-installer
INFO Waiting for Trident installer pod to start.
ERRO Trident installer pod was not started after 180.00 seconds. Pod status is Pending. Use 'kubectl describe pod trident-installer -n trident' for more information.
INFO Deleted installer cluster role binding.
INFO Deleted installer cluster role.
INFO Deleted installer service account.
FATA Install failed; pod not yet started (Pending). Resolve the issue; use 'tridentctl uninstall' to clean up; and try again.

• 2) The pod status is ImagePullBackOff.
[root@k8s-master ~]# kubectl get pod -n trident -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
trident-installer 0/1 ImagePullBackOff 0 10m 10.44.0.14 k8s-worker1

• 3) The pod events show the error “no basic auth credentials”.
[root@k8s-master ~]# kubectl describe pod trident-installer -n trident
(.... skip ....)
Events:
Type Reason Age From Message


Normal Scheduled 10m default-scheduler Successfully assigned trident/trident-installer to k8s-worker1
Normal Pulling 8m43s (x4 over 10m) kubelet, k8s-worker1 pulling image "sds.redii.net/jamessc-cho/trident:18.10.0"
Warning Failed 8m43s (x4 over 10m) kubelet, k8s-worker1 Failed to pull image "sds.redii.net/jamessc-cho/trident:18.10.0": rpc error: code = Unknown desc = Error response from daemon: Get https://sds.redii.net/v2/jamessc-cho/trident/manifests/18.10.0: no basic auth credentials
Warning Failed 8m43s (x4 over 10m) kubelet, k8s-worker1 Error: ErrImagePull
Normal BackOff 8m19s (x6 over 10m) kubelet, k8s-worker1 Back-off pulling image "sds.redii.net/jamessc-cho/trident:18.10.0"
Warning Failed 4m59s (x20 over 10m) kubelet, k8s-worker1 Error: ImagePullBackOff

[Customer Action #2]
• Attached is the YAML file the customer used during installation (it makes the pod pull its images from the customer's private registry).
• The customer followed the link below to install Trident with YAML files; see “4. Install Trident – Customized Installation”.
o https://netapp-trident.readthedocs.io/en/stable-v19.01/kubernetes/troubleshooting.html#troubleshooting
• Step 1: Run the install with the --generate-custom-yaml option, which generates the YAML files below.
• [root@k8s-master ~]# ll /tmp/trident-installer/setup/
total 28
-rwxrwxrwx 1 1001 1001 206 Oct 23 06:01 backend.json
-rwxrwxrwx 1 root root 728 Feb 20 02:59 trident-clusterrole.yaml
-rwxrwxrwx 1 root root 255 Feb 20 02:59 trident-clusterrolebinding.yaml
-rwxrwxrwx 1 root root 1926 Feb 20 04:23 trident-deployment.yaml
-rwxrwxrwx 1 root root 61 Feb 20 02:59 trident-namespace.yaml
-rwxrwxrwx 1 root root 297 Feb 20 02:59 trident-pvc.yaml
-rwxrwxrwx 1 root root 66 Feb 20 02:59 trident-serviceaccount.yaml
• Step 2: Edit trident-deployment.yaml to match the private registry (YAML file attached).
• What was changed (attachment: trident-deployment.zip):
imagePullSecrets:
- name: redii
image: sds.redii.net/jamessc-cho/trident:18.10.0
image: sds.redii.net/jamessc-cho/etcd:v3.3.9
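
The Step 2 change can be sketched programmatically. The following is a minimal Python sketch, not the customer's actual file: it treats the Deployment manifest as a plain dict, and the trimmed manifest, the container names, and the function name `retarget_deployment` are assumptions made for illustration. Only the registry prefix, secret name, and image tags come from the lines above.

```python
import copy


def retarget_deployment(deployment, registry_prefix, pull_secret):
    """Return a copy of a Deployment manifest that pulls from a private registry."""
    patched = copy.deepcopy(deployment)
    pod_spec = patched["spec"]["template"]["spec"]
    # Reference the registry credentials so the kubelet can authenticate.
    pod_spec["imagePullSecrets"] = [{"name": pull_secret}]
    for container in pod_spec["containers"]:
        # Keep only the final "name:tag" component and re-root it under the mirror.
        name_and_tag = container["image"].rsplit("/", 1)[-1]
        container["image"] = f"{registry_prefix}/{name_and_tag}"
    return patched


# Hypothetical, heavily trimmed stand-in for the generated trident-deployment.yaml.
deployment = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
        {"name": "trident-main", "image": "netapp/trident:18.10.0"},
        {"name": "etcd", "image": "quay.io/coreos/etcd:v3.3.9"},
    ]}}},
}

patched = retarget_deployment(deployment, "sds.redii.net/jamessc-cho", "redii")
for container in patched["spec"]["template"]["spec"]["containers"]:
    print(container["image"])
```

Note that this only covers the Deployment; as discussed later in the thread, the installer pod's image is set separately.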

@zhenyongnetapp
Author

Hi experts,

Could somebody help check the Trident install issue above?

Thanks,

Frank

@adkerr
Contributor

adkerr commented Feb 22, 2019

If the private registry requires a login to pull images, the customer will need to either log in before running the install (if the login is persistent) or manually pull the images into the local Docker cache before running the Trident install.

@iamhyuk

iamhyuk commented Feb 27, 2019

@adkerr
Thank you for your reply. The following is my customer's feedback on your comment.

Logging in beforehand works with plain Docker. In Kubernetes, however, registry credentials are managed as "imagePullSecrets" and passed to Docker dynamically when the pod runs, so an earlier docker login is ignored.
Also, when installing with the tridentctl CLI, the imagePullPolicy value is "Always". So even if the image has been pulled to the local node in advance, the remote registry is always checked, and the authentication error still occurs.
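
For readers unfamiliar with the credentials mechanism mentioned above, here is a minimal Python sketch of the Secret that an `imagePullSecrets` entry refers to. It uses the standard `kubernetes.io/dockerconfigjson` Secret type; the username, password, and the helper name `docker_registry_secret` are placeholders and assumptions, not values from this thread.

```python
import base64
import json


def docker_registry_secret(name, namespace, server, username, password):
    """Build a kubernetes.io/dockerconfigjson Secret manifest as a dict."""
    # Docker stores "user:password" base64-encoded under the "auth" key.
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    docker_config = {
        "auths": {server: {"username": username, "password": password, "auth": auth}}
    }
    # Secret data values must themselves be base64-encoded.
    encoded = base64.b64encode(json.dumps(docker_config).encode()).decode()
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": encoded},
    }


# Placeholder credentials; the secret name and registry host follow the thread.
secret = docker_registry_secret(
    "redii", "trident", "sds.redii.net", "example-user", "example-password"
)
print(json.dumps(secret, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON manifests. `kubectl create secret docker-registry` produces the same kind of object.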

My customer sees the issue as follows.
All of the problems above occurred when a specific image address was given through the CLI parameters, as follows.
#> ./tridentctl install -n trident --trident-image xxx --etcd-image xxxx

If I can install with custom YAML, I can fix all the problematic parts there, so the issue would be solved if you could explain the correct way to install with custom YAML.

Thanks
Hendrick

@ntap-rippy
Contributor

This document goes into the customized installation options available:
https://netapp-trident.readthedocs.io/en/stable-v19.01/kubernetes/deploying.html#customized-installation

Use tridentctl's --generate-custom-yaml option to create the YAML files. Edit them, then use tridentctl's --use-custom-yaml option to install a customized version of Trident.

$ ./tridentctl install -n trident --generate-custom-yaml
INFO Wrote installation YAML files.                setupPath=/tmp/trident-installer/setup
$ vi /tmp/trident-installer/setup/trident-deployment.yaml 
$ ./tridentctl install -n trident --use-custom-yaml 

@iamhyuk

iamhyuk commented Feb 28, 2019

The following error occurred.

  1. Generate the custom YAML files
    root@master:/root/trident-installer$ ./tridentctl install -n trident --generate-custom-yaml
    INFO Wrote installation YAML files. setupPath=/root/trident-installer/setup

  2. Change the images to the private registry images in the custom YAML
    root@master:/root/trident-installer$ cat /root/trident-installer/setup/trident-deployment.yaml | grep image:
    image: sds.redii.net/sdspaas/trident:19.01.0
    image: sds.redii.net/sdspaas/etcd:v3.3.10

  3. Install with the custom YAML (failed)
    root@master:/root/trident-installer$ ./tridentctl install -n trident --use-custom-yaml
    INFO Created namespace. path=/root/trident-installer/setup/trident-namespace.yaml
    INFO Created installer service account. serviceaccount=trident-installer
    INFO Waiting for object to be created. objectName=clusterRole
    INFO Created installer cluster role. clusterrole=trident-installer
    INFO Waiting for object to be created. objectName=clusterRoleBinding
    INFO Created installer cluster role binding. clusterrolebinding=trident-installer
    INFO Created installer configmap. configmap=trident-installer
    INFO Waiting for object to be created. objectName=installerPod
    INFO Created installer pod. pod=trident-installer
    INFO Waiting for Trident installer pod to start.
    ERRO Trident installer pod was not started after 180.00 seconds. Pod status is Pending. Use 'kubectl describe pod trident-installer -n trident' for more information.

  4. Inspect the Trident installer pod (its image was not changed to the private image)
    root@master:/root/trident-installer$ kubectl get po -n trident
    NAME READY STATUS RESTARTS AGE
    trident-installer 0/1 ErrImagePull 0 2m10s
    root@master:/root/trident-installer$ kubectl get po -n trident trident-installer -oyaml | grep image:
    image: netapp/trident:19.01.0

  • image: netapp/trident:19.01.0
    root@master:/root/trident-installer$ kubectl describe po trident-installer -n trident
    Events:
    Type Reason Age From Message

Normal Scheduled 3m56s default-scheduler Successfully assigned trident/trident-installer to node2
Normal Pulling 3m55s kubelet, node2 pulling image "netapp/trident:19.01.0"
Warning Failed 3m40s kubelet, node2 Failed to pull image "netapp/trident:19.01.0": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Failed 3m40s kubelet, node2 Error: ErrImagePull

@kangarlou
Contributor

@iamhyuk Changing the image in trident-deployment.yaml doesn't change the image for the containerized installer, and that's the problem you're hitting. However, using ./tridentctl install -n trident --trident-image xxx --etcd-image xxxx changes both the Trident deployment image and the containerized installer image.

@clintonk, is there a way to customize the containerized installer without changing the source code?

@clintonk
Contributor

Trident's custom installation method (--generate-custom-yaml and --use-custom-yaml) should work in this case. All of the custom YAML files are passed into the containerized installer in a configmap. But passing the image names to the installer should work equally well.

@kangarlou
Contributor

@clintonk The problem is that changing the image in trident-deployment.yaml has no impact on the image used by the containerized installer itself unless --trident-image is used:

k8sclient.GetInstallerPodYAML(TridentInstallerLabelValue, tridentImage, commandArgs), errMessage)
.

It seems you have to customize the YAML files AND specify the command-line option.

@clintonk
Contributor

@kangarlou Good point, you're correct about that. Trident and its installer use the very same image, but one must invoke the installer with --trident-image which serves both purposes.

@jacobjohnanda

@clintonk They would like to use "--use-custom-yaml" to pass in the imagePullSecret. So if they use the customized YAML files together with the "--trident-image" command-line option, do you think the install should go through successfully?

@clintonk
Contributor

clintonk commented Mar 4, 2019

@jacobjohnanda I don't see why not.

@cloudbadmin

cloudbadmin commented Mar 6, 2019

@clintonk

NAME READY STATUS RESTARTS AGE
trident-installer 0/1 Error 0 7s
Error: unknown flag: --trident-image

This is in a cluster that cannot access the internet, and needs to pull images from a priv repo that requires login. The command I used to get this error is ./tridentctl install -n trident --use-custom-yaml --trident-image /pub/netapp/trident:19.01.0 --etcd-image /pub/netapp/etcd:v3.3.12

Normally, the repo requires an imagePullSecret, but these don't seem to work with the "installer". I get the error "repository does not exist or may require 'docker login'".
I've tried putting a YAML file in the setup dir with the username/password as base64, and I have created the secret in the namespace and added it to the serviceaccount YAML in the setup dir. Neither of these works.

Before that, I tried caching the images on each of my workers and some other tagging trickery, but the "installer" kept trying to reach out to docker.io

Specifying the Trident and etcd images seems like the right workaround for all of this. Can you elaborate on what "unknown flag: --trident-image" means, and how to get around it?

@jacobjohnanda

jacobjohnanda commented Mar 7, 2019

My guess is that the "netapp/trident:19.01.0" format is causing the image to be pulled from docker.io. Could you try the format [registry hostname]:[registry port]/trident:19.01.0?
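
To illustrate the guess above, here is a minimal Python sketch of the commonly documented Docker rule for deciding which registry an image reference points to: the part before the first "/" is treated as a registry host only if it contains a "." or ":" or is "localhost"; otherwise the reference defaults to docker.io. This is a simplification (it ignores edge cases that the real reference parser handles), and the function name `registry_host` is hypothetical.

```python
def registry_host(image_ref):
    """Return the registry an image reference resolves to (simplified rule)."""
    first, sep, _rest = image_ref.partition("/")
    # Without a "/", the whole string is a repository name on the default registry.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


for ref in (
    "netapp/trident:19.01.0",
    "sds.redii.net/sdspaas/trident:19.01.0",
    "tacorepo:5000/netapp/trident:18.10.0",
):
    print(ref, "->", registry_host(ref))
```

Under this rule, a reference such as netapp/trident:19.01.0 carries no registry host, which is consistent with the pull attempts against registry-1.docker.io shown in the events earlier in the thread.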

@cloudbadmin

cloudbadmin commented Mar 7, 2019

Sorry, I had removed the domain names for the purpose of posting here. I am using the full URL. I was able to get this installed after doing a couple of things. The first was going back to 18.10. I modified the 19.01 image so that I could exec into it and see what the hang-up was. It looks like the tridentctl binary there does not have the flags that the 18.10 binary had. Specifically, 19.01 is missing --trident-image and --etcd-image.

Next was security configuration specific to my cluster. I noticed that the installer and the actual Trident container have their own labels, so I needed a selector for each in my network policies. Then I had to put the images in a non-password-protected area of my repo.

@jacobjohnanda

I gave it a try and was able to install using --trident-image with the Trident 19.01 installer.

@innergy innergy closed this as completed Mar 14, 2019
9 participants