Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kfctl go E2E test should deploy kubeflow #2705

Merged
merged 1 commit into from
Mar 16, 2019

Conversation

jlewi
Copy link
Contributor

@jlewi jlewi commented Mar 15, 2019

E2E test for kfctl should deploy kubeflow with basic auth

  • The E2E test attempts to call delete but delete isn't working.
  • kfctl delete fails because there is no kubeconfig file created.
  • The delete step is marked as an expected failure.

Improve logging in kfctl with go binary

  • Log the name and status of GCP DM operations
  • Print out DM creation errors so users see things like quota issues.

Fix #2706 - The coordinator commands should only invoke the commands apply/g
enerate/delete for the resources specified.
* There was a bug in the switch statements and we were always calling
generate/apply/delete for platform & k8s and not respecting the parameter

Attempt to fix #2706

  • We want to eliminate the need to talk to a K8s server when calling ks init

  • Hardcode the server address to 127.0.0.1

  • Hardcode K8s version to 1.11.7

  • The communication with the K8s server was coming because we were trying
    to talk to the K8s master to get the K8s version but we don't need to
    do that if we specify it.

    • Add filename log hook to kfctl so that we can emit the filename and line
      number where errors occur.

    • On init we Don't need to call GetConfig which tries to read the kubeconfig file on generate.

    Add retries to generate and apply because running under argo kfctl seems
    to randomly exit.

Related to #2610 E2E test for kfctl go binary


This change is Reviewable

@@ -487,13 +487,15 @@ func (ksApp *ksApp) Init(resources kftypes.ResourceEnum, options map[string]inte
func (ksApp *ksApp) initKs(config *rest.Config) error {
newRoot := path.Join(ksApp.KsApp.Spec.AppDir, ksApp.KsName)
ksApp.KsEnvName = kstypes.KsEnvName
k8sSpec := kftypes.GetServerVersion(kftypes.GetClientset(config))
// We hard code the K8s spec because we won't have a cluster to talk to when calling init.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add serverVersion to the ClientSpec then it could be set by the platform and the package managers would use this value rather than attempting to contacting the server by using credentials.

Similarly if the platform sets client credentials in the ClientSpec then the package managers would use this rather than $HOME/.kube/config's context.

This would align with how the platform updates other values in ClientSpec like components, componentParameters and the package managers use this updated ClientSpec.

I had discussed with @gabrielwen at the summit about this approach for fetching credentials.
I have the PR for just using the ClientSpec about ready. (#2691)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only ksonnet needs to know the K8s version and that's so it can generate an appropriate k8s.libsonnet file from the corresponding swagger spec for the K8s API.

I think we can just hard code it.

  • Only ksonnet needs to know the K8s API version; kustomize doesn't
    • ksonnet will likely be removed before the K8s API version is an issue
    • I suspect ksonnet might even generate a new lk8s lib file when a new environment is added (which we do later).

Similarly if the platform sets client credentials in the ClientSpec then the package managers would use this rather than $HOME/.kube/config's context.

Why would client credentials need to be set in ClientSpec? How is that different then using a .kubeconfig file?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different platforms will set different credentials. If the package managers use the credentials set by the platform then we avoid package managers using .kube/config when they should be using the platform credentials.

@jlewi
Copy link
Contributor Author

jlewi commented Mar 15, 2019

When running kfctl under Argo as part of the test; I'm seeing random failures where it just appears to exit
on generate or apply. It doesn't seem like an error occurred or it was preempted.

If we look at the logs last line just indicates it was waiting for the deployment.

2019-03-15T14:32:04.473205048Z  main            util.py                     71 INFO     time="2019-03-15T14:31:40Z" level=warning msg="Deployment operation name: operation-1552660191586-58422df2d2a4e-6bf2031f-027c70b5 status: RUNNING" filename="gcp/gcp.go:331"

jlewi-kfctl-test-2705-0315-072101-3799861166.log.txt

jlewi-kfctl-test-2705-0315-072101-3799861166.event.log.txt

@jlewi
Copy link
Contributor Author

jlewi commented Mar 15, 2019

In the most recent run kfctl delete all failed because there is no kubeconfig file.

time="2019-03-15T17:48:33Z" level=fatal msg="could not open /mnt/test-data-volume/kubeflow-presubmit-kfctl-go-2705-05adf28-6330-dfdf/kfctl_test/.kube/kube
config Error stat /mnt/test-data-volume/kubeflow-presubmit-kfctl-go-2705-05adf28-6330-dfdf/kfctl_test/.kube/kubeconfig: no such file or directory" filenam
e="apps/group.go:180"

* The E2E test attempts to call delete but delete isn't working.

  * kfctl delete fails because there is no kubeconfig file created.
  * The delete step is marked as an expected failure.

Improve logging in kfctl with go binary
   * Log the name and status of GCP DM operations
   * Print out DM creation errors so users see things like quota issues.

Fix kubeflow#2706 - The coordinator commands should only invoke the commands apply/generate/delete for the resources specified.
  * There was a bug in the switch statements and we were always calling
    generate/apply/delete for platform & k8s and not respecting the parameter

Attempt to fix kubeflow#2706

  * We want to eliminate the need to talk to a K8s server when calling ks init
  * Hardcode the server address to 127.0.0.1
  * Hardcode K8s version to 1.11.7
  * The communication with the K8s server was coming because we were trying
    to talk to the K8s master to get the K8s version but we don't need to
    do that if we specify it.

* Add filename log hook to kfctl so that we can emit the filename and line
  number where errors occur.

* On init we Don't need to call GetConfig which tries to read the kubeconfig file on generate.

Add retries to generate and apply because running under argo kfctl seems
to randomly exit.
@jlewi jlewi force-pushed the kfctl_go_test_app branch from 0a6b554 to ed8b484 Compare March 15, 2019 22:16
@jlewi jlewi changed the title [WIP] Kfctl go E2E test should deploy kubeflow Kfctl go E2E test should deploy kubeflow Mar 15, 2019
@jlewi
Copy link
Contributor Author

jlewi commented Mar 15, 2019

/assign @gabrielwen @kkasravi

@kkasravi
Copy link
Contributor

/lgtm

@kkasravi
Copy link
Contributor

/approve

1 similar comment
@jlewi
Copy link
Contributor Author

jlewi commented Mar 16, 2019

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlewi, kkasravi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit b0c20bb into kubeflow:master Mar 16, 2019
saffaalvi pushed a commit to StatCan/kubeflow that referenced this pull request Feb 11, 2021
)

* The E2E test attempts to call delete but delete isn't working.

  * kfctl delete fails because there is no kubeconfig file created.
  * The delete step is marked as an expected failure.

Improve logging in kfctl with go binary
   * Log the name and status of GCP DM operations
   * Print out DM creation errors so users see things like quota issues.

Fix kubeflow#2706 - The coordinator commands should only invoke the commands apply/generate/delete for the resources specified.
  * There was a bug in the switch statements and we were always calling
    generate/apply/delete for platform & k8s and not respecting the parameter

Attempt to fix kubeflow#2706

  * We want to eliminate the need to talk to a K8s server when calling ks init
  * Hardcode the server address to 127.0.0.1
  * Hardcode K8s version to 1.11.7
  * The communication with the K8s server was coming because we were trying
    to talk to the K8s master to get the K8s version but we don't need to
    do that if we specify it.

* Add filename log hook to kfctl so that we can emit the filename and line
  number where errors occur.

* On init we Don't need to call GetConfig which tries to read the kubeconfig file on generate.

Add retries to generate and apply because running under argo kfctl seems
to randomly exit.
saffaalvi pushed a commit to StatCan/kubeflow that referenced this pull request Feb 12, 2021
)

* The E2E test attempts to call delete but delete isn't working.

  * kfctl delete fails because there is no kubeconfig file created.
  * The delete step is marked as an expected failure.

Improve logging in kfctl with go binary
   * Log the name and status of GCP DM operations
   * Print out DM creation errors so users see things like quota issues.

Fix kubeflow#2706 - The coordinator commands should only invoke the commands apply/generate/delete for the resources specified.
  * There was a bug in the switch statements and we were always calling
    generate/apply/delete for platform & k8s and not respecting the parameter

Attempt to fix kubeflow#2706

  * We want to eliminate the need to talk to a K8s server when calling ks init
  * Hardcode the server address to 127.0.0.1
  * Hardcode K8s version to 1.11.7
  * The communication with the K8s server was coming because we were trying
    to talk to the K8s master to get the K8s version but we don't need to
    do that if we specify it.

* Add filename log hook to kfctl so that we can emit the filename and line
  number where errors occur.

* On init we Don't need to call GetConfig which tries to read the kubeconfig file on generate.

Add retries to generate and apply because running under argo kfctl seems
to randomly exit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kfctl tries to call ks init even on ks generate platform
5 participants