notebook tests need to use a unique subdirectory for each run in the bucket #715

Closed
jlewi opened this issue Jul 2, 2020 · 3 comments

jlewi commented Jul 2, 2020

The notebook tests work by submitting a job to the KF cluster that runs the notebook using papermill.
The output notebook is then uploaded as html to a GCS bucket.

Right now every run of the test ends up using the same directory and notebook file.

Example: GoogleCloudPlatform/kubeflow-distribution#65 (comment)

Looking at the logs, the notebook is uploaded to:

Uploading notebook to gs://kubeflow-ci-deployment_ci-temp/mnist_test

This path isn't unique to each run, so each run overwrites the results of the previous one.

The tricky part is that the directory has to be set consistently in two steps of the task:

  1. The Python test that fires off the job to run papermill
  2. The subsequent step, which copies the notebook to a different GCS bucket

I think what we could do is mount the pod labels into the pod. The pod labels include the taskrun and pipeline run labels, which we know will be unique.
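A minimal sketch of that idea, assuming the labels are exposed through a Kubernetes downward API volume (which writes one `key="value"` line per label) and that the run labels are named `tekton.dev/pipelineRun` and `tekton.dev/taskRun`. The function names and the label keys are illustrative assumptions, not code from the repository.

```python
def parse_downward_api_labels(text):
    """Parse the key="value" lines of a downward API labels file into a dict."""
    labels = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values are written in double quotes; strip them.
        labels[key] = value.strip('"')
    return labels


def unique_notebook_dir(base, labels,
                        keys=("tekton.dev/pipelineRun", "tekton.dev/taskRun")):
    """Append the run-specific label values to the base GCS path.

    The label keys are an assumption; adjust them to whatever the pods carry.
    """
    parts = [labels[k] for k in keys if labels.get(k)]
    if not parts:
        # Fail loudly rather than fall back to a shared path that gets overwritten.
        raise ValueError("no run labels found; refusing to use a shared path")
    return base.rstrip("/") + "/" + "/".join(parts)
```

If both steps of the task read the same mounted labels file and call the same helper with the same base (for example `gs://kubeflow-ci-deployment_ci-temp/mnist_test`), they arrive at the same per-run directory without passing any state between them.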

@issue-label-bot

Issue-Label Bot is automatically applying the labels:

| Label | Probability |
| --- | --- |
| area/testing | 0.75 |
| kind/bug | 0.67 |


jlewi changed the title from "notebook tests need to add use a unique subdirectory for each run in the bucket" to "notebook tests need to use a unique subdirectory for each run in the bucket" on Jul 8, 2020
jlewi pushed a commit to jlewi/testing that referenced this issue Jul 11, 2020
* When the notebook runs on a kubeflow-ci-deployment cluster it will
  upload the rendered notebook to a gcs bucket in kubeflow-ci-deployment.

* We need the GCS path for the notebook to be unique so that results
  from different runs don't overwrite each other.

Related to: kubeflow#715
k8s-ci-robot pushed a commit that referenced this issue Jul 14, 2020
* notebook tests need to use a unique subdirectory

* When the notebook runs on a kubeflow-ci-deployment cluster it will
  upload the rendered notebook to a gcs bucket in kubeflow-ci-deployment.

* We need the GCS path for the notebook to be unique so that results
  from different runs don't overwrite each other.

Related to: #715

* Fix subdir.

* Update notebook image.

stale bot commented Oct 7, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in one week if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Oct 7, 2020

stale bot commented Oct 15, 2020

This issue has been closed due to inactivity.

@stale stale bot closed this as completed Oct 15, 2020