Restructure samples #1710

Merged
merged 7 commits into from
Aug 2, 2019
Changes from 4 commits
2 changes: 1 addition & 1 deletion backend/Dockerfile
@@ -29,7 +29,7 @@ COPY ./samples .
#I think it's better to just use a shell loop though.
#RUN for pipeline in $(find . -maxdepth 2 -name '*.py' -type f); do dsl-compile --py "$pipeline" --output "$pipeline.tar.gz"; done
#The "for" loop breaks on all whitespace, so we either need to override IFS or use the "read" command instead.
RUN find . -maxdepth 2 -name '*.py' -type f | while read pipeline; do dsl-compile --py "$pipeline" --output "$pipeline.tar.gz"; done
RUN find . -maxdepth 3 -name '*.py' -type f | while read pipeline; do dsl-compile --py "$pipeline" --output "$pipeline.tar.gz"; done
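The trade-off the comments describe — a `for` loop word-splits on all whitespace, while `while read` consumes one whole line per iteration — can be sketched with throwaway files (the names below are made up for illustration):

```shell
#!/bin/sh
# Scratch directory containing a .py file whose name has a space in it.
dir=$(mktemp -d)
mkdir -p "$dir/sub"
touch "$dir/sub/my pipeline.py"

# A "for" loop over command substitution splits the path at the space,
# producing two broken items instead of one path.
for_count=0
for f in $(find "$dir" -maxdepth 2 -name '*.py' -type f); do
  for_count=$((for_count + 1))
done

# "while read" takes one line per iteration, so the path survives intact.
read_count=0
while read -r f; do
  read_count=$((read_count + 1))
done <<EOF
$(find "$dir" -maxdepth 2 -name '*.py' -type f)
EOF

echo "for: $for_count items, while read: $read_count item"   # → for: 2 items, while read: 1 item
rm -rf "$dir"
```

The Dockerfile pipes `find` directly into `while read`, which behaves the same way for any path that does not contain an embedded newline.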

FROM debian:stretch

24 changes: 12 additions & 12 deletions backend/src/apiserver/config/sample_config.json
@@ -1,32 +1,32 @@
[
{
"name":"[Sample] ML - XGBoost - Training with Confusion Matrix",
"description":"A trainer that does end-to-end distributed training for XGBoost models. For source code, refer to https://github.com/kubeflow/pipelines/tree/master/samples/xgboost-spark",
"file":"/samples/xgboost-spark/xgboost-training-cm.py.tar.gz"
"description":"A trainer that does end-to-end distributed training for XGBoost models. For source code, refer to https://github.com/kubeflow/pipelines/tree/master/samples/core/xgboost-spark",
"file":"/samples/core/xgboost-spark/xgboost-training-cm.py.tar.gz"
},
{
"name":"[Sample] ML - TFX - Taxi Tip Prediction Model Trainer",
"description":"Example pipeline that does classification with model analysis based on a public tax cab BigQuery dataset. For source code, refer to https://github.com/kubeflow/pipelines/tree/master/samples/tfx",
"file":"/samples/tfx/taxi-cab-classification-pipeline.py.tar.gz"
"description":"Example pipeline that does classification with model analysis based on a public taxi cab BigQuery dataset. For source code, refer to https://github.com/kubeflow/pipelines/tree/master/samples/core/tfx",
"file":"/samples/core/tfx/taxi-cab-classification-pipeline.py.tar.gz"
},
{
"name":"[Sample] Basic - Sequential execution",
"description":"A pipeline with two sequential steps. For source code, refer to https://github.com/kubeflow/pipelines/blob/master/samples/basic/sequential.py",
"file":"/samples/basic/sequential.py.tar.gz"
"description":"A pipeline with two sequential steps. For source code, refer to https://github.com/kubeflow/pipelines/blob/master/samples/core/sequential/sequential.py",
"file":"/samples/core/sequential/sequential.py.tar.gz"
},
{
"name":"[Sample] Basic - Parallel execution",
"description":"A pipeline that downloads two messages in parallel and prints the concatenated result. For source code, refer to https://github.com/kubeflow/pipelines/blob/master/samples/basic/parallel_join.py",
"file":"/samples/basic/parallel_join.py.tar.gz"
"description":"A pipeline that downloads two messages in parallel and prints the concatenated result. For source code, refer to https://github.com/kubeflow/pipelines/blob/master/samples/core/parallel_join/parallel_join.py",
"file":"/samples/core/parallel_join/parallel_join.py.tar.gz"
},
{
"name":"[Sample] Basic - Conditional execution",
"description":"A pipeline shows how to use dsl.Condition. For source code, refer to https://github.com/kubeflow/pipelines/blob/master/samples/basic/condition.py",
"file":"/samples/basic/condition.py.tar.gz"
"description":"A pipeline that shows how to use dsl.Condition. For source code, refer to https://github.com/kubeflow/pipelines/blob/master/samples/core/condition/condition.py",
"file":"/samples/core/condition/condition.py.tar.gz"
},
{
"name":"[Sample] Basic - Exit Handler",
"description":"A pipeline that downloads a message and prints it out. Exit Handler will run at the end. For source code, refer to https://github.com/kubeflow/pipelines/blob/master/samples/basic/exit_handler.py",
"file":"/samples/basic/exit_handler.py.tar.gz"
"description":"A pipeline that downloads a message and prints it out. Exit Handler will run at the end. For source code, refer to https://github.com/kubeflow/pipelines/blob/master/samples/core/exit_handler/exit_handler.py",
"file":"/samples/core/exit_handler/exit_handler.py.tar.gz"
}
]
32 changes: 32 additions & 0 deletions samples/README.md
@@ -1,3 +1,35 @@
The sample pipelines give you a quick start to building and deploying machine learning pipelines with Kubeflow.
* Follow the guide to [deploy the Kubeflow pipelines service](https://www.kubeflow.org/docs/guides/pipelines/deploy-pipelines-service/).
* Build and deploy your pipeline [using the provided samples](https://www.kubeflow.org/docs/guides/pipelines/pipelines-samples/).

This page tells you how to use the _basic_ sample pipelines contained in the repo.

## Compile the pipeline specification

Follow the guide to [building a pipeline](https://www.kubeflow.org/docs/guides/pipelines/build-pipeline/) to install the Kubeflow Pipelines SDK and compile the sample Python code into a workflow specification. The specification takes the form of a YAML file compressed into a `.tar.gz` file.

For convenience, you can download a pre-compiled, compressed YAML file containing the
specification of the `sequential.py` pipeline. This saves you the steps required
to compile and compress the pipeline specification:
[sequential.tar.gz](https://storage.googleapis.com/sample-package/sequential.tar.gz)
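The downloaded archive (like one produced by `dsl-compile`) is an ordinary gzipped tar wrapping the workflow YAML, so standard tools can inspect or reproduce the packaging. The file contents below are a stand-in for illustration, not a real compiled pipeline:

```shell
#!/bin/sh
# Build a stand-in package: a single workflow YAML inside a .tar.gz,
# mirroring the layout of a compiled pipeline specification.
dir=$(mktemp -d)
cat > "$dir/pipeline.yaml" <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Workflow
EOF
tar -czf "$dir/sequential.tar.gz" -C "$dir" pipeline.yaml

# Listing the archive shows the single YAML file inside.
contents=$(tar -tzf "$dir/sequential.tar.gz")
echo "$contents"   # → pipeline.yaml
rm -rf "$dir"
```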

## Deploy

Open the Kubeflow pipelines UI, and follow the prompts to create a new pipeline and upload the generated workflow
specification, `my-pipeline.tar.gz` (example: `sequential.tar.gz`).

## Run

Follow the pipeline UI to create pipeline runs.

Useful parameter values:

* For the "exit_handler" and "sequential" samples: `gs://ml-pipeline-playground/shakespeare1.txt`
* For the "parallel_join" sample: `gs://ml-pipeline-playground/shakespeare1.txt` and `gs://ml-pipeline-playground/shakespeare2.txt`

## Components source

All samples use pre-built components. The command to run for each container is built into the pipeline file.
30 changes: 0 additions & 30 deletions samples/basic/README.md

This file was deleted.

File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -10,4 +10,4 @@
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# limitations under the License.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.

This file was deleted.

38 changes: 19 additions & 19 deletions test/sample-test/run_test.sh
@@ -137,7 +137,7 @@ if [ "$TEST_NAME" == 'tf-training' ]; then

#TODO: convert the sed commands to sed -e 's|gcr.io/ml-pipeline/|gcr.io/ml-pipeline-test/' and tag replacement.
# Compile samples
cd ${BASE_DIR}/samples/kubeflow-tf
cd ${BASE_DIR}/samples/core/kubeflow-tf

if [ -n "${DATAFLOW_TFT_IMAGE}" ];then
sed -i -e "s|gcr.io/ml-pipeline/ml-pipeline-dataflow-tft:\([a-zA-Z0-9_.-]\)\+|${DATAFLOW_TFT_IMAGE}|g" kubeflow-training-classification.py
@@ -149,7 +149,7 @@ if [ "$TEST_NAME" == 'tf-training' ]; then
dsl-compile --py kubeflow-training-classification.py --output kubeflow-training-classification.zip

cd "${TEST_DIR}"
python3 run_kubeflow_test.py --input ${BASE_DIR}/samples/kubeflow-tf/kubeflow-training-classification.zip --result $SAMPLE_KUBEFLOW_TEST_RESULT --output $SAMPLE_KUBEFLOW_TEST_OUTPUT --namespace ${NAMESPACE}
python3 run_kubeflow_test.py --input ${BASE_DIR}/samples/core/kubeflow-tf/kubeflow-training-classification.zip --result $SAMPLE_KUBEFLOW_TEST_RESULT --output $SAMPLE_KUBEFLOW_TEST_OUTPUT --namespace ${NAMESPACE}

echo "Copy the test results to GCS ${RESULTS_GCS_DIR}/"
gsutil cp ${SAMPLE_KUBEFLOW_TEST_RESULT} ${RESULTS_GCS_DIR}/${SAMPLE_KUBEFLOW_TEST_RESULT}
@@ -158,7 +158,7 @@ elif [ "$TEST_NAME" == "tfx" ]; then
SAMPLE_TFX_TEST_OUTPUT=${RESULTS_GCS_DIR}

# Compile samples
cd ${BASE_DIR}/samples/tfx
cd ${BASE_DIR}/samples/core/tfx

dsl-compile --py taxi-cab-classification-pipeline.py --output taxi-cab-classification-pipeline.yaml

@@ -175,19 +175,19 @@ elif [ "$TEST_NAME" == "tfx" ]; then
fi

cd "${TEST_DIR}"
python3 run_tfx_test.py --input ${BASE_DIR}/samples/tfx/taxi-cab-classification-pipeline.yaml --result $SAMPLE_TFX_TEST_RESULT --output $SAMPLE_TFX_TEST_OUTPUT --namespace ${NAMESPACE}
python3 run_tfx_test.py --input ${BASE_DIR}/samples/core/tfx/taxi-cab-classification-pipeline.yaml --result $SAMPLE_TFX_TEST_RESULT --output $SAMPLE_TFX_TEST_OUTPUT --namespace ${NAMESPACE}
echo "Copy the test results to GCS ${RESULTS_GCS_DIR}/"
gsutil cp ${SAMPLE_TFX_TEST_RESULT} ${RESULTS_GCS_DIR}/${SAMPLE_TFX_TEST_RESULT}
elif [ "$TEST_NAME" == "sequential" ]; then
SAMPLE_SEQUENTIAL_TEST_RESULT=junit_SampleSequentialOutput.xml
SAMPLE_SEQUENTIAL_TEST_OUTPUT=${RESULTS_GCS_DIR}

# Compile samples
cd ${BASE_DIR}/samples/basic
cd ${BASE_DIR}/samples/core/sequential
dsl-compile --py sequential.py --output sequential.zip

cd "${TEST_DIR}"
python3 run_basic_test.py --input ${BASE_DIR}/samples/basic/sequential.zip --result $SAMPLE_SEQUENTIAL_TEST_RESULT --output $SAMPLE_SEQUENTIAL_TEST_OUTPUT --testname sequential --namespace ${NAMESPACE}
python3 run_basic_test.py --input ${BASE_DIR}/samples/core/sequential/sequential.zip --result $SAMPLE_SEQUENTIAL_TEST_RESULT --output $SAMPLE_SEQUENTIAL_TEST_OUTPUT --testname sequential --namespace ${NAMESPACE}

echo "Copy the test results to GCS ${RESULTS_GCS_DIR}/"
gsutil cp ${SAMPLE_SEQUENTIAL_TEST_RESULT} ${RESULTS_GCS_DIR}/${SAMPLE_SEQUENTIAL_TEST_RESULT}
@@ -196,11 +196,11 @@ elif [ "$TEST_NAME" == "condition" ]; then
SAMPLE_CONDITION_TEST_OUTPUT=${RESULTS_GCS_DIR}

# Compile samples
cd ${BASE_DIR}/samples/basic
cd ${BASE_DIR}/samples/core/condition
dsl-compile --py condition.py --output condition.zip

cd "${TEST_DIR}"
python3 run_basic_test.py --input ${BASE_DIR}/samples/basic/condition.zip --result $SAMPLE_CONDITION_TEST_RESULT --output $SAMPLE_CONDITION_TEST_OUTPUT --testname condition --namespace ${NAMESPACE}
python3 run_basic_test.py --input ${BASE_DIR}/samples/core/condition/condition.zip --result $SAMPLE_CONDITION_TEST_RESULT --output $SAMPLE_CONDITION_TEST_OUTPUT --testname condition --namespace ${NAMESPACE}

echo "Copy the test results to GCS ${RESULTS_GCS_DIR}/"
gsutil cp ${SAMPLE_CONDITION_TEST_RESULT} ${RESULTS_GCS_DIR}/${SAMPLE_CONDITION_TEST_RESULT}
@@ -209,11 +209,11 @@ elif [ "$TEST_NAME" == "exithandler" ]; then
SAMPLE_EXIT_HANDLER_TEST_OUTPUT=${RESULTS_GCS_DIR}

# Compile samples
cd ${BASE_DIR}/samples/basic
cd ${BASE_DIR}/samples/core/exit_handler
dsl-compile --py exit_handler.py --output exit_handler.zip

cd "${TEST_DIR}"
python3 run_basic_test.py --input ${BASE_DIR}/samples/basic/exit_handler.zip --result $SAMPLE_EXIT_HANDLER_TEST_RESULT --output $SAMPLE_EXIT_HANDLER_TEST_OUTPUT --testname exithandler --namespace ${NAMESPACE}
python3 run_basic_test.py --input ${BASE_DIR}/samples/core/exit_handler/exit_handler.zip --result $SAMPLE_EXIT_HANDLER_TEST_RESULT --output $SAMPLE_EXIT_HANDLER_TEST_OUTPUT --testname exithandler --namespace ${NAMESPACE}

echo "Copy the test results to GCS ${RESULTS_GCS_DIR}/"
gsutil cp ${SAMPLE_EXIT_HANDLER_TEST_RESULT} ${RESULTS_GCS_DIR}/${SAMPLE_EXIT_HANDLER_TEST_RESULT}
@@ -222,11 +222,11 @@ elif [ "$TEST_NAME" == "paralleljoin" ]; then
SAMPLE_PARALLEL_JOIN_TEST_OUTPUT=${RESULTS_GCS_DIR}

# Compile samples
cd ${BASE_DIR}/samples/basic
cd ${BASE_DIR}/samples/core/parallel_join
dsl-compile --py parallel_join.py --output parallel_join.zip

cd "${TEST_DIR}"
python3 run_basic_test.py --input ${BASE_DIR}/samples/basic/parallel_join.zip --result $SAMPLE_PARALLEL_JOIN_TEST_RESULT --output $SAMPLE_PARALLEL_JOIN_TEST_OUTPUT --testname paralleljoin --namespace ${NAMESPACE}
python3 run_basic_test.py --input ${BASE_DIR}/samples/core/parallel_join/parallel_join.zip --result $SAMPLE_PARALLEL_JOIN_TEST_RESULT --output $SAMPLE_PARALLEL_JOIN_TEST_OUTPUT --testname paralleljoin --namespace ${NAMESPACE}

echo "Copy the test results to GCS ${RESULTS_GCS_DIR}/"
gsutil cp ${SAMPLE_PARALLEL_JOIN_TEST_RESULT} ${RESULTS_GCS_DIR}/${SAMPLE_PARALLEL_JOIN_TEST_RESULT}
@@ -235,11 +235,11 @@ elif [ "$TEST_NAME" == "recursion" ]; then
SAMPLE_RECURSION_TEST_OUTPUT=${RESULTS_GCS_DIR}

# Compile samples
cd ${BASE_DIR}/samples/basic
cd ${BASE_DIR}/samples/core/recursion
dsl-compile --py recursion.py --output recursion.tar.gz

cd "${TEST_DIR}"
python3 run_basic_test.py --input ${BASE_DIR}/samples/basic/recursion.tar.gz --result $SAMPLE_RECURSION_TEST_RESULT --output $SAMPLE_RECURSION_TEST_OUTPUT --testname recursion --namespace ${NAMESPACE}
python3 run_basic_test.py --input ${BASE_DIR}/samples/core/recursion/recursion.tar.gz --result $SAMPLE_RECURSION_TEST_RESULT --output $SAMPLE_RECURSION_TEST_OUTPUT --testname recursion --namespace ${NAMESPACE}

echo "Copy the test results to GCS ${RESULTS_GCS_DIR}/"
gsutil cp ${SAMPLE_RECURSION_TEST_RESULT} ${RESULTS_GCS_DIR}/${SAMPLE_RECURSION_TEST_RESULT}
@@ -248,7 +248,7 @@ elif [ "$TEST_NAME" == "xgboost" ]; then
SAMPLE_XGBOOST_TEST_OUTPUT=${RESULTS_GCS_DIR}

# Compile samples
cd ${BASE_DIR}/samples/xgboost-spark
cd ${BASE_DIR}/samples/core/xgboost-spark

dsl-compile --py xgboost-training-cm.py --output xgboost-training-cm.yaml

@@ -263,7 +263,7 @@ elif [ "$TEST_NAME" == "xgboost" ]; then
sed -i -e "s|gcr.io/ml-pipeline/ml-pipeline-local-roc:\([a-zA-Z0-9_.-]\)\+|${LOCAL_ROC_IMAGE}|g" xgboost-training-cm.yaml
fi
cd "${TEST_DIR}"
python3 run_xgboost_test.py --input ${BASE_DIR}/samples/xgboost-spark/xgboost-training-cm.yaml --result $SAMPLE_XGBOOST_TEST_RESULT --output $SAMPLE_XGBOOST_TEST_OUTPUT --namespace ${NAMESPACE}
python3 run_xgboost_test.py --input ${BASE_DIR}/samples/core/xgboost-spark/xgboost-training-cm.yaml --result $SAMPLE_XGBOOST_TEST_RESULT --output $SAMPLE_XGBOOST_TEST_OUTPUT --namespace ${NAMESPACE}

echo "Copy the test results to GCS ${RESULTS_GCS_DIR}/"
gsutil cp ${SAMPLE_XGBOOST_TEST_RESULT} ${RESULTS_GCS_DIR}/${SAMPLE_XGBOOST_TEST_RESULT}
@@ -275,7 +275,7 @@ elif [ "$TEST_NAME" == "notebook-tfx" ]; then
DEPLOYER_MODEL=`cat /proc/sys/kernel/random/uuid`
DEPLOYER_MODEL=Notebook_tfx_taxi_`echo ${DEPLOYER_MODEL//-/_}`

cd ${BASE_DIR}/samples/notebooks
cd ${BASE_DIR}/samples/core/kubeflow_pipeline_using_TFX_OSS_components
export LC_ALL=C.UTF-8
export LANG=C.UTF-8
if [ -n "${DATAFLOW_TFT_IMAGE}" ];then
@@ -310,7 +310,7 @@ elif [ "$TEST_NAME" == "notebook-lightweight" ]; then
SAMPLE_NOTEBOOK_LIGHTWEIGHT_TEST_RESULT=junit_SampleNotebookLightweightOutput.xml
SAMPLE_NOTEBOOK_LIGHTWEIGHT_TEST_OUTPUT=${RESULTS_GCS_DIR}

cd ${BASE_DIR}/samples/notebooks
cd ${BASE_DIR}/samples/core/lightweight_component
export LC_ALL=C.UTF-8
export LANG=C.UTF-8
papermill --prepare-only -p EXPERIMENT_NAME notebook-lightweight -p PROJECT_NAME ml-pipeline-test -p KFP_PACKAGE /tmp/kfp.tar.gz Lightweight\ Python\ components\ -\ basics.ipynb notebook-lightweight.ipynb
@@ -327,7 +327,7 @@ elif [ "$TEST_NAME" == "notebook-typecheck" ]; then
SAMPLE_NOTEBOOK_TYPECHECK_TEST_RESULT=junit_SampleNotebookTypecheckOutput.xml
SAMPLE_NOTEBOOK_TYPECHECK_TEST_OUTPUT=${RESULTS_GCS_DIR}

cd ${BASE_DIR}/samples/notebooks
cd ${BASE_DIR}/samples/core/dsl_static_type_checking
export LC_ALL=C.UTF-8
export LANG=C.UTF-8
papermill --prepare-only -p KFP_PACKAGE /tmp/kfp.tar.gz DSL\ Static\ Type\ Checking.ipynb notebook-typecheck.ipynb