At the end of this tutorial, you will be able to
- set up a Bedrock training pipeline
- log training-time feature and inference distributions
- log model explainability and fairness metrics
- check model explainability and fairness from Bedrock web UI
- deploy a model endpoint over HTTPS that logs inference and feature distributions
- monitor the endpoint by simulating a query stream
The data can be downloaded from Kaggle. We have already uploaded the dataset to GCS and AWS.
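If you want to pull the dataset from cloud storage yourself, a minimal sketch using the google-cloud-storage client is shown below; the bucket and object names are placeholders, not the actual upload location.

```python
# Minimal sketch: download one file from GCS with google-cloud-storage.
# Bucket and object names are placeholders, not the actual upload location.
from google.cloud import storage


def download_dataset(bucket_name: str, blob_name: str, dest: str) -> None:
    """Download a single file from GCS to a local path."""
    client = storage.Client()
    client.bucket(bucket_name).blob(blob_name).download_to_filename(dest)


if __name__ == "__main__":
    download_dataset("your-bucket", "credit-risk/application_train.csv", "application_train.csv")
```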
You can refer to the notebook for an overview of Bedrock ModelAnalyzer.
Just follow the Bedrock quickstart guide from Step 2 onwards. You can test on either Google Cloud or AWS by setting `gcp` or `aws` in the `ENV_TYPE` field on the Run pipeline page.
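For reference, here is a sketch of how a training script might branch on `ENV_TYPE`, assuming the pipeline exposes the run parameter to the script as an environment variable; the data paths are hypothetical.

```python
import os

import pandas as pd

# Hypothetical sketch: pick the data location based on the ENV_TYPE run parameter,
# assuming Bedrock exposes run parameters to the script as environment variables.
ENV_TYPE = os.environ.get("ENV_TYPE", "gcp")

DATA_PATHS = {
    "gcp": "gs://your-bucket/credit-risk/application_train.csv",  # placeholder GCS path
    "aws": "s3://your-bucket/credit-risk/application_train.csv",  # placeholder S3 path
}

# pandas can read gs:// and s3:// paths directly when gcsfs / s3fs are installed.
df = pd.read_csv(DATA_PATHS[ENV_TYPE])
print(df.shape)
```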
After a successful run of the training pipeline, clicking on the corresponding "Model version" will bring you to the "Model" page. You can select between "Metrics", "Explainability", "Fairness" and "File listing". You can also download the model artefacts saved during training.
In Explainability, you will be able to visualise top feature attributions for the model at a global level as well as the SHAP dependence for selected features. You can view individual explainability by selecting the row index from the sampled dataset.
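The global attributions and dependence plots shown in the UI correspond to SHAP values. If you want to reproduce a similar view locally, here is a rough sketch with the shap library, assuming a tree-based model and a pandas DataFrame `x_test` (placeholder names for whatever the training script produces).

```python
import shap

# Hedged sketch: compute SHAP values for a tree-based model outside Bedrock.
# `model` and `x_test` are placeholders for the objects produced by training.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(x_test)

# For a binary classifier, shap_values may be a list with one array per class;
# keep the positive class in that case.
if isinstance(shap_values, list):
    shap_values = shap_values[1]

# Global view: mean absolute SHAP value per feature.
shap.summary_plot(shap_values, x_test, plot_type="bar")

# Dependence plot for one feature, similar to the per-feature view in the UI.
shap.dependence_plot("EXT_SOURCE_3", shap_values, x_test)
```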
In Fairness, you can visualise the fairness metrics on the Bedrock UI. You can select the protected attribute from the dropdown menu.
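For intuition, fairness metrics such as demographic parity and equal opportunity boil down to comparing prediction rates across groups of the protected attribute. A small pandas sketch is below; the column names are hypothetical.

```python
import pandas as pd


def demographic_parity(df: pd.DataFrame, protected: str, pred_col: str) -> pd.Series:
    """Positive-prediction rate per group of the protected attribute."""
    return df.groupby(protected)[pred_col].mean()


def equal_opportunity(df: pd.DataFrame, protected: str, pred_col: str, label_col: str) -> pd.Series:
    """True positive rate per group, computed on rows whose true label is positive."""
    positives = df[df[label_col] == 1]
    return positives.groupby(protected)[pred_col].mean()


# Example usage with hypothetical column names:
# rates = demographic_parity(results, protected="CODE_GENDER", pred_col="prediction")
# tprs = equal_opportunity(results, protected="CODE_GENDER", pred_col="prediction", label_col="TARGET")
```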
You can simulate a constant stream of queries with `query_stream.py`. Replace "MODEL_ENDPOINT_URL" and "MODEL_ENDPOINT_TOKEN" in `query_stream.py`, then run `python query_stream.py`.
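Conceptually, the script just posts requests to the endpoint in a loop. Here is a simplified sketch using requests; the payload fields and auth header are assumptions, and the actual `query_stream.py` may differ.

```python
import time

import requests

# Simplified sketch of a query stream; the real query_stream.py may differ.
MODEL_ENDPOINT_URL = "https://your-endpoint.example.com"  # replace with your endpoint URL
MODEL_ENDPOINT_TOKEN = "your-token"                       # replace with your endpoint token


def query_once(features: dict):
    """Send one inference request and return the parsed response."""
    response = requests.post(
        MODEL_ENDPOINT_URL,
        # Auth header name is an assumption; adjust to your endpoint's scheme.
        headers={"X-Bedrock-Api-Token": MODEL_ENDPOINT_TOKEN},
        json=features,
    )
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    while True:
        print(query_once({"EXT_SOURCE_3": 0.5}))  # placeholder payload
        time.sleep(1)
```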
On the "Endpoint" page, you can select between "API metrics", "Feature distribution" and "Inference distribution".
From API metrics, you can monitor the throughput, response time and error rates.
From Feature distribution, you can compare training-time and production-time distributions for each feature in the form of CDF and PDF plots. Note that the production plots will only appear after 15 minutes.
Similarly, from Inference distribution, you can compare training-time and production-time distributions of inferences.
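To see what the CDF comparison means, here is a purely illustrative sketch that plots the empirical CDF of one feature for two samples, standing in for training-time and production-time values.

```python
import numpy as np
import matplotlib.pyplot as plt


def empirical_cdf(values: np.ndarray):
    """Return sorted values and their cumulative probabilities."""
    x = np.sort(values)
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


# Illustrative only: stand-ins for training-time and production-time samples.
train_sample = np.random.normal(0.0, 1.0, size=1000)
prod_sample = np.random.normal(0.2, 1.0, size=1000)

for name, sample in [("training", train_sample), ("production", prod_sample)]:
    x, y = empirical_cdf(sample)
    plt.plot(x, y, label=name)

plt.xlabel("feature value")
plt.ylabel("cumulative probability")
plt.legend()
plt.show()
```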
The Streamlit app provides a demonstration of credit risk analysis.
The data is from Kaggle's Home Credit Default Risk competition.
The data processing and model training code is adapted from this Kaggle kernel.
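A minimal Streamlit skeleton in the same spirit is sketched below; the widget, artefact path and input columns are placeholders, not the actual app.

```python
import pickle

import pandas as pd
import streamlit as st

# Placeholder sketch of a credit-risk Streamlit app; the real app differs.
st.title("Credit Risk Analysis Demo")


@st.cache_resource  # assumes a recent Streamlit version
def load_model():
    with open("model.pkl", "rb") as f:  # hypothetical artefact path
        return pickle.load(f)


model = load_model()

ext_source_3 = st.slider("EXT_SOURCE_3", 0.0, 1.0, 0.5)
features = pd.DataFrame([{"EXT_SOURCE_3": ext_source_3}])  # placeholder single-feature input

if st.button("Predict"):
    score = model.predict_proba(features)[0, 1]
    st.write(f"Predicted default probability: {score:.3f}")
```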