In the Azure Portal, create a new Machine Learning service workspace resource:
- Workspace name: `azure-ml-bootcamp`
- Resource Group: `azure-ml-bootcamp`
- Location: `East US` (cheaper than `West Europe` and sufficient for our bootcamp)
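If you prefer scripting over clicking through the portal, the same workspace could also be created via the Azure Machine Learning SDK. This is just a sketch (the subscription ID is a placeholder, and the SDK has to be installed wherever you run this):

from azureml.core import Workspace

# Creates the workspace and, if needed, the resource group
ws = Workspace.create(name='azure-ml-bootcamp',
                      subscription_id='xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx',
                      resource_group='azure-ml-bootcamp',
                      create_resource_group=True,
                      location='eastus')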
Let's have a look at our Resource Group:
- Application Insights - used for monitoring our models in production (will be used later)
- Storage account - this will store our logs, model outputs, training/testing data, etc.
- Key vault - stores our secrets (will be used later)
- Machine Learning service workspace - the center point for Machine Learning on Azure
Inside our Machine Learning service workspace, we'll create a new Compute VM:

Hit `+ New`, keep the size as `STANDARD_D3_V2` and give it a unique name:
It'll take ~3 minutes until the VM is provisioned and ready to use. The primary reason for using this VM is that we all get the same Jupyter environment. In this exercise, we'll use it to train a simple Machine Learning model. In a real-world setup, we might consider a GPU-enabled instance in case we need to perform Deep Learning, or just rely on Azure Machine Learning Compute (challenge 2).
Once it is running, the UI will already give us links to `Jupyter`, `JupyterLab` and `RStudio`. To keep things simple, we'll use `Jupyter` throughout this bootcamp, but if you feel adventurous, use `JupyterLab` or solve the challenges in R with `RStudio`.
You'll be using your AAD (Azure Active Directory) user to log into `Jupyter`. From an enterprise security point of view, this is a big plus. No extra credentials needed! 🙌
Inside the newly created Compute VM, first create a new folder via the `New` button on the top right of Jupyter. Everything we'll do in this workshop should happen in this folder. This is because the Machine Learning service snapshots the whole contents of the experiment's folder for each run; if you run your Jupyter Notebooks in the root folder, the snapshot exceeds the size limit.
Note: The next block is not needed anymore (as of May 2019), but you'd need it if you want to connect to your Azure Machine Learning Workspace from e.g., your local machine. Since the Compute VM runs inside the workspace, it automatically connects to the workspace it lives in.
# Ignore this block, unless you run Jupyter directly on e.g., your laptop
{
"subscription_id": "xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx",
"resource_group": "azure-ml-bootcamp",
"workspace_name": "azure-ml-bootcamp"
}
The `config.json` is used by the Azure Machine Learning SDK to connect to your Azure Machine Learning workspace running in Azure.
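For reference, here is a minimal sketch of how the SDK would pick it up when running outside the workspace, assuming `config.json` sits next to your notebook:

from azureml.core import Workspace

# Reads config.json from the given path (by default, from_config also searches parent directories)
ws = Workspace.from_config(path='./config.json')
print(ws.name, ws.resource_group, ws.location, sep='\n')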
Finally, we can click the `New` button and create a new Notebook of type `Python 3.6 - AzureML`. A new browser tab should open up, and we can click the name `Untitled` and rename it to `challenge01.ipynb`.
Inside your `challenge01.ipynb` notebook, create a new cell:
from azureml.core import Workspace, Experiment, Run
ws = Workspace.from_config()
You can run or re-run any cell by hitting `Run` or pressing `Shift+Enter` or `Ctrl+Enter`. Code cells have brackets to their left. If the brackets are empty (`[ ]`), the code has not been run. While the code is running, you will see an asterisk (`[*]`). After the code completes, a number (`[1]`) appears, telling you in which order the cells ran. You can always re-run arbitrary cells in case something didn't work on the first try.
This first cell imports the relevant libraries from the Azure Machine Learning SDK, reads our `config.json`, and connects the notebook to our Machine Learning Workspace in Azure. You will need to authenticate to your Azure subscription:
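If the login prompt doesn't appear on its own, you can trigger it explicitly. A minimal sketch using `InteractiveLoginAuthentication` from the SDK:

from azureml.core import Workspace
from azureml.core.authentication import InteractiveLoginAuthentication

# Force a fresh interactive login, then connect to the workspace
auth = InteractiveLoginAuthentication(force=True)
ws = Workspace.from_config(auth=auth)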
Have a look at the following note when experiencing subscription ID errors (this should not happen anymore when using an Azure Compute VM):
!az account set -s "xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx"
Once you have run the cell, restart the Notebook kernel (`Kernel` --> `Restart & Clear Output`) and wait a few seconds until it has restarted.
Next, let's create a new experiment (this will show up in our Workspace after we've run the first experiment). This is where all our experiment runs will be logged to:
experiment = Experiment(workspace = ws, name = "scikit-learn-mnist")
Let's load some test data into our Compute VM (we'll do something more scalable in the next challenge):
import os
import urllib.request
os.makedirs('./data', exist_ok = True)
urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz', filename='./data/train-images.gz')
urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz', filename='./data/train-labels.gz')
urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz', filename='./data/test-images.gz')
urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz', filename='./data/test-labels.gz')
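If you want to double-check that the download worked, here's a quick, optional sanity check:

# List the four downloaded files and their sizes
import os
for f in sorted(os.listdir('./data')):
    print(f, os.path.getsize(os.path.join('./data', f)), 'bytes')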
Let's create a fourth cell for training our model:
import numpy as np
import gzip
import struct
from sklearn.linear_model import LogisticRegression
from sklearn.externals import joblib  # deprecated in newer scikit-learn versions; there, use `import joblib` instead
# load compressed MNIST gz files we just downloaded and return numpy arrays
def load_data(filename, label=False):
    with gzip.open(filename) as gz:
        # Skip the magic number, then read the item count
        struct.unpack('I', gz.read(4))
        n_items = struct.unpack('>I', gz.read(4))
        if not label:
            n_rows = struct.unpack('>I', gz.read(4))[0]
            n_cols = struct.unpack('>I', gz.read(4))[0]
            res = np.frombuffer(gz.read(n_items[0] * n_rows * n_cols), dtype=np.uint8)
            res = res.reshape(n_items[0], n_rows * n_cols)
        else:
            res = np.frombuffer(gz.read(n_items[0]), dtype=np.uint8)
            res = res.reshape(n_items[0], 1)
    return res
# We need to scale our data to values between 0 and 1
X_train = load_data('./data/train-images.gz', False) / 255.0
y_train = load_data('./data/train-labels.gz', True).reshape(-1)
X_test = load_data('./data/test-images.gz', False) / 255.0
y_test = load_data('./data/test-labels.gz', True).reshape(-1)
# Tell our Azure ML Workspace that a new run is starting
run = experiment.start_logging()
# Create a Logistic Regression classifier and train it
clf = LogisticRegression(multi_class='auto')
clf.fit(X_train, y_train)
# Predict classes of our testing dataset
y_pred = clf.predict(X_test)
# Calculate accuracy
acc = np.average(y_pred == y_test)
print('Accuracy is', acc)
# Log accuracy to our Azure ML Workspace
run.log('accuracy', acc)
# Tell our Azure ML Workspace that the run has completed
run.complete()
On our `STANDARD_D3_V2` instance, the code should take around one minute to run (any warnings you get can be ignored).
In summary, the code does the following things:
- Imports `sklearn` (scikit-learn) as the Machine Learning framework
- Creates a helper function for loading our data (`load_data(...)`)
- Loads our MNIST train and test data, and scales all values to `[0, 1]`
- Tells our Azure ML Experiment to start logging a training run
- Creates a `LogisticRegression`-based classifier and trains it using the training data
- Uses the classifier to predict the numbers in the test dataset
- Compares the predictions to the ground truth and calculates the accuracy score
- Logs the accuracy to our run and finishes the run
As we can see, our model achieves ~92% accuracy, which is actually pretty low for the MNIST dataset - we'll get back to this in the next challenge!
In the Azure ML Workspace, we can see that our experiment is finally showing up:
Inside our experiment, we can see our first run:
If we click the run number, we can see its details:
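You can also get the direct link to the run from code, via the SDK's `get_portal_url()`:

# Prints a direct link to this run in the Azure Portal
print(run.get_portal_url())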
We can track more values or even time series, which would directly show up as diagrams. However, as we want to keep the code short, we'll skip this part for now (more on that in challenge 2).
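Just to illustrate what this could look like, here is a small sketch (the metric names and values are made up, and these calls would need to happen before `run.complete()`):

# A single additional scalar value for this run
run.log('regularization', 1.0)

# Logging the same name repeatedly creates a time series, rendered as a chart
for step, loss in enumerate([0.9, 0.5, 0.3, 0.25]):
    run.log('loss', loss)

# Or log a whole list of values in one call
run.log_list('per_class_accuracy', [0.95, 0.97, 0.91])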
Finally, we can export our model and upload it to our Azure ML Workspace in the `outputs` directory:
from sklearn.externals import joblib
# Write model to disk
joblib.dump(value=clf, filename='scikit-learn-mnist.pkl')
# Upload our model to our experiment
run.upload_file(name = 'outputs/scikit-learn-mnist.pkl', path_or_stream = './scikit-learn-mnist.pkl')
In the portal, we can now see the output of our run:
We can also query our tracked metrics and outputs for our current run:
print("Run metrics:", run.get_metrics())
print("Run model files", run.get_file_names())
As a last step, we can register (version, tag, and store) our model in our workspace:
model = run.register_model(model_name='scikit-learn-mnist-model', model_path='outputs/scikit-learn-mnist.pkl')
print(model.name, model.id, model.version, sep = '\t')
We probably would not do this for every model we train, but for those that we want to promote to the next stage and potentially consider for deployment.
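As a sketch of why registration is useful: once registered, the model can be retrieved by name later on, e.g., from a deployment notebook (assuming the same workspace object `ws`):

from azureml.core.model import Model

# Fetches the latest registered version by name
model = Model(ws, name='scikit-learn-mnist-model')
print(model.name, model.version)

# Downloads the .pkl locally if needed
model.download(target_dir='.', exist_ok=True)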
Under the `Models` tab, we can now see that our model has been registered:
Our model has been stored in the Storage Account that has been created initially for us:
At this point:
- We've trained a Machine Learning model using scikit-learn inside a Compute VM running `Jupyter`
- We achieved ~92% accuracy (not very good for this data set)
- Azure ML knows about our experiment and our initial run and tracked metrics
- Azure ML saved our model file (`scikit-learn-mnist.pkl`) in Blob storage
- We have registered our initial model as an Azure ML Model in our Workspace
If we have another look into our resource group `azure-ml-bootcamp`, we can see that the Compute VM actually sits inside this group. It is just a regular Azure Virtual Machine:
Furthermore, we can go into our Workspace and also see it listed under `Compute`:
In the next challenge, we'll build a more powerful model and use Azure Machine Learning Compute to train it on a remote cluster.