rapidsai · rapids-bot · May 28, 2021 · May 21, 2021 · May 27, 2021 · dantegd
diff --git a/docs/source/pickling_cuml_models.ipynb b/docs/source/pickling_cuml_models.ipynb
@@ -183,6 +183,70 @@
    "source": [
     "single_gpu_model.cluster_centers_"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Exporting cuML Random Forest models for inference on machines without GPUs"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Starting with cuML version 21.06, you can export cuML Random Forest models and run predictions with them on machines without an NVIDIA GPU. The [Treelite](https://github.com/dmlc/treelite) package defines an efficient exchange format that lets you move cuML Random Forest models to other machines. We will refer to files in this exchange format as \"checkpoints.\"\n",
+    "\n",
+    "Here are the steps to export the model:\n",
+    "\n",
+    "1. Call `convert_to_treelite_model()` on the cuML Random Forest model, then call `to_treelite_checkpoint()` on the result to write the checkpoint file."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from cuml.ensemble import RandomForestClassifier as cumlRandomForestClassifier\n",
+    "from sklearn.datasets import load_iris\n",
+    "import numpy as np\n",
+    "\n",
+    "X, y = load_iris(return_X_y=True)\n",
+    "X, y = X.astype(np.float32), y.astype(np.int32)\n",
+    "clf = cumlRandomForestClassifier(max_depth=3, random_state=0, n_estimators=10)\n",
+    "clf.fit(X, y)\n",
+    "\n",
+    "checkpoint_path = './checkpoint.tl'\n",
+    "# Export cuML RF model as Treelite checkpoint\n",
+    "clf.convert_to_treelite_model().to_treelite_checkpoint(checkpoint_path)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "2. Copy the generated checkpoint file `checkpoint.tl` to another machine on which you'd like to run predictions.\n",
+    "\n",
+    "3. On the target machine, install Treelite by running `pip install treelite` or `conda install -c conda-forge treelite`. The machine does not need an NVIDIA GPU and does not need cuML installed.\n",
+    "\n",
+    "4. Load the model from the checkpoint and run predictions with it by executing the following on the target machine:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import treelite\n",
+    "\n",
+    "# The checkpoint file has been copied over; X must hold the input features on this machine\n",
+    "checkpoint_path = './checkpoint.tl'\n",
+    "tl_model = treelite.Model.deserialize(checkpoint_path)\n",
+    "out_prob = treelite.gtil.predict(tl_model, X, pred_margin=True)\n",
+    "print(out_prob)"
+   ]
   }
  ],
  "metadata": {
@@ -201,7 +265,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.6"
+   "version": "3.8.8"
   }
  },
  "nbformat": 4,

@@ -134,6 +134,11 @@ class RandomForestClassifier(BaseRandomForestModel,
       histogram-based algorithm to determine splits, rather than an exact
       count. You can tune the size of the histograms with the n_bins parameter.
 
+    .. note:: You can export cuML Random Forest models and run predictions
+      with them on machines without an NVIDIA GPU. See
+      https://docs.rapids.ai/api/cuml/nightly/pickling_cuml_models.html
+      for more details.
+
     **Known Limitations**: This is an early release of the cuML
     Random Forest code. It contains a few known limitations:
 

@@ -117,6 +117,11 @@ class RandomForestRegressor(BaseRandomForestModel,
       histogram-based algorithm to determine splits, rather than an exact
       count. You can tune the size of the histograms with the n_bins parameter.
 
+    .. note:: You can export cuML Random Forest models and run predictions
+      with them on machines without an NVIDIA GPU. See
+      https://docs.rapids.ai/api/cuml/nightly/pickling_cuml_models.html
+      for more details.
+
     **Known Limitations**: This is an early release of the cuML
     Random Forest code. It contains a few known limitations: