SageMaker endpoint magic command support (jupyterlab#215)

* WIP: New parameters for SageMaker Endpoint
* Uses symbolic constants consistently
* Updates sample notebook
* Updates docs, sample notebook
* Retitles section to be about magic commands
* Removes AWS_SESSION_TOKEN
* Links for more info
* Update docs/source/users/index.md

Co-authored-by: Piyush Jain <piyushjain@duck.com>

* Additional copy edits per @3coins
* Update docs/source/users/index.md

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
dbelgrod · Jun 8, 2023 · 2a3f727
1 parent ed95153

Showing 4 changed files with 200 additions and 7 deletions.
diff --git a/docs/source/users/index.md b/docs/source/users/index.md
@@ -48,7 +48,7 @@ Jupyter AI supports the following model providers:
 | HuggingFace Hub     | `huggingface_hub`    | `HUGGINGFACEHUB_API_TOKEN` | `huggingface_hub`, `ipywidgets`, `pillow` |
 | OpenAI              | `openai`             | `OPENAI_API_KEY`           | `openai`                        |
 | OpenAI (chat)       | `openai-chat`        | `OPENAI_API_KEY`           | `openai`                        |
-| SageMaker Endpoints | `sagemaker-endpoint` | N/A                        | `boto3`                         |
+| SageMaker          | `sagemaker-endpoint` | N/A                        | `boto3`                         |
 
 The environment variable names shown above are also the names of the settings keys used when setting up the chat interface.
 
@@ -177,20 +177,20 @@ To compose a message, type it in the text box at the bottom of the chat interfac
     alt='Screen shot of an example "Hello world" message sent to Jupyternaut, who responds with "Hello world, how are you today?"'
     class="screenshot" />
 
-### Usage with SageMaker Endpoints
+### Using the chat interface with SageMaker endpoints
 
-Jupyter AI supports language models hosted on SageMaker Endpoints that use JSON
-APIs. The first step is to authenticate with AWS via the `boto3` SDK and have
+Jupyter AI supports language models hosted on SageMaker endpoints that use JSON
+schemas. The first step is to authenticate with AWS via the `boto3` SDK and have
 the credentials stored in the `default` profile.  Guidance on how to do this can
 be found in the
 [`boto3` documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html).
 
-When selecting the SageMaker Endpoints provider in the settings panel, you will
+When selecting the SageMaker provider in the settings panel, you will
 see the following interface:
 
 <img src="../_static/chat-sagemaker-endpoints.png"
     width="50%"
-    alt='Screenshot of the settings panel with the SageMaker Endpoints provider selected.'
+    alt='Screenshot of the settings panel with the SageMaker provider selected.'
     class="screenshot" />
 
 Each of the additional fields under "Language model" is required. These fields
@@ -601,3 +601,26 @@ You can see a list of all aliases by running the `%ai list` command.
 Aliases' names can contain ASCII letters (uppercase and lowercase), numbers, hyphens, underscores, and periods. They may not contain colons. They may also not override built-in commands — run `%ai help` for a list of these commands.
 
 Aliases must refer to models or `LLMChain` objects; they cannot refer to other aliases.
+
+### Using magic commands with SageMaker endpoints
+
+You can use magic commands with models hosted using Amazon SageMaker.
+
+First, make sure that you've set your `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables either before starting JupyterLab or using the `%env` magic command within JupyterLab. For more information about environment variables, see [Environment variables to configure the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html) in AWS's documentation.
+
+Jupyter AI supports language models hosted on SageMaker endpoints that use JSON schemas. Authenticate with AWS via the `boto3` SDK and have the credentials stored in the `default` profile.  Guidance on how to do this can be found in the [`boto3` documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html).
+
+You will need to deploy a model in SageMaker, then provide it as the model name (as `sagemaker-endpoint:my-model-name`). See the [documentation on how to deploy a JumpStart model](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-deploy.html).
+
+All SageMaker endpoint requests require you to specify the `--region-name`, `--request-schema`, and `--response-path` options. The example below presumes that you have deployed a model called `jumpstart-dft-hf-text2text-flan-t5-xl`.
+
+```
+%%ai sagemaker-endpoint:jumpstart-dft-hf-text2text-flan-t5-xl --region-name=us-east-1 --request-schema={"text_inputs":"<prompt>"} --response-path=generated_texts.[0] -f code
+Write Python code to print "Hello world"
+```
+
+The `--region-name` parameter is set to the [AWS region code](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html) where the model is deployed, which in this case is `us-east-1`.
+
+The `--request-schema` parameter is the JSON object the endpoint expects as input, with the prompt being substituted into any value that matches the string literal `"<prompt>"`. For example, the request schema `{"text_inputs":"<prompt>"}` will submit a JSON object with the prompt stored under the `text_inputs` key.
+
+The `--response-path` option is a [JSONPath](https://goessner.net/articles/JsonPath/index.html) string that retrieves the language model's output from the endpoint's JSON response. For example, if your endpoint returns an object with the schema `{"generated_texts":["<output>"]}`, its response path is `generated_texts.[0]`.
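
Note (illustration only, not part of this commit): the `--request-schema` and `--response-path` descriptions above can be sketched in plain Python. The helper names `fill_request_schema` and `extract_response` are hypothetical, and the path walker handles only dotted keys and `[n]` indices, a small subset of JSONPath that is enough for `generated_texts.[0]`. How the provider actually performs these steps is not shown in this diff.

```python
import json
import re
from typing import Any


def fill_request_schema(schema: str, prompt: str) -> Any:
    """Parse the schema and replace every string value equal to "<prompt>"."""

    def substitute(node: Any) -> Any:
        if isinstance(node, dict):
            return {key: substitute(value) for key, value in node.items()}
        if isinstance(node, list):
            return [substitute(item) for item in node]
        if node == "<prompt>":
            return prompt
        return node

    return substitute(json.loads(schema))


def extract_response(response: Any, path: str) -> Any:
    """Walk a dotted path such as 'generated_texts.[0]' (simplified, not full JSONPath)."""
    current = response
    for segment in path.split("."):
        index = re.fullmatch(r"\[(\d+)\]", segment)
        # "[0]" selects a list element; any other segment is treated as a dict key.
        current = current[int(index.group(1))] if index else current[segment]
    return current


# Request body built from the schema used in the example cell above.
body = fill_request_schema('{"text_inputs":"<prompt>"}', 'Write Python code to print "Hello world"')

# Pull the model output from a response shaped like {"generated_texts": ["<output>"]}.
sample_response = {"generated_texts": ["print('Hello world')"]}
output = extract_response(sample_response, "generated_texts.[0]")
```

Under these assumptions, `body` holds the prompt under the `text_inputs` key and `output` is the first element of `generated_texts`.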
diff --git a/examples/sagemaker.ipynb b/examples/sagemaker.ipynb
@@ -0,0 +1,124 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "79ebfce1-f5ac-4e39-83db-11416e310e8e",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "# Jupyter AI with the SageMaker endpoint\n",
+    "\n",
+    "This demo showcases the IPython magics Jupyter AI provides out-of-the-box for Amazon SageMaker.\n",
+    "\n",
+    "First, make sure that you've set your `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables either before starting JupyterLab or using the `%env` magic command within JupyterLab.\n",
+    "\n",
+    "Then, load the IPython extension:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "24f3f446-2b1d-4802-a47c-d298c06fc86e",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "%load_ext jupyter_ai"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "id": "9f2b0270-1c33-4918-b534-4ec104f90141",
+   "metadata": {},
+   "source": [
+    "Jupyter AI supports language models hosted on SageMaker endpoints that use JSON APIs. Authenticate with AWS via the `boto3` SDK and have the credentials stored in the `default` profile.  Guidance on how to do this can be found in the [`boto3` documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html).\n",
+    "\n",
+    "You will need to deploy a model in SageMaker, then provide it as your model name (as `sagemaker-endpoint:my-model-name`). See the [documentation on how to deploy a JumpStart model](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-deploy.html).\n",
+    "\n",
+    "All SageMaker endpoint requests require you to specify the `--region-name`, `--request-schema`, and `--response-path` options.\n",
+    "\n",
+    "The `--region-name` parameter is set to the [AWS region code](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html), such as `us-east-1` or `eu-west-1`.\n",
+    "\n",
+    "The `--request-schema` parameter is the JSON object the endpoint expects as input, with the prompt being substituted into any value that matches the string literal `\"<prompt>\"`. For example, the request schema `{\"text_inputs\":\"<prompt>\"}` will submit a JSON object with the prompt stored under the `text_inputs` key.\n",
+    "\n",
+    "The `--response-path` option is a [JSONPath](https://goessner.net/articles/JsonPath/index.html) string that retrieves the language model's output from the endpoint's JSON response. For example, if your endpoint returns an object with the schema `{\"generated_texts\":[\"<output>\"]}`, its response path is `generated_texts.[0]`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "31f3e6e3-48cf-4e60-96d3-8b8e1dd34bec",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "AI generated code inserted below &#11015;&#65039;"
+      ],
+      "text/plain": [
+       "<IPython.core.display.HTML object>"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {
+      "text/html": {
+       "jupyter_ai": {
+        "model_id": "jumpstart-dft-hf-text2text-flan-t5-xl",
+        "provider_id": "sagemaker-endpoint"
+       }
+      }
+     },
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "%%ai sagemaker-endpoint:jumpstart-dft-hf-text2text-flan-t5-xl --region-name=us-east-1 --request-schema={\"text_inputs\":\"<prompt>\"} --response-path=generated_texts.[0] -f code\n",
+    "Write some Python code"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "42408ea8-7264-44bc-ac0c-6b5dd03134d6",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "a = [] b = [] c = [] d = ["
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4ad8c62e-b0a5-4091-94e3-4067ed8d6c4a",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.8"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/packages/jupyter-ai-magics/jupyter_ai_magics/magics.py b/packages/jupyter-ai-magics/jupyter_ai_magics/magics.py
@@ -478,11 +478,26 @@ def run_ai_cell(self, args: CellArgs, prompt: str):
                     f"An authentication token is required to use models from the {Provider.name} provider.\n"
                     f"Please specify it via `%env {auth_strategy.name}=token`. "
                 ) from None
-        
+
         # configure and instantiate provider
         provider_params = { "model_id": local_model_id }
         if provider_id == "openai-chat":
             provider_params["prefix_messages"] = self.transcript_openai
+        # for SageMaker, validate that required params are specified
+        if provider_id == "sagemaker-endpoint":
+            if (
+                args.region_name is None or
+                args.request_schema is None or
+                args.response_path is None
+            ):
+                raise ValueError(
+                    "When using the sagemaker-endpoint provider, you must specify all of " +
+                    "the --region-name, --request-schema, and --response-path options."
+                )
+            provider_params["region_name"] = args.region_name
+            provider_params["request_schema"] = args.request_schema
+            provider_params["response_path"] = args.response_path
+
         provider = Provider(**provider_params)
 
         # generate output from model via provider
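
Note (illustration only, not part of this commit): the validation added in the hunk above raises a single `ValueError` when any of the three options is missing. A possible variation, sketched below, also names the missing flags. `SageMakerArgs`, `missing_sagemaker_flags`, and `validate_sagemaker_args` are hypothetical names; the dataclass stands in for the three optional fields this commit adds to `CellArgs`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SageMakerArgs:
    """Hypothetical stand-in for the optional SageMaker fields on CellArgs."""
    region_name: Optional[str] = None
    request_schema: Optional[str] = None
    response_path: Optional[str] = None


# Attribute name -> long option name, as defined in parsers.py in this commit.
REQUIRED_FLAGS = {
    "region_name": "--region-name",
    "request_schema": "--request-schema",
    "response_path": "--response-path",
}


def missing_sagemaker_flags(args: SageMakerArgs) -> List[str]:
    """Return the long option names whose values were not supplied."""
    return [flag for attr, flag in REQUIRED_FLAGS.items() if getattr(args, attr) is None]


def validate_sagemaker_args(args: SageMakerArgs) -> None:
    """Raise ValueError listing every required option that is missing."""
    missing = missing_sagemaker_flags(args)
    if missing:
        raise ValueError(
            "When using the sagemaker-endpoint provider, you must also specify: "
            + ", ".join(missing)
        )
```

Listing the missing flags is a design choice for friendlier error messages; the commit itself reports all three options in one fixed message.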

diff --git a/packages/jupyter-ai-magics/jupyter_ai_magics/parsers.py b/packages/jupyter-ai-magics/jupyter_ai_magics/parsers.py
@@ -6,17 +6,42 @@
 FORMAT_CHOICES = list(get_args(FORMAT_CHOICES_TYPE))
 FORMAT_HELP = """IPython display to use when rendering output. [default="markdown"]"""
 
+REGION_NAME_SHORT_OPTION = '-n'
+REGION_NAME_LONG_OPTION = '--region-name'
+REGION_NAME_HELP = ("AWS region name, e.g. 'us-east-1'. Required for SageMaker provider; " +
+    "does nothing with other providers.")
+
+REQUEST_SCHEMA_SHORT_OPTION = '-q'
+REQUEST_SCHEMA_LONG_OPTION = '--request-schema'
+REQUEST_SCHEMA_HELP = ("The JSON object the endpoint expects, with the prompt being " +
+    "substituted into any value that matches the string literal '<prompt>'. " +
+    "Required for SageMaker provider; does nothing with other providers.")
+
+RESPONSE_PATH_SHORT_OPTION = '-p'
+RESPONSE_PATH_LONG_OPTION = '--response-path'
+RESPONSE_PATH_HELP = ("A JSONPath string that retrieves the language model's output " +
+    "from the endpoint's JSON response. Required for SageMaker provider; " +
+    "does nothing with other providers.")
+
 class CellArgs(BaseModel):
     type: Literal["root"] = "root"
     model_id: str
     format: FORMAT_CHOICES_TYPE
     reset: bool
+    # The following parameters are required only for SageMaker models
+    region_name: Optional[str]
+    request_schema: Optional[str]
+    response_path: Optional[str]
 
 # Should match CellArgs, but without "reset"
 class ErrorArgs(BaseModel):
     type: Literal["error"] = "error"
     model_id: str
     format: FORMAT_CHOICES_TYPE
+    # The following parameters are required only for SageMaker models
+    region_name: Optional[str]
+    request_schema: Optional[str]
+    response_path: Optional[str]
 
 class HelpArgs(BaseModel):
     type: Literal["help"] = "help"
@@ -60,6 +85,9 @@ def get_help(self, ctx):
     help="""Clears the conversation transcript used when interacting with an
     OpenAI chat model provider. Does nothing with other providers."""
 )
+@click.option(REGION_NAME_SHORT_OPTION, REGION_NAME_LONG_OPTION, required=False, help=REGION_NAME_HELP)
+@click.option(REQUEST_SCHEMA_SHORT_OPTION, REQUEST_SCHEMA_LONG_OPTION, required=False, help=REQUEST_SCHEMA_HELP)
+@click.option(RESPONSE_PATH_SHORT_OPTION, RESPONSE_PATH_LONG_OPTION, required=False, help=RESPONSE_PATH_HELP)
 def cell_magic_parser(**kwargs):
     """
     Invokes a language model identified by MODEL_ID, with the prompt being
@@ -84,6 +112,9 @@ def line_magic_parser():
     default="markdown",
     help=FORMAT_HELP
 )
+@click.option(REGION_NAME_SHORT_OPTION, REGION_NAME_LONG_OPTION, required=False, help=REGION_NAME_HELP)
+@click.option(REQUEST_SCHEMA_SHORT_OPTION, REQUEST_SCHEMA_LONG_OPTION, required=False, help=REQUEST_SCHEMA_HELP)
+@click.option(RESPONSE_PATH_SHORT_OPTION, RESPONSE_PATH_LONG_OPTION, required=False, help=RESPONSE_PATH_HELP)
 def error_subparser(**kwargs):
     """
     Explains the most recent error. Takes the same options (except -r) as