Feat/rebase master #4498

Merged
Changes from 41 commits
42 commits
2b29f86
prepare release v2.210.0
Feb 28, 2024
c6e93f9
update development version to v2.210.1.dev0
Feb 28, 2024
2f1bed0
feat: Add new Triton DLC URIs (#4432)
nikhil-sk Feb 28, 2024
bb48c73
feat: Support selective pipeline execution between function step and …
qidewenwhen Feb 29, 2024
c4d6c65
feat: Add AutoMLV2 support (#4461)
repushko Mar 1, 2024
b4a726f
feature: Add TensorFlow 2.14 image configs (#4446)
saimidu Mar 1, 2024
00327fc
fix: remove enable_network_isolation from the python doc (#4465)
rohangujarathi Mar 1, 2024
4e5155c
doc: Add doc for new feature processor APIs and classes (#4250)
can-sun Mar 1, 2024
ac4e861
fix: properly close sagemaker config file after loading config (#4457)
jmahlik Mar 4, 2024
b857ead
feat: instance specific jumpstart host requirements (#4397)
evakravi Mar 4, 2024
72fd0fa
change: Bump Apache Airflow version to 2.8.2 (#4470)
knikure Mar 4, 2024
892ba38
fix: make sure gpus are found in local_gpu run (#4384)
gverkes Mar 4, 2024
9a26978
feat: pin dll version to support python3.11 to the sdk (#4472)
akrishna1995 Mar 4, 2024
69a9fcd
fix: Skip No Canvas regions for test_deploy_best_candidate (#4477)
knikure Mar 5, 2024
0a48a8a
prepare release v2.211.0
Mar 5, 2024
8036ad3
update development version to v2.211.1.dev0
Mar 5, 2024
55940ad
change: Enhance model builder selection logic to include model size (…
samruds Mar 6, 2024
45a471f
change: Upgrade smp to version 2.2 (#4479)
adtian2 Mar 6, 2024
b426c21
feat: Update SM Python SDK for PT 2.2.0 SM DLC (#4481)
sirutBuasai Mar 6, 2024
7000f25
fix: Create custom tarfile extractall util to fix backward compatibil…
knikure Mar 6, 2024
48d501f
prepare release v2.212.0
Mar 6, 2024
4c8e0fc
update development version to v2.212.1.dev0
Mar 6, 2024
65f2ddf
change: Update tblib constraint (#4452)
dbushy727 Mar 7, 2024
3e9e04d
fix: make unit tests compatible with pytest-xdist (#4486)
benieric Mar 7, 2024
554d720
feature: Add overriding logic in ModelBuilder when task is provided (…
xiongz945 Mar 8, 2024
28a1665
feature: Accept user-defined env variables for the entry-point (#4175)
martinRenou Mar 8, 2024
c8f78f7
fix: Move sagemaker pysdk version check after bootstrap in remote job…
qidewenwhen Mar 11, 2024
f9fb1b9
change: enable github actions for PRs (#4489)
benieric Mar 12, 2024
e95ed65
feature: Add ModelDataSource and SourceUri support for model package …
mrudulmn Mar 12, 2024
525e9ae
feat: support JumpStart proprietary models (#4467)
Captainia Mar 12, 2024
95780d3
chore: emit warning when no instance specific gated training env var …
evakravi Mar 13, 2024
2d638eb
change: bump jinja2 to 3.1.3 in doc/requirments.txt (#4421) (#4423)
evakravi Feb 16, 2024
29718b4
feat: add hub and hubcontent support in retrieval function for jumpst…
bencrabtree Feb 21, 2024
d7c4307
feat: jsch jumpstart estimator support (#4439)
bencrabtree Feb 26, 2024
92e35c8
Master jumpstart curated hub (#4464)
bencrabtree Feb 28, 2024
d2f72a2
add hub_arn support for accept_types, content_types, serializers, des…
bencrabtree Feb 28, 2024
52eae82
feature: JumpStart CuratedHub class creation and function definitions…
jinyoung-lim Feb 29, 2024
c450168
MultiPartCopy with Sync Algorithm (#4475)
bencrabtree Mar 12, 2024
da1b642
rebase with master
bencrabtree Mar 13, 2024
709bedc
bad rebase
bencrabtree Mar 13, 2024
d2dd9be
trying to fix codecov
bencrabtree Mar 15, 2024
fa6a3ba
uncomment codebuild-ci
bencrabtree Mar 15, 2024
48 changes: 48 additions & 0 deletions .github/workflows/codebuild-ci.yml
@@ -0,0 +1,48 @@
# name: PR Checks
# on:
# pull_request_target:

# concurrency:
# group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.head_ref }}
# cancel-in-progress: true

# permissions:
# id-token: write # This is required for requesting the JWT

# jobs:
# codestyle-doc-tests:
# runs-on: ubuntu-latest
# steps:
# - name: Configure AWS Credentials
# uses: aws-actions/configure-aws-credentials@v4
# with:
# role-to-assume: ${{ secrets.CI_AWS_ROLE_ARN }}
# aws-region: us-west-2
# role-duration-seconds: 10800
# - name: Run Codestyle & Doc Tests
# uses: aws-actions/aws-codebuild-run-build@v1
# with:
# project-name: sagemaker-python-sdk-ci-codestyle-doc-tests
# source-version-override: 'pr/${{ github.event.pull_request.number }}'
# unit-tests:
# runs-on: ubuntu-latest
# strategy:
# fail-fast: false
# matrix:
# python-version: ["py38", "py39", "py310"]
# steps:
# - name: Configure AWS Credentials
# uses: aws-actions/configure-aws-credentials@v4
# with:
# role-to-assume: ${{ secrets.CI_AWS_ROLE_ARN }}
# aws-region: us-west-2
# role-duration-seconds: 10800
# - name: Run Unit Tests
# uses: aws-actions/aws-codebuild-run-build@v1
# with:
# project-name: sagemaker-python-sdk-ci-unit-tests
# source-version-override: 'pr/${{ github.event.pull_request.number }}'
# env-vars-for-codebuild: |
# PY_VERSION
# env:
# PY_VERSION: ${{ matrix.python-version }}
50 changes: 50 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,55 @@
# Changelog

## v2.212.0 (2024-03-06)

### Features

- Update SM Python SDK for PT 2.2.0 SM DLC

### Bug Fixes and Other Changes

- Create custom tarfile extractall util to fix backward compatibility issue
- Upgrade smp to version 2.2
- Enhance model builder selection logic to include model size

## v2.211.0 (2024-03-05)

### Features

- pin dll version to support python3.11 to the sdk
- instance specific jumpstart host requirements
- Add TensorFlow 2.14 image configs
- Add AutoMLV2 support
- Support selective pipeline execution between function step and regular step
- Add new Triton DLC URIs

### Bug Fixes and Other Changes

- Skip No Canvas regions for test_deploy_best_candidate
- make sure gpus are found in local_gpu run
- Bump Apache Airflow version to 2.8.2
- properly close sagemaker config file after loading config
- remove enable_network_isolation from the python doc

### Documentation Changes

- Add doc for new feature processor APIs and classes

## v2.210.0 (2024-02-28)

### Features

- Prepend SageMaker Studio App Type to boto3 User Agent string
- TGI optimum 0.0.18 (general+llm)
- TGI 1.4.2

### Bug Fixes and Other Changes

- tolerate vulnerable old model for integ test and temporarily skip test_list_jumpstart_models_script_filter
- add missing regions to pytorch config
- Add validation for sagemaker version on remote job
- fixed implementation of fail_on_violation for transform with monitoring

## v2.209.0 (2024-02-24)

### Features
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
2.209.1.dev0
2.212.1.dev0
18 changes: 16 additions & 2 deletions doc/api/prep_data/feature_store.rst
@@ -60,6 +60,7 @@ Feature Definition
:members:
:show-inheritance:


Inputs
******

@@ -181,9 +182,13 @@ Feature Processor Data Source
:members:
:show-inheritance:

.. autoclass:: sagemaker.feature_store.feature_processor.PySparkDataSource
:members:
:show-inheritance:

Feature Processor Scheduler
***************************

Feature Processor Scheduler and Triggers
****************************************

.. automethod:: sagemaker.feature_store.feature_processor.to_pipeline

@@ -196,3 +201,12 @@ Feature Processor Scheduler
.. automethod:: sagemaker.feature_store.feature_processor.describe

.. automethod:: sagemaker.feature_store.feature_processor.list_pipelines

.. automethod:: sagemaker.feature_store.feature_processor.put_trigger

.. automethod:: sagemaker.feature_store.feature_processor.enable_trigger

.. automethod:: sagemaker.feature_store.feature_processor.disable_trigger

.. automethod:: sagemaker.feature_store.feature_processor.delete_trigger
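
The additions above register the new PySparkDataSource class and the trigger helpers alongside the existing scheduler methods. A minimal sketch of how the listing, describe, and trigger helpers might be combined follows; the function names are confirmed by the autodoc entries above, but the keyword arguments and the pipeline name are illustrative assumptions, not signatures taken from this PR.

# Hedged sketch only: function names come from the autodoc entries above;
# the argument names and the pipeline name are assumptions for illustration.
from sagemaker.feature_store.feature_processor import (
    describe,
    disable_trigger,
    enable_trigger,
    list_pipelines,
)

# Enumerate feature-processor pipelines registered in the account.
for pipeline in list_pipelines():
    print(pipeline)

# Inspect one pipeline, then toggle its trigger (pipeline name is hypothetical).
print(describe(pipeline_name="my-feature-pipeline"))
enable_trigger(pipeline_name="my-feature-pipeline")
disable_trigger(pipeline_name="my-feature-pipeline")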

7 changes: 7 additions & 0 deletions doc/api/training/automlv2.rst
@@ -0,0 +1,7 @@
AutoMLV2
--------

.. automodule:: sagemaker.automl.automlv2
:members:
:undoc-members:
:show-inheritance:
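
AutoMLV2 and its problem-type configs are also re-exported from the top-level sagemaker package later in this diff (src/sagemaker/__init__.py). A rough usage sketch under that assumption; the constructor and fit() arguments shown here are illustrative guesses, not taken from this PR.

# Hedged sketch: class names come from the imports added to sagemaker/__init__.py
# in this PR; the constructor and fit() arguments are illustrative assumptions.
from sagemaker import AutoMLV2, AutoMLDataChannel, AutoMLTabularConfig

automl = AutoMLV2(
    problem_config=AutoMLTabularConfig(
        target_attribute_name="target",  # hypothetical label column
    ),
    base_job_name="automl-v2-demo",      # hypothetical job name
)

# Train on a tabular dataset in S3 (URI and channel fields are hypothetical).
automl.fit(
    inputs=[
        AutoMLDataChannel(
            s3_data_type="S3Prefix",
            s3_uri="s3://my-bucket/train/",
            channel_type="training",
        )
    ]
)

Each AutoML*Config class imported in this PR maps to one problem type, so swapping AutoMLTabularConfig for, say, AutoMLTextClassificationConfig would target a different task with the same job interface.
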
1 change: 1 addition & 0 deletions doc/api/training/index.rst
@@ -8,6 +8,7 @@ Training APIs
algorithm
analytics
automl
automlv2
debugger
estimators
tuner
83 changes: 68 additions & 15 deletions doc/doc_utils/jumpstart_doc_utils.py
@@ -74,9 +74,12 @@ class Frameworks(str, Enum):

JUMPSTART_REGION = "eu-west-2"
SDK_MANIFEST_FILE = "models_manifest.json"
PROPRIETARY_SDK_MANIFEST_FILE = "proprietary-sdk-manifest.json"
JUMPSTART_BUCKET_BASE_URL = "https://jumpstart-cache-prod-{}.s3.{}.amazonaws.com".format(
JUMPSTART_REGION, JUMPSTART_REGION
)
PROPRIETARY_DOC_BUCKET = "https://jumpstart-cache-prod-us-west-2.s3.us-west-2.amazonaws.com"

TASK_MAP = {
Tasks.IC: ProblemTypes.IMAGE_CLASSIFICATION,
Tasks.IC_EMBEDDING: ProblemTypes.IMAGE_EMBEDDING,
@@ -152,18 +155,26 @@ class Frameworks(str, Enum):
}


def get_jumpstart_sdk_manifest():
url = "{}/{}".format(JUMPSTART_BUCKET_BASE_URL, SDK_MANIFEST_FILE)
def get_public_s3_json_object(url):
with request.urlopen(url) as f:
models_manifest = f.read().decode("utf-8")
return json.loads(models_manifest)


def get_jumpstart_sdk_spec(key):
url = "{}/{}".format(JUMPSTART_BUCKET_BASE_URL, key)
with request.urlopen(url) as f:
model_spec = f.read().decode("utf-8")
return json.loads(model_spec)
def get_jumpstart_sdk_manifest():
return get_public_s3_json_object(f"{JUMPSTART_BUCKET_BASE_URL}/{SDK_MANIFEST_FILE}")


def get_proprietary_sdk_manifest():
return get_public_s3_json_object(f"{PROPRIETARY_DOC_BUCKET}/{PROPRIETARY_SDK_MANIFEST_FILE}")


def get_jumpstart_sdk_spec(s3_key: str):
return get_public_s3_json_object(f"{JUMPSTART_BUCKET_BASE_URL}/{s3_key}")


def get_proprietary_sdk_spec(s3_key: str):
return get_public_s3_json_object(f"{PROPRIETARY_DOC_BUCKET}/{s3_key}")


def get_model_task(id):
@@ -196,6 +207,45 @@ def get_model_source(url):
return "Source"


def create_proprietary_model_table():
proprietary_content_intro = []
proprietary_content_intro.append("\n")
proprietary_content_intro.append(".. list-table:: Available Proprietary Models\n")
proprietary_content_intro.append(" :widths: 50 20 20 20 20\n")
proprietary_content_intro.append(" :header-rows: 1\n")
proprietary_content_intro.append(" :class: datatable\n")
proprietary_content_intro.append("\n")
proprietary_content_intro.append(" * - Model ID\n")
proprietary_content_intro.append(" - Fine Tunable?\n")
proprietary_content_intro.append(" - Supported Version\n")
proprietary_content_intro.append(" - Min SDK Version\n")
proprietary_content_intro.append(" - Source\n")

sdk_manifest = get_proprietary_sdk_manifest()
sdk_manifest_top_versions_for_models = {}

for model in sdk_manifest:
if model["model_id"] not in sdk_manifest_top_versions_for_models:
sdk_manifest_top_versions_for_models[model["model_id"]] = model
else:
if str(sdk_manifest_top_versions_for_models[model["model_id"]]["version"]) < str(
model["version"]
):
sdk_manifest_top_versions_for_models[model["model_id"]] = model

proprietary_content_entries = []
for model in sdk_manifest_top_versions_for_models.values():
model_spec = get_proprietary_sdk_spec(model["spec_key"])
proprietary_content_entries.append(" * - {}\n".format(model_spec["model_id"]))
proprietary_content_entries.append(" - {}\n".format(False)) # TODO: support training
proprietary_content_entries.append(" - {}\n".format(model["version"]))
proprietary_content_entries.append(" - {}\n".format(model["min_version"]))
proprietary_content_entries.append(
" - `{} <{}>`__ |external-link|\n".format("Source", model_spec.get("url"))
)
return proprietary_content_intro + proprietary_content_entries + ["\n"]


def create_jumpstart_model_table():
sdk_manifest = get_jumpstart_sdk_manifest()
sdk_manifest_top_versions_for_models = {}
@@ -249,19 +299,19 @@ def create_jumpstart_model_table():
file_content_intro.append(" - Source\n")

dynamic_table_files = []
file_content_entries = []
open_weight_content_entries = []

for model in sdk_manifest_top_versions_for_models.values():
model_spec = get_jumpstart_sdk_spec(model["spec_key"])
model_task = get_model_task(model_spec["model_id"])
string_model_task = get_string_model_task(model_spec["model_id"])
model_source = get_model_source(model_spec["url"])
file_content_entries.append(" * - {}\n".format(model_spec["model_id"]))
file_content_entries.append(" - {}\n".format(model_spec["training_supported"]))
file_content_entries.append(" - {}\n".format(model["version"]))
file_content_entries.append(" - {}\n".format(model["min_version"]))
file_content_entries.append(" - {}\n".format(model_task))
file_content_entries.append(
open_weight_content_entries.append(" * - {}\n".format(model_spec["model_id"]))
open_weight_content_entries.append(" - {}\n".format(model_spec["training_supported"]))
open_weight_content_entries.append(" - {}\n".format(model["version"]))
open_weight_content_entries.append(" - {}\n".format(model["min_version"]))
open_weight_content_entries.append(" - {}\n".format(model_task))
open_weight_content_entries.append(
" - `{} <{}>`__ |external-link|\n".format(model_source, model_spec["url"])
)

@@ -299,7 +349,10 @@ def create_jumpstart_model_table():
f.writelines(file_content_single_entry)
f.close()

proprietary_content_entries = create_proprietary_model_table()

f = open("doc_utils/pretrainedmodels.rst", "a")
f.writelines(file_content_intro)
f.writelines(file_content_entries)
f.writelines(open_weight_content_entries)
f.writelines(proprietary_content_entries)
f.close()
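
With get_public_s3_json_object() factored out, the open-weight and proprietary tables are built from the same fetch path and appended to the same RST file. A small driver sketch, assuming the script is run from the doc/ directory so the relative doc_utils/pretrainedmodels.rst path resolves:

# Hedged sketch: regenerate the pretrained-model tables locally.
# Assumes the working directory is doc/, matching the relative output path above.
from doc_utils.jumpstart_doc_utils import create_jumpstart_model_table

# Writes the open-weight entries and then the new proprietary table
# into doc_utils/pretrainedmodels.rst, as the function above shows.
create_jumpstart_model_table()
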
1 change: 1 addition & 0 deletions doc/requirements.txt
@@ -4,3 +4,4 @@ docutils==0.15.2
packaging==20.9
jinja2==3.1.3
schema==0.7.5
accelerate>=0.24.1,<=0.27.0
1 change: 1 addition & 0 deletions requirements/extras/huggingface_requirements.txt
@@ -0,0 +1 @@
accelerate>=0.24.1,<=0.27.0
3 changes: 2 additions & 1 deletion requirements/extras/test_requirements.txt
@@ -12,7 +12,7 @@ awslogs==0.14.0
black==22.3.0
stopit==1.1.2
# Update tox.ini to have correct version of airflow constraints file
apache-airflow==2.8.1
apache-airflow==2.8.2
apache-airflow-providers-amazon==7.2.1
attrs>=23.1.0,<24
fabric==2.6.0
@@ -39,3 +39,4 @@ tritonclient[http]<2.37.0
onnx==1.14.1
# tf2onnx==1.15.1
nbformat>=5.9,<6
accelerate>=0.24.1,<=0.27.0
3 changes: 2 additions & 1 deletion setup.py
@@ -63,7 +63,7 @@ def read_requirements(filename):
"PyYAML~=6.0",
"jsonschema",
"platformdirs",
"tblib>=1.7.0,<3",
"tblib>=1.7.0,<4",
"urllib3>=1.26.8,<3.0.0",
"requests",
"docker",
@@ -79,6 +79,7 @@ "feature-processor": read_requirements(
"feature-processor": read_requirements(
"requirements/extras/feature-processor_requirements.txt"
),
"huggingface": read_requirements("requirements/extras/huggingface_requirements.txt"),
}
# Meta dependency groups
extras["all"] = [item for group in extras.values() for item in group]
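
Because the new "huggingface" extra reads requirements/extras/huggingface_requirements.txt, a release containing this change would let users pull in the accelerate pin with something roughly like pip install "sagemaker[huggingface]", which adds accelerate>=0.24.1,<=0.27.0 on top of the core dependencies.
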
11 changes: 11 additions & 0 deletions src/sagemaker/__init__.py
@@ -61,6 +61,17 @@

from sagemaker.automl.automl import AutoML, AutoMLJob, AutoMLInput # noqa: F401
from sagemaker.automl.candidate_estimator import CandidateEstimator, CandidateStep # noqa: F401
from sagemaker.automl.automlv2 import ( # noqa: F401
AutoMLV2,
AutoMLJobV2,
LocalAutoMLDataChannel,
AutoMLDataChannel,
AutoMLTimeSeriesForecastingConfig,
AutoMLImageClassificationConfig,
AutoMLTabularConfig,
AutoMLTextClassificationConfig,
AutoMLTextGenerationConfig,
)

from sagemaker.debugger import ProfilerConfig, Profiler # noqa: F401

3 changes: 3 additions & 0 deletions src/sagemaker/accept_types.py
@@ -16,6 +16,7 @@

from sagemaker.jumpstart import artifacts, utils as jumpstart_utils
from sagemaker.jumpstart.constants import DEFAULT_JUMPSTART_SAGEMAKER_SESSION
from sagemaker.jumpstart.enums import JumpStartModelType
from sagemaker.session import Session


@@ -80,6 +81,7 @@ def retrieve_default(
tolerate_vulnerable_model: bool = False,
tolerate_deprecated_model: bool = False,
sagemaker_session: Session = DEFAULT_JUMPSTART_SAGEMAKER_SESSION,
model_type: JumpStartModelType = JumpStartModelType.OPEN_WEIGHTS,
) -> str:
"""Retrieves the default accept type for the model matching the given arguments.

@@ -122,4 +124,5 @@
tolerate_vulnerable_model=tolerate_vulnerable_model,
tolerate_deprecated_model=tolerate_deprecated_model,
sagemaker_session=sagemaker_session,
model_type=model_type,
)
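
The new model_type argument is forwarded to the underlying JumpStart artifact lookup, letting the same helper resolve specs for the proprietary models added elsewhere in this PR. A hedged call sketch; the model_id/model_version parameters sit above the hunk shown here, and the PROPRIETARY enum member plus all argument values are assumptions used only for illustration.

# Hedged sketch: retrieve_default and JumpStartModelType appear in the diff above;
# the model_id/model_version values and JumpStartModelType.PROPRIETARY are assumptions.
from sagemaker.accept_types import retrieve_default
from sagemaker.jumpstart.enums import JumpStartModelType

accept_type = retrieve_default(
    region="us-west-2",
    model_id="example-proprietary-model",       # hypothetical model ID
    model_version="1.0.0",                      # hypothetical version
    model_type=JumpStartModelType.PROPRIETARY,  # assumed enum member for proprietary models
)
print(accept_type)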