The Accelerated Data Science (ADS) SDK is maintained by the Oracle Cloud Infrastructure Data Science service team. It speeds up common data science activities by providing tools that automate or simplify common tasks, along with a data-scientist-friendly Pythonic interface to Oracle Cloud Infrastructure (OCI) services, most notably OCI Data Science, Data Flow, Object Storage, and the Autonomous Database. ADS gives you an interface to manage the lifecycle of machine learning models, from data acquisition to model evaluation, interpretation, and model deployment.
The ADS SDK can be downloaded from PyPI. Contributions are welcome on GitHub.
- Audi Autonomous Driving Dataset Repository
- Bank Graph Example Notebook
- Building a Forecaster using AutoMLx
- Building and Explaining a Classifier using AutoMLx
- Building and Explaining a Regressor using AutoMLx
- Building and Explaining a Text Classifier using AutoMLx
- Building and Explaining an Anomaly Detector using AutoMLx - Experimental
- Caltech Pedestrian Detection Benchmark Repository
- Connect to Oracle Big Data Service
- Fairness with AutoMLx
- Graph Analytics and Graph Machine Learning with PyPGX
- How to Read Data with fsspec from Oracle Big Data Service (BDS)
- Intel Extension for Scikit-Learn
- Introduction to ADSTuner
- Introduction to Model Version Set
- Introduction to SQL Magic
- Introduction to Streaming
- Introduction to the Oracle Cloud Infrastructure Data Flow Studio
- Loading Data With Pandas & Dask
- Model Evaluation with ADSEvaluator
- Natural Language Processing
- ONNX Integration with the Accelerated Data Science (ADS) SDK
- PySpark
- Spark NLP within Oracle Cloud Infrastructure Data Flow Studio
- Text Classification and Model Explanations using LIME
- Text Classification with Data Labeling Service Integration
- Text Extraction Using the Accelerated Data Science (ADS) SDK
- Train, Register, and Deploy a Generic Model
- Train, Register, and Deploy a LightGBM Model
- Train, Register, and Deploy a PyTorch Model
- Train, Register, and Deploy a TensorFlow Model
- Train, Register, and Deploy an XGBoost Model
- Train, register, and deploy HuggingFace Pipeline
- Train, register, and deploy Sklearn Model
- Using Data Catalog Metastore with DataFlow
- Using Data Catalog Metastore with PySpark
- Using Livy on the Big Data Service
- Visual Genome Repository
- Visualizing Data
- Working with Pipelines
- XGBoost with RAPIDS
Updated: 05/29/2023
Build an anomaly detection model using the experimental, fully unsupervised anomaly detection pipeline in Oracle AutoMLx for the public Credit Card Fraud dataset.
This notebook was developed on the conda pack with slug: automlx_p38_cpu_v2
automlx
anomaly detection
Universal Permissive License v 1.0
Updated: 05/29/2023
Build a classifier using the Oracle AutoMLx tool and binary data set of Census income data.
This notebook was developed on the conda pack with slug: automlx_p38_cpu_v3
automlx
classification
classifier
Universal Permissive License v 1.0
Updated: 05/29/2023
Develop a model and evaluate its fairness.
This notebook was developed on the conda pack with slug: automlx_p38_cpu_v3
automlx
fairness
Universal Permissive License v 1.0
Updated: 05/29/2023
Build a regressor using Oracle AutoMLx and a pricing data set. Training options will be explored and the resulting AutoMLx models will be evaluated.
This notebook was developed on the conda pack with slug: automlx_p38_cpu_v3
automlx
regression
Universal Permissive License v 1.0
Updated: 05/29/2023
Build a classifier using the Oracle AutoMLx tool for the public 20newsgroup dataset.
This notebook was developed on the conda pack with slug: automlx_p38_cpu_v3
automlx
text classification
text classifier
Universal Permissive License v 1.0
Updated: 03/30/2023
Download, process and display autonomous driving data, and map LiDAR data onto images.
This notebook was developed on the conda pack with slug: computervision_p37_cpu_v1
autonomous driving
oracle open data
Universal Permissive License v 1.0
Updated: 03/26/2023
Work interactively with a BDS cluster using Livy and two different connection techniques: SparkMagic (for a notebook environment) and REST.
This notebook was developed on the conda pack with slug: pyspark30_p37_cpu_v5
bds
big data service
livy
Universal Permissive License v 1.0
Updated: 03/29/2023
Manage data using the fsspec file system. Read and save data using pandas and pyarrow through fsspec.
This notebook was developed on the conda pack with slug: pyspark30_p37_cpu_v5
bds
fsspec
Universal Permissive License v 1.0
Updated: 03/30/2023
Download and process annotated video data of vehicles and pedestrians.
This notebook was developed on the conda pack with slug: generalml_p38_cpu_v1
caltech
pedestrian detection
oracle open data
Universal Permissive License v 1.0
Updated: 03/26/2023
Write and test a Data Flow batch application using the Oracle Cloud Infrastructure (OCI) Data Catalog Metastore. Configure the job, run the application and clean up resources.
This notebook was developed on the conda pack with slug: pyspark30_p37_cpu_v5
data catalog metastore
data flow
Universal Permissive License v 1.0
Updated: 03/30/2023
Use the Oracle Cloud Infrastructure (OCI) Data Labeling service to efficiently build enriched, labeled datasets for accurately training AI/ML models. This notebook demonstrates operations that can be performed using the Accelerated Data Science (ADS) Data Labeling module.
This notebook was developed on the conda pack with slug: nlp_p37_cpu_v2
data labeling
text classification
Universal Permissive License v 1.0
Updated: 03/30/2023
Perform common data visualization tasks and explore data with the ADS SDK. Plotting approaches include 3D plots, pie charts, GIS plots, and Seaborn pairplot graphs.
This notebook was developed on the conda pack with slug: generalml_p38_cpu_v1
data visualization
seaborn plot
charts
Universal Permissive License v 1.0
Updated: 03/30/2023
Configure and use PySpark to process data in the Oracle Cloud Infrastructure (OCI) Data Catalog metastore, including common operations like creating and loading data from the metastore.
This notebook was developed on the conda pack with slug: pyspark30_p37_cpu_v5
dcat
data catalog metastore
pyspark
Universal Permissive License v 1.0
Updated: 03/26/2023
Train, register, and deploy a generic model.
This notebook was developed on the conda pack with slug: generalml_p38_cpu_v1
generic model
deploy model
register model
train model
Universal Permissive License v 1.0
Updated: 06/05/2023
Access
This notebook was developed on the conda pack with slug: pypgx2310_p38_cpu_v1
graph_insight
autonomous_database
Universal Permissive License v 1.0
Updated: 03/26/2023
Train, register, and deploy a huggingface pipeline.
This notebook was developed on the conda pack with slug: pytorch110_p38_cpu_v1
huggingface
deploy model
register model
train model
Universal Permissive License v 1.0
Updated: 03/30/2023
Use ADSTuner to optimize an estimator using the scikit-learn API.
This notebook was developed on the conda pack with slug: generalml_p38_cpu_v1
hyperparameter tuning
Universal Permissive License v 1.0
Updated: 03/26/2023
Enhance performance of scikit-learn models using the Intel(R) oneAPI Data Analytics Library. Train a k-means model using both sklearn and the accelerated Intel library and compare performance.
This notebook was developed on the conda pack with slug: sklearnex202130_p37_cpu_v1
intel
intel extension
scikit-learn
scikit learn
Universal Permissive License v 1.0
Updated: 03/27/2023
Connect to Oracle Big Data Service using Kerberos.
This notebook was developed on the conda pack with slug: pyspark30_p37_cpu_v5
kerberos
big data service
bds
Universal Permissive License v 1.0
Updated: 03/26/2023
Use the ADS SDK to process and manipulate strings. This notebook includes regular expression matching and natural language processing (NLP) parsing, including part-of-speech tagging, named entity recognition, and sentiment analysis. It also shows how to create and use custom plugins tailored to your specific needs.
This notebook was developed on the conda pack with slug: nlp_p37_cpu_v2
language services
string manipulation
regex
regular expression
natural language processing
NLP
part-of-speech tagging
named entity recognition
sentiment analysis
custom plugins
Universal Permissive License v 1.0
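As a rough illustration of the regular expression matching mentioned in this entry, the following minimal sketch uses only Python's standard `re` module. It does not use the ADS SDK string-handling API; the sample text and the two patterns are hypothetical and deliberately simplistic.

```python
import re

# Hypothetical sample text; not taken from the notebook.
text = "Contact alice@example.com or bob@example.org before 2023-03-26."

# Illustrative patterns only; they are not production-grade validators.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")

# findall returns whole matches when the pattern has no groups,
# and tuples of groups when it has several.
emails = EMAIL.findall(text)
dates = ISO_DATE.findall(text)

print(emails)
print(dates)
```

The notebook itself covers this kind of matching through ADS, alongside the NLP parsing and custom plugins listed in the description.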
Updated: 05/29/2023
Use Oracle AutoMLx to build a forecast model with real-world data sets.
This notebook was developed on the conda pack with slug: automlx_p38_cpu_v3
language services
string manipulation
regex
regular expression
natural language processing
NLP
part-of-speech tagging
named entity recognition
sentiment analysis
custom plugins
Universal Permissive License v 1.0
Updated: 03/26/2023
Train, register, and deploy a LightGBM model.
This notebook was developed on the conda pack with slug: generalml_p38_cpu_v1
lightgbm
deploy model
register model
train model
Universal Permissive License v 1.0
Updated: 03/26/2023
Load data from sources including ADW, Object Storage, and Hive in formats such as Parquet and CSV.
This notebook was developed on the conda pack with slug: generalml_p38_cpu_v1
loading data
autonomous database
adw
hive
pandas
dask
object storage
Universal Permissive License v 1.0
Updated: 03/26/2023
A model version set is a way to track the relationships between models. As a container, the model version set takes a collection of models. Those models are assigned a sequential version number based on the order they are entered into the model version set.
This notebook was developed on the conda pack with slug: dbexp_p38_cpu_v1
model
model experiments
model version set
Universal Permissive License v 1.0
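The container behavior described in this entry can be sketched in plain Python. The class below is a hypothetical stand-in written for illustration, not the ADS or OCI API, and it assumes version numbers start at 1.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VersionSetSketch:
    """Hypothetical stand-in for a model version set; not the ADS API."""

    name: str
    _models: List[str] = field(default_factory=list)

    def add(self, model_id: str) -> int:
        """Add a model and return its sequential version number.

        Assumption: numbering is 1-based and follows insertion order.
        """
        self._models.append(model_id)
        return len(self._models)

    def versions(self) -> Dict[int, str]:
        """Map each version number to the model stored under it."""
        return {i: m for i, m in enumerate(self._models, start=1)}


# Hypothetical names, used only to show the ordering.
vs = VersionSetSketch("churn-experiments")
v1 = vs.add("model-a")
v2 = vs.add("model-b")
print(v1, v2, vs.versions())
```

The point of the sketch is only that the version number is derived from the order of entry, so the set records how models relate to one another over the course of an experiment.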
Updated: 03/30/2023
Train and evaluate different types of models: binary classification using an imbalanced dataset, multi-class classification using a synthetically generated dataset consisting of three equally distributed classes, and a regression using a synthetically generated dataset with positive targets.
This notebook was developed on the conda pack with slug: generalml_p38_cpu_v1
model evaluation
binary classification
regression
multi-class classification
imbalanced dataset
synthetic dataset
Universal Permissive License v 1.0
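To show why an imbalanced binary dataset calls for more than accuracy, here is a minimal sketch in plain Python. It does not use ADSEvaluator; the helper `binary_metrics` and the 95/5 class split are hypothetical choices made for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, and recall for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Define the ratio as 0.0 when its denominator is zero.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical imbalanced data: 95 negatives and 5 positives.
y_true = [0] * 95 + [1] * 5
# A trivial classifier that always predicts the majority class.
y_pred = [0] * 100

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

The always-negative classifier scores high accuracy while finding none of the positive cases, which is why precision and recall are reported alongside accuracy for imbalanced problems.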
Updated: 03/30/2023
Perform model explanations on an NLP classifier using the Local Interpretable Model-agnostic Explanations (LIME) technique.
This notebook was developed on the conda pack with slug: nlp_p37_cpu_v2
nlp
lime
model_explanation
text_classification
text_explanation
Universal Permissive License v 1.0
Updated: 03/30/2023
Load visual data, define regions, and visualize objects using metadata to connect structured images to language.
This notebook was developed on the conda pack with slug: generalml_p38_cpu_v1
object annotation
genome visualization
oracle open data
Universal Permissive License v 1.0 (https://oss.oracle.com/licenses/upl/)
Updated: 07/17/2023
Extract text from common formats (e.g. PDF and Word) into plain text. Customize this process for individual use cases.
This notebook was developed on the conda pack with slug: nlp_p37_cpu_v2
onnx
deploy model
Universal Permissive License v 1.0
Updated: 03/26/2023
Create and use ML pipelines through the entire machine learning lifecycle.
This notebook was developed on the conda pack with slug: generalml_p38_cpu_v1
pipelines
pipeline step
jobs pipeline
Universal Permissive License v 1.0
Updated: 03/26/2023
Use Oracle's Graph Analytics libraries to demonstrate graph algorithms, graph machine learning models, and the Property Graph Query Language (PGQL).
This notebook was developed on the conda pack with slug: pypgx2310_p38_cpu_v1
pypgx
graph analytics
pgx
Universal Permissive License v 1.0
Updated: 03/26/2023
Run interactive Spark workloads on a long-lasting Oracle Cloud Infrastructure Data Flow Spark cluster through Apache Livy integration. Data Flow Spark Magic is used for interactively working with remote Spark clusters through Livy, a Spark REST server, in Jupyter notebooks. It includes a set of magic commands for interactively running Spark code.
This notebook was developed on the conda pack with slug: pyspark32_p38_cpu_v2
pyspark
data flow
Universal Permissive License v 1.0
Updated: 03/26/2023
Use Spark NLP within a long-lasting Oracle Cloud Infrastructure Data Flow cluster.
This notebook was developed on the conda pack with slug: pyspark32_p38_cpu_v1
pyspark
data flow
Universal Permissive License v 1.0
Updated: 06/02/2023
Develop local PySpark applications and work with remote clusters using Data Flow.
This notebook was developed on the conda pack with slug: pyspark24_p37_cpu_v3
pyspark
data flow
Universal Permissive License v 1.0
Updated: 03/26/2023
Train, register, and deploy a PyTorch model.
This notebook was developed on the conda pack with slug: pytorch110_p38_cpu_v1
pytorch
deploy model
register model
train model
Universal Permissive License v 1.0
Updated: 03/26/2023
Train, register, and deploy a scikit-learn model.
This notebook was developed on the conda pack with slug: generalml_p38_cpu_v1
scikit-learn
deploy model
register model
train model
Universal Permissive License v 1.0
Updated: 03/30/2023
Use SQL Magic commands to work with a database within a Jupyter notebook. This notebook shows how to use both line and cell magics.
This notebook was developed on the conda pack with slug: generalml_p38_cpu_v1
sql magic
autonomous database
Universal Permissive License v 1.0
Updated: 03/30/2023
Connect to the Oracle Cloud Infrastructure (OCI) Streaming service with Kafka.
This notebook was developed on the conda pack with slug: dataexpl_p37_cpu_v3
streaming
kafka
Universal Permissive License v 1.0
Updated: 03/26/2023
Train, register, and deploy a TensorFlow model.
This notebook was developed on the conda pack with slug: tensorflow28_p38_cpu_v1
tensorflow
deploy model
register model
train model
Universal Permissive License v 1.0
Updated: 03/26/2023
Extract text from common formats (e.g. PDF and Word) into plain text. Customize this process for individual use cases.
This notebook was developed on the conda pack with slug: nlp_p37_cpu_v2
text extraction
nlp
Universal Permissive License v 1.0
Updated: 03/26/2023
Train, register, and deploy an XGBoost model.
This notebook was developed on the conda pack with slug: generalml_p38_cpu_v1
xgboost
deploy model
register model
train model
Universal Permissive License v 1.0
Updated: 03/30/2023
Compare training time between CPU-trained and GPU-trained models using XGBoost.
This notebook was developed on the conda pack with slug: rapids2110_p37_gpu_v1
xgboost
rapids
gpu
machine learning
classification
Universal Permissive License v 1.0