Delta Live Tables Example Notebooks

Delta Live Tables is a new framework that lets customers declaratively define, deploy, test, and upgrade data pipelines while eliminating the operational burden of managing them.

This repo contains Delta Live Tables examples designed to get customers started with building, deploying, and running pipelines.

Getting Started

Connect your Databricks workspace to this repo using the Repos feature
Choose one of the examples and create your pipeline!

Examples

Wikipedia

The Wikipedia clickstream sample is a great way to jump-start using Delta Live Tables (DLT). It is a simple bifurcating pipeline that creates a table from your JSON data, cleanses the data, and then creates two tables from the cleansed data.

This sample is available for both SQL and Python.
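
To make the bifurcating shape concrete, here is a minimal sketch in plain Python. It is not the DLT API and not the notebook's actual code; it only mirrors the flow described above (raw JSON records, a cleansing step, then two derived tables). The field names, the sample rows, the cleansing rule, and the top_pages branch are assumptions made for illustration; only the top_spark_referers name comes from this README.

```python
from collections import Counter

# Hypothetical raw clickstream records standing in for the JSON source data.
# Field names and values are assumptions for illustration only.
RAW_CLICKS = [
    {"prev_title": "Google", "curr_title": "Apache_Spark", "n": "120"},
    {"prev_title": "Twitter", "curr_title": "Apache_Spark", "n": "15"},
    {"prev_title": "Google", "curr_title": "Python", "n": "300"},
    {"prev_title": None, "curr_title": "Apache_Spark", "n": "7"},  # dropped by cleanse()
]


def cleanse(rows):
    """Cleansing step: drop rows without a referrer, cast counts, rename fields."""
    return [
        {
            "previous_page_title": r["prev_title"],
            "current_page_title": r["curr_title"],
            "click_count": int(r["n"]),
        }
        for r in rows
        if r["prev_title"] is not None
    ]


def top_spark_referers(clean_rows):
    """Branch A: referrers to the Apache Spark page, highest click count first."""
    spark = [r for r in clean_rows if r["current_page_title"] == "Apache_Spark"]
    return sorted(spark, key=lambda r: r["click_count"], reverse=True)


def top_pages(clean_rows):
    """Branch B (hypothetical name): total clicks per destination page."""
    totals = Counter()
    for r in clean_rows:
        totals[r["current_page_title"]] += r["click_count"]
    return totals.most_common()


# Both branches read from the same cleansed table, which is the bifurcation.
clean = cleanse(RAW_CLICKS)
referers = top_spark_referers(clean)
pages = top_pages(clean)
```

In a real pipeline each of these functions would correspond to a table definition in the SQL or Python notebook, and DLT would work out the dependency between them for you.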

Running your pipeline

1. Create your pipeline using the following parameters

From your Databricks workspace, click Jobs, then Delta Live Tables, and then click Create Pipeline
Fill in the Pipeline Name, e.g. Wikipedia
For the Notebook Libraries, fill in the path of the notebook, e.g. /Repos/michael@databricks.com/delta-live-tables-notebooks/SQL/Wikipedia
To publish your tables, add the target parameter to specify the database in which to persist your tables, e.g. wiki_demo

2. Edit your pipeline JSON

Once you have set up your pipeline, click Edit Settings near the top to view and edit the pipeline's JSON settings.
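
As a rough guide to what to expect there, the sketch below assembles settings that match the values entered in step 1 and prints them as JSON. It is an assumption-laden illustration, not a copy of what your workspace will show: the actual JSON typically carries additional fields (cluster and storage configuration, for example), and the continuous value here is simply a placeholder.

```python
import json

# Illustrative settings built from the values used in step 1.
# The real JSON in your workspace will likely contain more fields.
pipeline_settings = {
    "name": "Wikipedia",
    "libraries": [
        {
            "notebook": {
                "path": "/Repos/michael@databricks.com/delta-live-tables-notebooks/SQL/Wikipedia"
            }
        }
    ],
    # Database in which the pipeline's tables are published.
    "target": "wiki_demo",
    # Placeholder assumption: run as a triggered rather than continuous pipeline.
    "continuous": False,
}

settings_json = json.dumps(pipeline_settings, indent=2)
print(settings_json)
```

When you compare this with your own settings, check that the notebook path and target match what you entered in step 1.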

3. Click Start

To view the progress of your pipeline, refer to the progress flow near the bottom of the pipeline details UI.

4. Review the results

Once your pipeline has completed processing, you can review the data by opening up a new Databricks notebook and running the following SQL statements:
```
%sql
-- Review the top referrers to Wikipedia's Apache Spark articles
SELECT * FROM wiki_demo.top_spark_referers
```
Unsurprisingly, the top referrer is "Google", which you can see graphically by converting the table into an area chart.

