In this tutorial, we will create transformations using Delta Live Tables (DLT) in Databricks.
Make sure you have a Databricks account and a cluster up and running.
To create some real transformations, we need to provide seed (raw) data to DLT.
We'll manually create a few raw tables in Databricks using scripts available in the seed directory.
You can run these scripts directly on a Databricks notebook that is attached to an active cluster.
Note that the seed data can also be provided as Parquet files instead of tables.
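As a minimal sketch of what one of these seed scripts might look like (the table name, columns, and values here are illustrative assumptions, not taken from the seed directory):

```sql
-- Hypothetical raw table; the actual schemas live in the seed directory.
CREATE TABLE IF NOT EXISTS raw_orders (
  order_id    INT,
  customer_id INT,
  amount      DOUBLE,
  order_date  DATE
);

INSERT INTO raw_orders VALUES
  (1, 100, 25.50, DATE'2023-01-05'),
  (2, 101, 80.00, DATE'2023-01-06');
```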
- Open up a Databricks notebook and run the SQL commands found in the transformations directory
- Click on Workflows in the sidebar > Delta Live Tables > Create pipeline
- Select the notebook created earlier
- Select Triggered for the pipeline mode and hit Create
- Click Start on top bar of the pipeline window
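The SQL in the transformations directory is what defines the live tables themselves. A sketch of the typical shape of such a notebook, assuming a `raw_orders` seed table (all names here are illustrative):

```sql
-- Bronze: ingest the seed data as a live table.
CREATE OR REFRESH LIVE TABLE bronze_orders
COMMENT "Raw orders ingested from the seed table"
AS SELECT * FROM raw_orders;

-- Silver: cleaned layer, with a data-quality expectation
-- that drops rows with non-positive amounts.
CREATE OR REFRESH LIVE TABLE silver_orders (
  CONSTRAINT valid_amount EXPECT (amount > 0) ON VIOLATION DROP ROW
)
AS SELECT order_id, customer_id, amount, order_date
FROM LIVE.bronze_orders;
```

Note the `LIVE.` prefix when reading from another table in the same pipeline; it is how DLT builds the dependency graph between tables.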
Databricks will now create the pipeline, populate your medallion tables, and generate a dependency graph. You can modify the pipeline at any time, including its schedule and target tables.
