DLT with Databricks

In this tutorial, we will create transformations using Delta Live Tables (DLT) in Databricks. Read the full article here.


0. Prereqs

Make sure you have a Databricks account and a running cluster.

1. Prepare seed data in Databricks

To create some real transformations, we need to provide seed (raw) data to DLT. We'll manually create a few raw tables in Databricks using the scripts available in the seed directory. You can run these scripts directly in a Databricks notebook attached to an active cluster. Note that the seed data can also be created as Parquet files.
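
As a sketch, a seed script might look like the following. The table and column names here are hypothetical — refer to the scripts in the seed directory for the actual schema:

```sql
-- Hypothetical raw table; the real seed scripts live in the seed/ directory.
CREATE TABLE IF NOT EXISTS raw_orders (
  order_id    INT,
  customer_id INT,
  amount      DOUBLE,
  order_date  DATE
);

-- A couple of illustrative rows so downstream DLT tables have data to read.
INSERT INTO raw_orders VALUES
  (1, 100, 25.50, '2023-01-01'),
  (2, 101, 99.99, '2023-01-02');
```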

2. Create transformations

  • Open a Databricks notebook and run the SQL commands found in the transformations directory
  • Click Workflows in the sidebar > Delta Live Tables > Create pipeline
  • Select the notebook created earlier
  • Select Triggered for the pipeline mode and hit Create
  • Click Start on the top bar of the pipeline window

Databricks will now create the pipeline, populate your medallion tables, and generate a dependency graph. You can modify the pipeline at any time, including its schedule and target tables.
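
To illustrate the kind of SQL the transformations notebook contains, here is a minimal DLT sketch that derives a bronze and a silver table from the seed data. The table and column names (raw_orders, orders_bronze, orders_silver, amount) are assumptions for illustration — use the SQL in the transformations directory as the source of truth:

```sql
-- Bronze: ingest the raw seed table as-is into the pipeline.
CREATE OR REFRESH LIVE TABLE orders_bronze
COMMENT "Raw orders ingested from the seed table"
AS SELECT * FROM raw_orders;

-- Silver: clean the bronze data; DLT expectations drop rows that fail checks.
-- Note the LIVE. prefix, which references another table in the same pipeline.
CREATE OR REFRESH LIVE TABLE orders_silver (
  CONSTRAINT valid_amount EXPECT (amount > 0) ON VIOLATION DROP ROW
)
COMMENT "Orders with invalid amounts removed"
AS SELECT order_id, customer_id, amount, order_date
FROM LIVE.orders_bronze;
```

The `LIVE.` references are what DLT uses to infer the dependency graph you see after clicking Start.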

Reference

DLT quickstart
