A robust, automated makefile-based tool for creating and managing local Python environments that match specific Databricks Serverless Environment Versions.
```shell
# Install prerequisites
# Install uv: https://github.com/astral-sh/uv

# Create environment for Databricks version 4
make env ENV_VER=4

# Activate the environment
source .venv-db4/bin/activate
```

That's it! You now have a local Python environment matching Databricks Environment Version 4.
- Automatic Python Version Detection: Dynamically fetches the correct Python version from Databricks documentation
- Smart Package Management:
- Removes Ubuntu-specific/system packages that won't work on macOS
- Handles binary-only packages gracefully (installs if available, skips if not)
- Cleans Databricks-specific version suffixes from packages
- Version Validation: Only allows creation of environments for valid Databricks versions
- Modular Pipeline: Separate targets for each step (requirements, Python install, venv setup, dependencies)
- Lock File Generation: Creates `requirements-env-X.lock` for reproducible environments
- Clean Management: Easy cleanup for specific versions or all environments
- Comprehensive Testing: Built-in test suite to validate functionality
- uv - Fast Python package installer and environment manager
```shell
# Install uv (macOS/Linux)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Add to PATH
export PATH="$HOME/.local/bin:$PATH"
```
- make - Usually pre-installed on macOS/Linux
- curl - For downloading requirements files
- Internet connection - For fetching Databricks documentation and packages
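Before running any targets, a short illustrative loop can confirm these tools are on the PATH (this check is a sketch, not part of the makefile):

```shell
# Illustrative prerequisite check: confirm uv, make, and curl are on PATH.
for tool in uv make curl; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: ok"
    else
        echo "$tool: MISSING"
    fi
done
```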
```shell
# Show all available commands
make help

# List available Databricks environment versions
make list-versions

# Create complete environment (default: version 4)
make env ENV_VER=4

# Clean up specific version
make clean ENV_VER=4

# Clean up all environments
make clean-all
```

For more control over the process:
```shell
# 1. Download and process requirements
make requirements ENV_VER=4

# 2. Detect and install Python version
make python-version ENV_VER=4
make install-python ENV_VER=4

# 3. Create virtual environment
make setup-venv ENV_VER=4

# 4. Install dependencies
make install-deps ENV_VER=4

# 5. Generate lock file
make create-lockfile ENV_VER=4
```

Once created, activate and use your environment:
```shell
# Activate
source .venv-db4/bin/activate

# Verify Python version
python --version

# Test imports
python -c "import pandas, numpy, pyspark; print('All imports successful!')"

# Deactivate
deactivate
```

When you run `make env ENV_VER=4`, the following files are created:
```
.venv-db4/                       # Virtual environment directory
requirements-env-4.txt           # Processed requirements file
requirements-env-4.txt.binary    # Binary-only packages (internal use)
requirements-env-4.lock          # Lock file with installed packages
```
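The lock file is a freeze-style listing of what actually got installed. A sketch of what the generating rule might look like (assumed; the actual `create-lockfile` recipe may differ):

```make
# Sketch of lock-file generation (assumed recipe): freeze the venv's
# installed packages into a reproducible pin list.
create-lockfile:
	uv pip freeze > requirements-env-$(ENV_VER).lock
```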
Change the default version by modifying ENV_VER in the makefile:

```make
ENV_VER ?= 4  # Change to your preferred default
```

The makefile automatically excludes packages that won't work on macOS. To modify the list, edit:

```make
EXCLUDED_PACKAGES = unattended-upgrades|ssh-import-id|...
```

Packages that require binary wheels (will be skipped if unavailable):

```make
BINARY_ONLY_PACKAGES = pyodbc
```

Currently supported versions (automatically fetched from Microsoft documentation):
- Version 1
- Version 2
- Version 3
- Version 4
Run make list-versions to see the latest available versions.
The following packages are removed because they're Ubuntu/system-specific or lack ARM64 macOS wheels:
- `unattended-upgrades` - Ubuntu system package
- `ssh-import-id` - Ubuntu utility
- `dbus-python` - Requires D-Bus system library
- `psycopg2` - PostgreSQL library requiring system dependencies
- `psutil` - No compatible wheels for some versions
- `PyGObject`, `pycairo` - GTK bindings
- `wadllib`, `lazr.uri`, `lazr.restfulclient` - Launchpad utilities
- `google-api-core` - Compatibility issues
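A standalone sketch of how an alternation pattern like `EXCLUDED_PACKAGES` can drive the filtering step (illustrative; the makefile's actual pipeline may differ):

```shell
# Hypothetical exclusion step: drop excluded packages from a pinned
# requirements list using the same alternation pattern the makefile holds.
EXCLUDED='unattended-upgrades|ssh-import-id|dbus-python|psycopg2'
printf 'pandas==2.2.0\npsycopg2==2.9.9\nnumpy==1.26.4\n' \
  | grep -Ev "^(${EXCLUDED})=="
```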
Packages attempted with --only-binary (skipped if no wheel available):
- `pyodbc` - ODBC database connector
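The "install if available, skip if not" behavior boils down to attempting the install without failing the build. A self-contained sketch of the pattern (the real command would be along the lines of `uv pip install --only-binary :all: pyodbc`; a failing stand-in simulates "no compatible wheel on this platform"):

```shell
# Binary-only handling pattern: attempt the install, never fail the build.
attempt_install() { return 1; }   # stand-in for a failed wheel install

if attempt_install; then
    echo "pyodbc installed"
else
    echo "pyodbc skipped (no compatible wheel)"
fi
```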
Databricks-specific version suffixes are automatically removed:
```
pyspark==4.0.0+databricks.connect.17.0.1 → pyspark==4.0.0
```
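One way to express this cleanup is a sed substitution that removes everything from the `+` to the end of the pin (a sketch; the makefile's actual substitution may differ):

```shell
# Strip a "+<local-version>" suffix from a pinned requirement (sketch).
echo 'pyspark==4.0.0+databricks.connect.17.0.1' | sed -E 's/\+[A-Za-z0-9.]+$//'
```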
Run the full test suite:

```shell
make test
```

Or run the test script directly:

```shell
./test_makefile.sh
```

For quick validation (30 seconds):

```shell
./quick_test.sh
```

The test suite checks that:

- ✅ `uv` is installed and accessible
- ✅ `make` is installed
- ✅ `makefile` exists
- ✅ `make help` displays correctly
- ✅ `make list-versions` returns available versions
- ✅ `make validate-version` accepts valid versions (e.g., 4)
- ✅ `make validate-version` rejects invalid versions (e.g., 999)
- ✅ `make check-uv` verifies uv installation
- ✅ `make requirements` downloads the requirements file
- ✅ Requirements file is created and not empty
- ✅ Excluded packages are removed (psycopg2, psutil, dbus-python, etc.)
- ✅ Databricks version suffixes are cleaned from pyspark
- ✅ Binary-only packages are separated
- ✅ `make python-version` extracts the correct Python version from docs
- ✅ `make env` creates a complete environment
- ✅ Virtual environment directory is created
- ✅ Python executable exists and is functional
- ✅ Lock file is created and contains packages
- ✅ Key packages are installed (pandas, numpy, pyspark)
- ✅ `make clean` removes files for a specific version
- ✅ `make clean-all` removes all generated files
The test suite provides colored output:
- 🟡 YELLOW: Test being run
- 🟢 GREEN: Test passed
- 🔴 RED: Test failed
Example output:

```
TEST: Testing 'make requirements' target (ENV_VER=4)
✓ PASS: requirements target executes successfully
✓ PASS: requirements-env-4.txt file created
✓ PASS: requirements-env-4.txt is not empty

========================================================================
TEST RESULTS SUMMARY
========================================================================
Total tests run: 25
Tests passed:    25
Tests failed:    0
========================================================================
All tests passed! ✅
```
- `./quick_test.sh` - Validates basic functionality without creating a full environment.
- `make test` - Comprehensive tests including full environment creation and validation.
Test individual targets:
```shell
make help
make list-versions
make validate-version ENV_VER=4
make requirements ENV_VER=4
```

You can modify test_makefile.sh to run specific tests by commenting out sections you don't want to run.
This repository includes two automated workflows:
Runs on:
- Push to `main` branch → Quick tests only (~30 sec)
- Pull requests to `main` → Full test suite (~5-10 min)
What it does:
- ✅ Installs `uv`
- ✅ Runs quick tests (always)
- ✅ Runs full test suite (PRs only)
- ✅ Uploads logs on failure
- ✅ Reports results in GitHub summary
Runs on:
- Pull request opened/updated
What it does:
- ✅ Quick validation (syntax, version detection, requirements)
- ✅ Posts comment on PR with results
- ✅ Fast feedback (~1 minute)
For other CI/CD systems:

```yaml
# Example GitHub Actions
- name: Test Makefile
  run: |
    curl -LsSf https://astral.sh/uv/install.sh | sh
    export PATH="$HOME/.local/bin:$PATH"
    make test
```

```yaml
# Example GitLab CI
test:
  script:
    - curl -LsSf https://astral.sh/uv/install.sh | sh
    - export PATH="$HOME/.local/bin:$PATH"
    - make test
```

If the full environment creation test hangs, it will time out after 10 minutes. Check /tmp/make_env_output.log for details.

If tests fail with "uv is not installed", ensure uv is in your PATH:

```shell
export PATH="$HOME/.local/bin:$PATH"
make test
```

The test script automatically runs `make clean-all` between major tests to ensure a clean state.
To add new tests to test_makefile.sh:
- Use `print_test "Test description"` to start a new test
- Run your test command
- Use `pass "message"` for successful assertions
- Use `fail "message"` for failed assertions
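Hypothetical definitions of these helpers, consistent with the colored output conventions described above (the real script's implementations may differ):

```shell
# Hypothetical test helpers matching the colored-output convention.
YELLOW='\033[1;33m'; GREEN='\033[0;32m'; RED='\033[0;31m'; NC='\033[0m'

print_test() { printf "${YELLOW}TEST: %s${NC}\n" "$1"; }
pass()       { printf "${GREEN}✓ PASS: %s${NC}\n" "$1"; }
fail()       { printf "${RED}✗ FAIL: %s${NC}\n" "$1"; }

print_test "demo"
pass "demo assertion holds"
```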
Example:

```shell
print_test "Testing custom target"
if make custom-target >/dev/null 2>&1; then
    pass "custom-target executed successfully"
else
    fail "custom-target failed"
fi
```

The full test suite typically takes:
- Fast tests (validation, help, etc.): ~10 seconds
- Full environment creation: 3-5 minutes
- Total runtime: ~5-10 minutes
To skip the slow full environment test, comment out that section in test_makefile.sh.
Install uv and add it to your PATH:

```shell
curl -LsSf https://astral.sh/uv/install.sh | sh
export PATH="$HOME/.local/bin:$PATH"
```

This usually means:
- Network connectivity issues
- Databricks documentation format changed
- Invalid environment version number
Try running make list-versions to see available versions.
This error shouldn't occur with the current version. If you see it, ensure you're using the latest makefile.
If specific packages fail to install:
- Check if it's a binary-only package that lacks ARM64 wheels
- Add it to `EXCLUDED_PACKAGES` or `BINARY_ONLY_PACKAGES` in the makefile
- The environment will still be created with the other packages
Make sure you're using the correct command for your shell:

```shell
# bash/zsh
source .venv-db4/bin/activate

# fish
source .venv-db4/bin/activate.fish

# csh/tcsh
source .venv-db4/bin/activate.csh
```

```
.
├── makefile                          # Main automation script
├── README.md                         # This file
├── test_makefile.sh                  # Full test suite
├── quick_test.sh                     # Quick validation tests
├── .venv-db{X}/                      # Virtual environments (generated)
├── requirements-env-{X}.txt          # Processed requirements (generated)
├── requirements-env-{X}.txt.binary   # Binary-only packages (generated)
└── requirements-env-{X}.lock         # Lock files (generated)
```
```
┌──────────────────────────────────────────────────────────┐
│                    make env ENV_VER=4                    │
└───────────────────────────┬──────────────────────────────┘
                            │
                            ▼
                ┌────────────────────────┐
                │ Check Prerequisites    │
                │ - Validate uv          │
                │ - Validate version     │
                └───────────┬────────────┘
                            │
                            ▼
                ┌────────────────────────┐
                │ Download Requirements  │
                │ - Fetch from MS docs   │
                │ - Process packages     │
                │ - Exclude system pkgs  │
                └───────────┬────────────┘
                            │
                            ▼
                ┌────────────────────────┐
                │ Detect Python Ver      │
                │ - Parse from docs      │
                │ - Install with uv      │
                └───────────┬────────────┘
                            │
                            ▼
                ┌────────────────────────┐
                │ Create Virtual Env     │
                │ - uv venv with ver     │
                └───────────┬────────────┘
                            │
                            ▼
                ┌────────────────────────┐
                │ Install Dependencies   │
                │ - Main packages        │
                │ - Binary-only (skip)   │
                └───────────┬────────────┘
                            │
                            ▼
                ┌────────────────────────┐
                │ Generate Lock File     │
                │ - uv pip freeze        │
                └───────────┬────────────┘
                            │
                            ▼
                        ✅ Done!
```
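In make terms, this pipeline could be expressed as a dependency chain (an illustrative sketch; the actual `env` target's prerequisites and recipes may differ):

```make
# Sketch: the pipeline as make prerequisites (illustrative only).
env: requirements install-python setup-venv install-deps create-lockfile
```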
- Always specify the version: `make env ENV_VER=4` is clearer than relying on defaults
- Check available versions first: Run `make list-versions` before creating an environment
- Test incrementally: Use individual targets (`make requirements`, `make setup-venv`) when debugging
- Keep environments separate: Use different `ENV_VER` values for different projects
- Use lock files: Commit `requirements-env-X.lock` to ensure reproducible environments
- Clean regularly: Run `make clean-all` to remove old environments and free up disk space
To contribute improvements:
- Fork the repository
- Create a feature branch: `git checkout -b feature/my-improvement`
- Make your changes to the makefile
- Run the test suite locally: `make test`
- Ensure all tests pass
- Commit your changes: `git commit -m "Add my improvement"`
- Push to your fork: `git push origin feature/my-improvement`
- Create a Pull Request
- The Validate PR workflow will run automatically
- Review the validation results posted as a comment
- Once approved and merged, the Test Makefile workflow runs on main
After cloning/creating this repository:
- Update badge URLs in README.md: replace YOUR_USERNAME/YOUR_REPO with your actual GitHub username and repository name
- Enable GitHub Actions:
  - Go to repository Settings → Actions → General
  - Ensure "Allow all actions and reusable workflows" is selected
- First push:

```shell
git add .
git commit -m "Initial commit"
git push origin main
```
The workflows will run automatically!
This tool is provided as-is for use with Databricks environments.
Q: Why use this instead of pip/conda?
A: This tool automatically matches Databricks environments exactly, handling version-specific quirks and platform differences.
Q: Can I use this on Linux/Windows?
A: It's designed for macOS but should work on Linux. Windows support via WSL2 is untested.
Q: What if a package I need was excluded?
A: You can manually install it in the venv, or modify EXCLUDED_PACKAGES in the makefile if you know how to handle dependencies.
Q: How do I update an existing environment?
A: Run make clean ENV_VER=X followed by make env ENV_VER=X to rebuild from scratch.
Q: Can I use this for multiple Databricks workspace versions?
A: Yes! Create separate environments: make env ENV_VER=1, make env ENV_VER=4, etc.
Made with ❤️ for Databricks developers