🏥 A Modern and Resilient DHIS2 Client with Built-in WHO-DQR Validation for LMICs

pydhis2

A modern Python SDK for DHIS2, designed for robust and reproducible scientific workflows.

📘 About

pydhis2 is a next-generation Python library for interacting with DHIS2, the world's largest health information management system. It provides a clean, modern, and efficient API for data extraction, analysis, and management, with a strong emphasis on reproducible scientific workflows, a critical need in public health research and data analysis, especially in low- and middle-income country (LMIC) contexts.

Target Audience:

  • Public health researchers and data scientists
  • DHIS2 implementers and administrators
  • Data analysts working with health information systems
  • Academic researchers requiring reproducible data pipelines

Scientific Use Cases:

  • Epidemiological surveillance and analysis
  • Health system performance monitoring
  • Data quality assessments and validation
  • Routine health data analytics
  • Integration with statistical computing environments (R, Python, Julia)

✨ Why pydhis2?

  • 🚀 Modern & Asynchronous: Built with asyncio for high-performance, non-blocking I/O, making it ideal for large-scale data operations. A synchronous client is also provided for simplicity in smaller scripts.
  • 🔬 Reproducible by Design: From project templates to a powerful CLI, pydhis2 is built to support standardized, shareable, and verifiable data analysis pipelines, which are essential for scientific research.
  • 🐼 Seamless DataFrame Integration: Natively convert DHIS2 analytics data into Pandas DataFrames with a single method call (.to_pandas()), connecting you instantly to the PyData ecosystem.
  • 🔧 Powerful Command Line Interface: Automate common tasks like data pulling and configuration directly from your terminal.
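
To make the benefit of non-blocking I/O concrete, here is a minimal sketch of bounded concurrent fetching with plain asyncio. It does not call pydhis2: `fetch_period` is a hypothetical placeholder that sleeps instead of issuing a request, and `fetch_all` and `max_concurrent` are names chosen for illustration only.

```python
import asyncio


async def fetch_period(period: str) -> dict:
    """Hypothetical stand-in for one network request; sleeps instead of doing I/O."""
    await asyncio.sleep(0.01)
    return {"period": period, "rows": 1}


async def fetch_all(periods, max_concurrent=3):
    """Fetch all periods concurrently, with at most `max_concurrent` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(period):
        # The semaphore caps how many requests run at the same time.
        async with semaphore:
            return await fetch_period(period)

    # asyncio.gather returns results in the same order as its inputs.
    return await asyncio.gather(*(bounded(p) for p in periods))


if __name__ == "__main__":
    results = asyncio.run(fetch_all(["2021", "2022", "2023"]))
    print([r["period"] for r in results])
```

Capping concurrency with a semaphore is a common way to avoid overloading a shared server while still overlapping request latency.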

🚀 Getting Started

1. Installation

Stable Release (Recommended)

Install pydhis2 directly from PyPI:

pip install pydhis2

Development Installation

For contributing or accessing the latest features:

git clone https://github.com/HzaCode/pyDHIS2.git
cd pyDHIS2
pip install -e ".[dev]"

See our Contributing Guide for more details on development setup.

2. Verify Your Installation

Use the built-in CLI to run a quick demo. This will connect to a live DHIS2 server, fetch data, and confirm that your installation is working correctly.

# Check the installed version
pydhis2 version

# Run the quick demo
pydhis2 demo quick

A successful run produces output similar to the following (the server and version shown may differ):

============================================================
pydhis2 Quick Demo
============================================================
=== Testing: https://demos.dhis2.org/dq ===
   Found working API endpoint!
   System: Data Quality
   Version: 2.38.4.3
Found working server: https://demos.dhis2.org/dq

2. Querying Analytics data...
Retrieved 1 data records
...
Demo completed successfully!

📖 Basic Usage

Here is a simple example of how to use pydhis2 in a Python script to fetch analytics data and load it into a Pandas DataFrame.

Create a file named my_analysis.py:

import asyncio
import sys
from pydhis2 import get_client, DHIS2Config
from pydhis2.core.types import AnalyticsQuery

# pydhis2 provides both an async and a sync client
AsyncDHIS2Client, _ = get_client()

async def main():
    # 1. Configure the connection to a DHIS2 server
    config = DHIS2Config(
        base_url="https://demos.dhis2.org/dq",
        auth=("demo", "District1#")
    )
  
    async with AsyncDHIS2Client(config) as client:
        # 2. Define the query parameters
        query = AnalyticsQuery(
            dx=["b6mCG9sphIT"],   # Data element: ANC 1 Outlier Threshold
            ou="qzGX4XdWufs",    # Org unit: A-1 District Hospital
            pe="2023"            # Period: Year 2023
        )

        # 3. Fetch data and convert it directly to a Pandas DataFrame
        df = await client.analytics.to_pandas(query)

        # 4. Analyze and display the results
        print("✅ Data fetched successfully!")
        print(f"Retrieved {len(df)} records.")
        print("\n--- Data Preview ---")
        print(df.head())

if __name__ == "__main__":
    # Standard fix for asyncio on Windows
    if sys.platform == 'win32':
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
    asyncio.run(main())

Run your script from the terminal:

python my_analysis.py
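
For readers curious what the `dx`, `ou`, and `pe` arguments correspond to on the wire: the DHIS2 Web API expresses an analytics request as repeated `dimension` query parameters. The sketch below builds such a URL using only the standard library. `build_analytics_url` is a hypothetical helper written for illustration, not part of pydhis2, and the way pydhis2 constructs its requests internally may differ.

```python
from urllib.parse import urlencode


def build_analytics_url(base_url, dx, ou, pe):
    """Build a DHIS2 analytics URL; multiple items per dimension are joined with ';'."""
    params = [
        ("dimension", "dx:" + ";".join(dx)),  # data elements / indicators
        ("dimension", "ou:" + ";".join(ou)),  # organisation units
        ("dimension", "pe:" + ";".join(pe)),  # periods
    ]
    # Keep ':' and ';' unescaped so the URL stays readable.
    query = urlencode(params, safe=":;")
    return f"{base_url.rstrip('/')}/api/analytics.json?{query}"


if __name__ == "__main__":
    print(
        build_analytics_url(
            "https://demos.dhis2.org/dq",
            dx=["b6mCG9sphIT"],
            ou=["qzGX4XdWufs"],
            pe=["2023"],
        )
    )
```

Pasting the printed URL into a browser while logged in to the demo server is a quick way to compare the raw response with the DataFrame produced by the script.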

🔧 Server Configuration

While you can pass credentials directly in your script, we recommend using environment variables for better security and flexibility.

1. Environment Variables (Recommended)

export DHIS2_URL="https://your-dhis2-server.com"
export DHIS2_USERNAME="your_username"
export DHIS2_PASSWORD="your_password"

pydhis2 will automatically detect and use these variables.
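
If you prefer to read these variables explicitly, for example to fail early with a clear message when one is missing, a small helper along the following lines may be useful. This is a sketch using only the standard library: `load_dhis2_settings` is a hypothetical name, and it does not reflect how pydhis2 performs its own detection.

```python
import os


def load_dhis2_settings(env=None):
    """Read DHIS2 connection settings from the environment (or a supplied mapping)."""
    env = os.environ if env is None else env
    required = ("DHIS2_URL", "DHIS2_USERNAME", "DHIS2_PASSWORD")

    # Report every missing variable at once rather than one at a time.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    return {
        "base_url": env["DHIS2_URL"].rstrip("/"),
        "auth": (env["DHIS2_USERNAME"], env["DHIS2_PASSWORD"]),
    }
```

The returned keys mirror the `base_url` and `auth` arguments passed to `DHIS2Config` in the Basic Usage example, so the dictionary can be unpacked into that call.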

2. In-Script Configuration

from pydhis2 import DHIS2Config

config = DHIS2Config(
    base_url="https://your-dhis2-server.com",  
    auth=("your_username", "your_password")
)

3. Using the CLI

The CLI provides a convenient way to set and cache your credentials.

pydhis2 config --url "https://your-dhis2-server.com" --username "your_username"

🏗️ A Reproducible Workflow: Project Templates

Beyond being a library, pydhis2 promotes a standardized workflow that is essential for scientific research. To jumpstart your analysis, we provide a project template powered by Cookiecutter.

Why use the template?

  • Standardization: Ensures every project starts with a clean, logical structure.
  • Rapid Start: Generate a fully functional project skeleton in a single command.
  • Best Practices: Includes pre-configured settings for DHIS2 connections, data quality pipelines, and environment management.
  • Focus on Analysis: Spend less time on boilerplate setup and more time on your research.

How to Use

  1. Install Cookiecutter:

    pip install cookiecutter

  2. Generate your project: Point Cookiecutter to the pydhis2 template.

    cookiecutter gh:HzaCode/pyDHIS2 --directory pydhis2/templates

    You'll be prompted for details like your project name and author:

    project_name [My DHIS-2 Analysis Project]: Malaria Analysis Malawi
    project_slug [malaria_analysis_malawi]:
    author_name [Your Name]: Dr. Evans
    
  3. Get a complete, ready-to-use project structure:

    malaria-analysis-malawi/
    ├── configs/          # DHIS-2 & DQR configurations
    ├── data/             # Raw and processed data
    ├── pipelines/        # Analysis pipeline definitions
    ├── scripts/          # Runner scripts
    ├── .env.example      # Environment variable template
    └── README.md         # A dedicated README for your new project
    

You can now cd into your new project directory and begin your analysis immediately!

🖥️ Command Line Interface

pydhis2 provides a powerful CLI for common data operations. (Note: Implementation is in progress)

# Pull analytics data and save as Parquet
pydhis2 analytics pull --dx "b6mCG9sphIT" --ou "qzGX4XdWufs" --pe "2023" --out analytics.parquet

# Pull tracker events
pydhis2 tracker events --program "program_id" --out events.parquet

# Run a data quality review
pydhis2 dqr analyze --input analytics.parquet --html dqr_report.html

For a full list of commands, run pydhis2 --help.
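
The internals of `dqr analyze` are not documented here. As a rough illustration of the kind of check a data quality review performs, the sketch below flags values that lie far from the mean of a series. The three-standard-deviation threshold and the function name `flag_outliers` are assumptions made for this example, not pydhis2's implementation.

```python
from statistics import mean, stdev


def flag_outliers(values, threshold=3.0):
    """Return indices of values more than `threshold` sample SDs from the mean."""
    if len(values) < 3:
        return []  # too few points for a meaningful spread
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        return []  # all values identical, nothing to flag
    return [i for i, v in enumerate(values) if abs(v - mu) / sd > threshold]


if __name__ == "__main__":
    # Twelve monthly counts with one suspicious spike in the last month.
    monthly = [100] * 11 + [1000]
    print(flag_outliers(monthly))
```

With short series a single extreme value inflates the standard deviation, so robust alternatives such as a median-based score may be preferable in practice.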

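📊 Supported Endpoints

| Endpoint       | Read | Write | DataFrame | Pagination | Streaming |
|----------------|------|-------|-----------|------------|-----------|
| Analytics      | ✅   | -     | ✅        | ✅         | ✅        |
| DataValueSets  | ✅   | ✅    | ✅        | ✅         | ✅        |
| Tracker Events | ✅   | ✅    | ✅        | ✅         | ✅        |
| Metadata       | ✅   | ✅    | ✅        | -          | -         |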

📋 Compatibility

  • Python: ≥ 3.9
  • DHIS2: ≥ 2.36
  • Platforms: Windows, Linux, macOS

🀝 Contributing

Contributions are welcome and highly encouraged! pydhis2 is a community-driven project.

Please see our Contributing Guide for details on how to get started. Also, be sure to review our Code of Conduct.

📞 Community & Support

📄 License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.