Commit 0779a7a (1 parent: ac109d0)

Enhance README.md and dashboard for ML pipeline integration

- Updated README.md to include details about ML pipeline monitoring and analytics, along with new API endpoints for SageMaker ML pipelines.
- Added functionality in the Flask dashboard to display both data and ML pipelines, including detailed views and analytics for ML pipeline executions.
- Improved HTML templates to separate sections for data and ML pipelines, enhancing the user experience with a modern UI.
- Implemented new API calls in app.py to fetch and analyze ML pipeline data, ensuring real-time monitoring capabilities.

File tree: 11 files changed (+573, -64 lines)

README.md (127 additions, 60 deletions)

````diff
@@ -12,10 +12,65 @@ This project implements a data pipeline that analyzes global energy consumption
 - Automatically saves outputs as GitHub Actions artifacts
 - Flask-based monitoring dashboard for pipeline status
 - Custom REST APIs for pipeline monitoring
+- ML pipeline monitoring and analytics
+
+## Dashboard App
+
+The project includes a Flask-based dashboard application that provides real-time monitoring of both data and ML pipelines. The dashboard offers:
+
+1. **Pipeline Overview**:
+   - List of all available data and ML pipelines
+   - Quick status indicators
+   - Creation and last update timestamps
+   - Pipeline tags and labels
+
+2. **Pipeline Details**:
+   - Comprehensive run history
+   - Success rate statistics
+   - Average run time metrics
+   - Error analysis and common failure patterns
+   - Detailed status distribution
+
+3. **Visual Features**:
+   - Modern, responsive UI
+   - Interactive cards with hover effects
+   - Color-coded status indicators
+   - Real-time status updates
+   - Detailed run history table
+
+### Dashboard Screenshots
+
+#### Homepage
+![Dashboard Homepage](assets/dashboard-home.png)
+
+#### Data Pipeline Details
+![Data Pipeline Details](assets/data-pipeline-details.png)
+
+#### ML Pipeline Details
+![ML Pipeline Details](assets/ml-pipeline-details.png)
+
+### Running the Dashboard
+
+1. Navigate to the dashboard directory:
+```bash
+cd dashboard
+```
+
+2. Install dashboard dependencies:
+```bash
+pip install -r requirements.txt
+```
+
+3. Start the Flask application:
+```bash
+python app.py
+```
+
+4. Access the dashboard at http://localhost:5000
 
 ## Custom APIs
 
-The project includes custom REST APIs built with AWS Lambda to interact with the Prefect Cloud pipelines. These APIs provide real-time access to pipeline information and status.
+The project includes custom REST APIs built with AWS Lambda to interact with the Prefect Cloud pipelines and SageMaker ML pipelines. These APIs provide real-time access to pipeline information and status.
 
 ### API Documentation
 
````
````diff
@@ -24,42 +79,67 @@ The complete API documentation is available at:
 
 ### Available Endpoints
 
-1. **Get All Pipelines**
+1. **Get All Data Pipelines**
    ```
    GET https://es3ozkq7i8.execute-api.us-east-1.amazonaws.com/dev/data/pipelines
    ```
-   - Returns a list of all pipelines running in Prefect Cloud
+   - Returns a list of all data pipelines running in Prefect Cloud
    - Response: List of pipeline objects with metadata
 
-2. **Get Pipeline Status**
+2. **Get Data Pipeline Status**
    ```
    GET https://es3ozkq7i8.execute-api.us-east-1.amazonaws.com/dev/data/pipelines/status?id={pipeline_id}
    ```
-   - Returns detailed status information for a specific pipeline
+   - Returns detailed status information for a specific data pipeline
    - Parameters:
      - `id`: The unique identifier of the pipeline
    - Response: Detailed pipeline status including run history and metrics
 
+3. **Get All ML Pipelines**
+   ```
+   GET https://es3ozkq7i8.execute-api.us-east-1.amazonaws.com/dev/ml/pipeline
+   ```
+   - Returns a list of all ML pipelines running in SageMaker
+   - Response: List of ML pipeline objects with metadata
+
+4. **Get ML Pipeline Status**
+   ```
+   GET https://es3ozkq7i8.execute-api.us-east-1.amazonaws.com/dev/ml/pipeline/status?pipeline_id={pipeline_id}
+   ```
+   - Returns detailed status information for a specific ML pipeline
+   - Parameters:
+     - `pipeline_id`: The unique identifier of the ML pipeline
+   - Response: Detailed ML pipeline execution history and metrics
+
 ### API Usage Example
 
 ```python
 import requests
 
-# Get all pipelines
+# Get all data pipelines
 response = requests.get('https://es3ozkq7i8.execute-api.us-east-1.amazonaws.com/dev/data/pipelines')
-pipelines = response.json()
+data_pipelines = response.json()
 
-# Get status of a specific pipeline
+# Get status of a specific data pipeline
 pipeline_id = "your-pipeline-id"
 response = requests.get(f'https://es3ozkq7i8.execute-api.us-east-1.amazonaws.com/dev/data/pipelines/status?id={pipeline_id}')
 status = response.json()
+
+# Get all ML pipelines
+response = requests.get('https://es3ozkq7i8.execute-api.us-east-1.amazonaws.com/dev/ml/pipeline')
+ml_pipelines = response.json()
+
+# Get status of a specific ML pipeline
+ml_pipeline_id = "your-ml-pipeline-id"
+response = requests.get(f'https://es3ozkq7i8.execute-api.us-east-1.amazonaws.com/dev/ml/pipeline/status?pipeline_id={ml_pipeline_id}')
+ml_status = response.json()
 ```
 
 ### API Response Format
 
 The APIs return JSON responses with the following structure:
 
-1. **Pipelines List Response**:
+1. **Data Pipelines List Response**:
 ```json
 [
   {
````
````diff
@@ -73,7 +153,7 @@ The APIs return JSON responses with the following structure:
 ]
 ```
 
-2. **Pipeline Status Response**:
+2. **Data Pipeline Status Response**:
 ```json
 [
   {
````
````diff
@@ -92,60 +172,43 @@ The APIs return JSON responses with the following structure:
 ]
 ```
 
-## Dashboard App
-
-The project includes a Flask-based dashboard application that provides real-time monitoring of pipeline runs and their status. The dashboard offers:
-
-1. **Pipeline Overview**:
-   - List of all available pipelines
-   - Quick status indicators
-   - Creation and last update timestamps
-   - Pipeline tags and labels
-
-2. **Pipeline Details**:
-   - Comprehensive run history
-   - Success rate statistics
-   - Average run time metrics
-   - Error analysis and common failure patterns
-   - Detailed status distribution
-
-3. **Visual Features**:
-   - Modern, responsive UI
-   - Interactive cards with hover effects
-   - Color-coded status indicators
-   - Real-time status updates
-   - Detailed run history table
-
-### Dashboard Screenshots
-
-#### Homepage
-![Dashboard Homepage](assets/dashboard-home.png)
-
-#### Pipeline Details
-![Pipeline Details](assets/flow-details.png)
-
-#### Run History
-![Flow Run History](assets/flow-run-history.png)
-
-### Running the Dashboard
-
-1. Navigate to the dashboard directory:
-```bash
-cd dashboard
-```
-
-2. Install dashboard dependencies:
-```bash
-pip install -r requirements.txt
+3. **ML Pipelines List Response**:
+```json
+[
+  {
+    "PipelineArn": "arn:aws:sagemaker:region:account:pipeline/pipeline-name",
+    "PipelineName": "pipeline-name",
+    "PipelineDisplayName": "Pipeline Display Name",
+    "CreationTime": "timestamp",
+    "LastModifiedTime": "timestamp",
+    "LastExecutionTime": "timestamp"
+  }
+]
 ```
 
-3. Start the Flask application:
-```bash
-python app.py
+4. **ML Pipeline Status Response**:
+```json
+{
+  "PipelineExecutionSummaries": [
+    {
+      "PipelineExecutionArn": "arn:aws:sagemaker:region:account:pipeline/pipeline-name/execution/execution-id",
+      "StartTime": "timestamp",
+      "PipelineExecutionStatus": "Succeeded|Failed|Executing",
+      "PipelineExecutionDisplayName": "Execution Name",
+      "PipelineExecutionDetails": {
+        "PipelineArn": "arn:aws:sagemaker:region:account:pipeline/pipeline-name",
+        "PipelineExecutionStatus": "Succeeded|Failed|Executing",
+        "CreationTime": "timestamp",
+        "LastModifiedTime": "timestamp",
+        "CreatedBy": {
+          "UserProfileName": "user-name"
+        }
+      }
+    }
+  ]
+}
 ```
 
-4. Access the dashboard at http://localhost:5000
-
 ## Output Artifacts
 
 The pipeline generates three main artifacts that are saved in the `output` directory and uploaded as GitHub Actions artifacts:
````
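The ML Pipeline Status Response schema shown above reduces to summary statistics in a few lines. The sketch below runs on a hypothetical sample payload (field names follow the documented schema; the ARNs, timestamps, and statuses are made up):

```python
from collections import Counter

# Hypothetical response following the "ML Pipeline Status Response" schema.
sample = {
    "PipelineExecutionSummaries": [
        {"PipelineExecutionArn": "arn:aws:sagemaker:us-east-1:123:pipeline/demo/execution/1",
         "StartTime": "2025-03-24T10:00:00Z", "PipelineExecutionStatus": "Succeeded"},
        {"PipelineExecutionArn": "arn:aws:sagemaker:us-east-1:123:pipeline/demo/execution/2",
         "StartTime": "2025-03-24T11:00:00Z", "PipelineExecutionStatus": "Failed"},
        {"PipelineExecutionArn": "arn:aws:sagemaker:us-east-1:123:pipeline/demo/execution/3",
         "StartTime": "2025-03-24T12:00:00Z", "PipelineExecutionStatus": "Succeeded"},
    ]
}

runs = sample["PipelineExecutionSummaries"]
status_counts = Counter(r["PipelineExecutionStatus"] for r in runs)
success_rate = status_counts["Succeeded"] / len(runs) * 100
latest = max(runs, key=lambda r: r["StartTime"])  # ISO-8601 strings sort chronologically

print(dict(status_counts), round(success_rate, 2), latest["PipelineExecutionStatus"])
```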
```diff
@@ -248,6 +311,10 @@ To run the pipeline manually:
 - `templates/`: HTML templates for the dashboard
 - `static/`: CSS and JavaScript files
 - `requirements.txt`: Dashboard-specific dependencies
+- `lambdas/`: AWS Lambda functions for custom APIs
+  - `data/`: Data pipeline API functions
+  - `ml/`: ML pipeline API functions
+  - `docs.json`: API documentation
 - `README.md`: This file
 
 ## Data Analysis Features
```

Binary assets:

- assets/dashboard-home.png: added (173 KB)
- assets/data-pipeline-details.png: added (1.89 MB)
- assets/flow-details.png: removed (2.09 MB)
- assets/flow-run-history.png: removed (2.58 MB)
- assets/ml-pipeline-details.png: added (1.18 MB)

dashboard/app.py (67 additions, 3 deletions)

```diff
@@ -14,6 +14,15 @@ def get_pipeline_details(pipeline_id):
     response = requests.get(f'https://es3ozkq7i8.execute-api.us-east-1.amazonaws.com/dev/data/pipeline/status?id={pipeline_id}')
     return response.json()
 
+def get_ml_pipelines():
+    response = requests.get('https://es3ozkq7i8.execute-api.us-east-1.amazonaws.com/dev/ml/pipeline')
+    print("ML Pipelines:", response.json())
+    return response.json()
+
+def get_ml_pipeline_details(pipeline_id):
+    response = requests.get(f'https://es3ozkq7i8.execute-api.us-east-1.amazonaws.com/dev/ml/pipeline/status?pipeline_id={pipeline_id}')
+    return response.json()
+
 def analyze_pipeline_runs(runs):
     # Convert string timestamps to datetime objects
     for run in runs:
```
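The new fetch helpers interpolate the pipeline id directly into the URL string. An equivalent, slightly safer pattern is to build the query string with `urlencode`, which percent-encodes ids containing special characters. This is a sketch of the alternative, not the committed code:

```python
from urllib.parse import urlencode

BASE = "https://es3ozkq7i8.execute-api.us-east-1.amazonaws.com/dev"


def ml_pipeline_status_url(pipeline_id: str) -> str:
    """Build the ML pipeline status URL with a properly escaped query string."""
    return f"{BASE}/ml/pipeline/status?{urlencode({'pipeline_id': pipeline_id})}"


# Characters like '/' or spaces in the id are escaped automatically.
print(ml_pipeline_status_url("my pipeline/1"))
```

With requests, the equivalent is passing `params={'pipeline_id': pipeline_id}` to `requests.get`, which performs the same encoding.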
```diff
@@ -46,12 +55,47 @@ def analyze_pipeline_runs(runs):
         'runs': sorted(runs, key=lambda x: x['created'], reverse=True)
     }
 
-# pipelines : [{"id": "4265d3d9-26cc-42ac-8fb7-b8e786796584", "created": "2025-03-24T14:47:28.193399Z", "updated": "2025-03-24T14:47:28.193419Z", "name": "Global Energy Transition Analysis Pipeline", "tags": [], "labels": {}}, {"id": "32e6109b-56eb-4f79-ae9c-a5513e800495", "created": "2025-03-24T12:55:48.498668Z", "updated": "2025-03-24T12:55:48.498685Z", "name": "Simple Data Pipeline", "tags": [], "labels": {}}]
+def analyze_ml_pipeline_runs(executions):
+    if not executions or 'PipelineExecutionSummaries' not in executions:
+        return {
+            'total_runs': 0,
+            'status_counts': {},
+            'avg_run_time': 0,
+            'success_rate': 0,
+            'latest_run': None,
+            'error_types': {},
+            'runs': []
+        }
+
+    runs = executions['PipelineExecutionSummaries']
+
+    # Calculate statistics
+    total_runs = len(runs)
+    status_counts = Counter(run['PipelineExecutionStatus'] for run in runs)
+    success_rate = (status_counts['Succeeded'] / total_runs * 100) if total_runs > 0 else 0
+
+    # Get latest run
+    latest_run = max(runs, key=lambda x: x['StartTime']) if runs else None
+
+    # Get error types
+    error_types = Counter(run['PipelineExecutionStatus'] for run in runs if run['PipelineExecutionStatus'] not in ['Succeeded'])
+
+    return {
+        'total_runs': total_runs,
+        'status_counts': dict(status_counts),
+        'success_rate': round(success_rate, 2),
+        'latest_run': latest_run,
+        'error_types': dict(error_types),
+        'runs': sorted(runs, key=lambda x: x['StartTime'], reverse=True)
+    }
 
 @app.route('/')
 def home():
-    pipelines = get_pipelines()
-    return render_template('index.html', pipelines=pipelines)
+    data_pipelines = get_pipelines()
+    ml_pipelines = get_ml_pipelines()
+    return render_template('index.html',
+                           data_pipelines=data_pipelines,
+                           ml_pipelines=ml_pipelines)
 
 @app.route('/pipeline/<pipeline_id>')
 def pipeline_details(pipeline_id):
```
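One subtlety in `analyze_ml_pipeline_runs`: it calls `max()` and `sorted()` directly on the `StartTime` values. That is safe as long as `StartTime` arrives as a zero-padded ISO-8601 string in a single timezone, because lexicographic order then equals chronological order. A small demonstration with made-up timestamps:

```python
from datetime import datetime

# Made-up execution summaries in the shape analyze_ml_pipeline_runs expects.
runs = [
    {"PipelineExecutionStatus": "Succeeded", "StartTime": "2025-03-24T09:15:00Z"},
    {"PipelineExecutionStatus": "Failed",    "StartTime": "2025-03-25T08:00:00Z"},
    {"PipelineExecutionStatus": "Succeeded", "StartTime": "2025-03-24T23:59:59Z"},
]

# Zero-padded ISO-8601 strings compare correctly as plain strings,
# which is what max()/sorted() over 'StartTime' relies on.
latest = max(runs, key=lambda r: r["StartTime"])

# Parsing gives the same answer, and stays correct if UTC offsets ever vary.
parsed_latest = max(
    runs,
    key=lambda r: datetime.fromisoformat(r["StartTime"].replace("Z", "+00:00")),
)
assert parsed_latest is latest
print(latest["StartTime"])  # → 2025-03-25T08:00:00Z
```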
```diff
@@ -73,5 +117,25 @@ def pipeline_details(pipeline_id):
                            pipeline=pipeline_info,
                            analysis=analysis)
 
+@app.route('/ml/pipeline/<pipeline_id>')
+def ml_pipeline_details(pipeline_id):
+    executions = get_ml_pipeline_details(pipeline_id)
+    if not executions or 'PipelineExecutionSummaries' not in executions:
+        return "Pipeline not found", 404
+
+    # Get pipeline info from the first execution
+    pipeline_info = {
+        'id': executions['PipelineExecutionSummaries'][0]['PipelineExecutionDetails']['PipelineArn'].split('/')[-1],
+        'name': executions['PipelineExecutionSummaries'][0]['PipelineExecutionDetails']['PipelineArn'].split('/')[-1],
+        'display_name': executions['PipelineExecutionSummaries'][0]['PipelineExecutionDisplayName']
+    }
+
+    # Analyze the executions
+    analysis = analyze_ml_pipeline_runs(executions)
+
+    return render_template('ml_pipeline_details.html',
+                           pipeline=pipeline_info,
+                           analysis=analysis)
+
 if __name__ == '__main__':
     app.run(debug=True)
```
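The new `ml_pipeline_details` route derives both `id` and `name` from the last path segment of the pipeline ARN. With an ARN of the shape documented in the README (the values below are illustrative, not real resources), the extraction looks like:

```python
# Illustrative execution summary following the documented response schema.
summary = {
    "PipelineExecutionDisplayName": "nightly-train",
    "PipelineExecutionDetails": {
        "PipelineArn": "arn:aws:sagemaker:us-east-1:123456789012:pipeline/energy-ml-pipeline"
    },
}

# Same extraction the route performs: the ARN's last '/'-segment is the pipeline name.
pipeline_info = {
    "id": summary["PipelineExecutionDetails"]["PipelineArn"].split("/")[-1],
    "name": summary["PipelineExecutionDetails"]["PipelineArn"].split("/")[-1],
    "display_name": summary["PipelineExecutionDisplayName"],
}
print(pipeline_info["id"])  # → energy-ml-pipeline
```

This only works for pipeline-level ARNs; an execution ARN (`.../execution/execution-id`) would yield the execution id instead, which is why the route reads `PipelineExecutionDetails.PipelineArn` rather than `PipelineExecutionArn`.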
