TCVAE trajectory embeddings for ATM operational analytics. Extract learned representations from generative models to discover flight patterns, detect outliers, and select representative trajectories without manual feature engineering.

SynthAIr/trajcluster

TCVAE Trajectory Embeddings for ATM Operational Analytics

This repository implements the time series embedding applications from "Dual-Purpose ATM Embeddings: From Synthetic Data Generation to Operational Analytics". It demonstrates how embeddings extracted from Temporal Convolutional Variational Autoencoders (TCVAE) can provide operational insights for Air Traffic Management (ATM).

📖 Overview

This project shows how generative models trained for trajectory synthesis can serve a dual purpose: their learned embeddings provide powerful analytical capabilities for operational pattern discovery, outlier detection, and representative trajectory extraction - all without manual feature engineering.

Key Applications Demonstrated

  1. Operational Pattern Identification: Discover distinct approach procedures and routing strategies through embedding-based clustering
  2. Trajectory Outlier Detection: Identify anomalous flight paths that deviate from standard operational patterns
  3. Representative Trajectory Extraction: Select core-set trajectories that preserve operational diversity while dramatically reducing dataset size
  4. Similarity Analysis: Quantify operational relationships between trajectories using embedding distances

🔬 TCVAE Embedding Architecture

The Temporal Convolutional VAE uses dilated causal convolutions to capture multi-scale temporal dependencies in flight trajectories:

  • Encoder: Stacked TCN layers with increasing dilation factors extract hierarchical temporal patterns
  • Latent Space: Fixed-dimensional embeddings preserve essential spatiotemporal characteristics
  • Applications: Embeddings enable downstream analysis tasks without trajectory reconstruction
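To make the encoder description concrete, here is a minimal single-channel sketch of a dilated causal convolution and a small stack of such layers. It is an illustration only, not the repository's TCN: the function names `causal_dilated_conv1d` and `stacked_tcn` are hypothetical, the layers share one set of weights for brevity, and the dilation of layer `i` is assumed to be `dilation_base ** i`.

```python
from typing import List, Sequence


def causal_dilated_conv1d(
    x: Sequence[float], weights: Sequence[float], dilation: int
) -> List[float]:
    """Single-channel dilated causal convolution (hypothetical helper).

    The output at time t depends only on x[t], x[t - d], x[t - 2d], ...
    Positions before the start of the series are treated as zeros.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation  # look back j * dilation steps, never forward
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def stacked_tcn(
    x: Sequence[float],
    weights: Sequence[float],
    dilation_base: int = 2,
    num_layers: int = 3,
) -> List[float]:
    """Apply num_layers causal convolutions with dilations base**0, base**1, ...

    Assumption: every layer reuses the same weights, purely to keep the sketch short.
    """
    h = list(x)
    for layer in range(num_layers):
        h = causal_dilated_conv1d(h, weights, dilation_base ** layer)
    return h


# A unit impulse shows how far one input sample spreads through the stack.
impulse = [1.0] + [0.0] * 9
print(stacked_tcn(impulse, [1.0, 1.0]))
```

With a kernel of width 2 and dilations 1, 2, 4, the impulse influences the first eight outputs, so each output of the stack sees eight past time steps. Increasing dilation is what lets deeper layers cover longer temporal context without larger kernels.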

Embedding Extraction Process

# Extract latent representations from trained TCVAE
latent_vectors = extract_latent_representations(model, trajectory_data)

# Apply analysis techniques
clusters = apply_hdbscan_clustering(latent_vectors)
outliers = detect_trajectory_outliers(latent_vectors, clusters)
representatives = select_representative_paths(latent_vectors, clusters)

🚀 Quick Start

Installation

# Install dependencies with Poetry
poetry install && poetry shell

# Start MLflow server for experiment tracking
mlflow server --host 127.0.0.1 --port 5000

Training TCVAE Models

Train embeddings on your trajectory data:

python scripts/train.py \
    --config configs/tcvae_config.yaml \
    --trajectories_dir data/ \
    --model_save_dir saved_models/

Extracting Embeddings and Analysis

Apply embedding-based analysis to discover operational patterns:

python scripts/cluster.py \
    --config configs/tcvae_config.yaml \
    --trajectories_dir data/ \
    --model_save_dir saved_models/ \
    --save_path results/

📊 Demonstrated Results

Our paper shows embedding applications on three datasets:

Dublin Airport Approaches

  • 7 distinct approach patterns automatically discovered
  • Outlier detection identifies weather diversions and emergency procedures
  • Representative extraction reduces 1000+ trajectories to 7 core examples

London Heathrow Approaches

  • 10 approach clusters reflecting complex airspace structure
  • Multiple arrival corridors captured in embedding space
  • Operational diversity preserved in representative trajectories

Vienna-London End-to-End Routes

  • Route-level patterns showing different airspace utilization strategies
  • Unusual routing patterns flagged as outliers
  • Core-set extraction enables efficient simulation scenarios

🔧 Configuration

Embedding extraction is configured via YAML files:

model:
  type: TCVAE
  encoding_dim: 64            # Embedding dimensionality
  h_dims: [64, 64, 64]        # Encoder hidden layers
  kernel_size: 16             # Convolution kernel width (with dilation, sets the temporal receptive field)
  dilation_base: 2            # Multi-scale pattern capture
  
data:
  features: ['latitude', 'longitude', 'altitude', 'speed', 'track']
  data_shape: 'image'         # Trajectory representation format
  
train:
  epochs: 500
  accelerator: 'gpu'

📈 Embedding Analysis Pipeline

The framework provides multiple analytical techniques:

1. Dimensionality Reduction

  • PCA: Reduces embedding dimensionality for clustering
  • UMAP/t-SNE: 2D visualization of embedding space structure

2. Pattern Discovery

  • HDBSCAN: Density-based clustering with automatic outlier detection
  • Gaussian Mixture Models: Probabilistic clustering for operational regimes

3. Similarity Analysis

  • Embedding distances: Quantify trajectory operational similarity
  • Medoid selection: Find most representative trajectory per cluster
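Medoid selection can be sketched in a few lines: for each cluster, pick the member whose summed distance to all other members is smallest. The example below is a plain-Python illustration, not the repository's implementation. The function name `select_medoids` is hypothetical, Euclidean distance is an assumed metric, and label `-1` is treated as noise, following the HDBSCAN convention.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence


def select_medoids(
    embeddings: Sequence[Sequence[float]], labels: Sequence[int]
) -> Dict[int, int]:
    """Return {cluster_label: index of the medoid embedding}.

    The medoid is the cluster member with the smallest total Euclidean
    distance to the other members. Points labelled -1 (noise) are skipped.
    """
    members: Dict[int, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        if label != -1:
            members[label].append(idx)

    medoids: Dict[int, int] = {}
    for label, idxs in members.items():
        medoids[label] = min(
            idxs,
            key=lambda i: sum(math.dist(embeddings[i], embeddings[j]) for j in idxs),
        )
    return medoids


# Toy 2D "embeddings": two clusters and one noise point.
embeddings = [[0, 0], [1, 0], [2, 0], [10, 10], [10, 11], [10, 12], [50, 50]]
labels = [0, 0, 0, 1, 1, 1, -1]
print(select_medoids(embeddings, labels))  # {0: 1, 1: 4}
```

Unlike a centroid, a medoid is always an actual trajectory from the dataset, which is why it suits the selection of representative flights. The pairwise loop is quadratic in cluster size, so a vectorized distance matrix would be preferable for large clusters.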

4. Outlier Detection

  • Density-based: HDBSCAN identifies trajectories outside cluster boundaries
  • Operational significance: Outliers indicate unusual procedures or conditions
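The intuition behind density-based outlier detection can be shown with a much simpler stand-in than HDBSCAN: score each embedding by the distance to its k-th nearest neighbour and flag points whose score is far above the typical value. This is a deliberately simplified k-nearest-neighbour heuristic, not HDBSCAN and not the repository's method. The function names and the `k` and `factor` parameters are hypothetical choices for illustration.

```python
import math
import statistics
from typing import List, Sequence


def knn_distance_scores(
    embeddings: Sequence[Sequence[float]], k: int = 2
) -> List[float]:
    """Distance from each point to its k-th nearest neighbour.

    Large scores indicate points in sparse regions of the embedding space.
    """
    if not 1 <= k < len(embeddings):
        raise ValueError("k must satisfy 1 <= k < number of embeddings")
    scores = []
    for i, a in enumerate(embeddings):
        dists = sorted(
            math.dist(a, b) for j, b in enumerate(embeddings) if j != i
        )
        scores.append(dists[k - 1])
    return scores


def flag_outliers(
    embeddings: Sequence[Sequence[float]], k: int = 2, factor: float = 3.0
) -> List[int]:
    """Indices whose k-NN distance exceeds factor times the median score."""
    scores = knn_distance_scores(embeddings, k)
    threshold = factor * statistics.median(scores)
    return [i for i, s in enumerate(scores) if s > threshold]


# Four tightly grouped points and one isolated point.
points = [[0, 0], [0, 1], [1, 0], [1, 1], [20, 20]]
print(flag_outliers(points))  # [4]
```

HDBSCAN goes further by building a cluster hierarchy from such density estimates and assigning the noise label `-1` to points that belong to no stable cluster, so it needs no global threshold like `factor`.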

🎯 Key Advantages

Automatic Feature Learning

No manual feature engineering required - embeddings capture operational patterns directly from raw trajectory data.

Unified Analysis Framework

Same embedding approach works across different trajectory types (approaches, routes, complete flights).

Computational Efficiency

Once the model is trained, extracting an embedding requires only a forward pass through the encoder, which keeps the cost low enough to support real-time operational analysis.

Operational Interpretability

Clusters correspond to meaningful operational patterns; outliers indicate investigation-worthy anomalies.

📚 Data Format

Trajectory data should be provided as pickle files containing Traffic objects with:

# Required trajectory features
features = ['latitude', 'longitude', 'altitude', 'speed', 'track']

# Load a Traffic object (from the `traffic` library) from a pickle file
from traffic.core import Traffic

trajectories = Traffic.from_file("trajectory_data.pkl")
# Each trajectory: sequence of spatiotemporal measurements
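Convolutional encoders generally expect inputs of a fixed length, so trajectories with different numbers of samples usually need resampling first. The sketch below shows one way this could be done with linear interpolation in plain Python. It assumes each trajectory is available as a dictionary of per-feature lists, which is a hypothetical intermediate format, and the helper names `resample_linear` and `to_fixed_length` are illustrative; the repository's own preprocessing may work differently.

```python
from typing import Dict, List, Sequence

FEATURES = ["latitude", "longitude", "altitude", "speed", "track"]


def resample_linear(values: Sequence[float], num_samples: int) -> List[float]:
    """Linearly interpolate a sequence onto num_samples evenly spaced positions."""
    if num_samples < 2 or len(values) < 2:
        raise ValueError("need at least two input values and two output samples")
    n = len(values)
    out = []
    for i in range(num_samples):
        pos = i * (n - 1) / (num_samples - 1)  # fractional index into values
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def to_fixed_length(
    trajectory: Dict[str, Sequence[float]],
    num_samples: int,
    features: Sequence[str] = FEATURES,
) -> List[List[float]]:
    """Return a (num_features x num_samples) matrix for one trajectory.

    Caveat: 'track' is an angle, so plain linear interpolation is only
    reasonable when consecutive samples do not wrap around 360 degrees.
    """
    return [resample_linear(trajectory[name], num_samples) for name in features]


print(resample_linear([0.0, 10.0, 20.0], 5))  # [0.0, 5.0, 10.0, 15.0, 20.0]
```

Feature scaling (for example, min-max normalization per feature) would typically follow this step before training, since the five features have very different numeric ranges.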

🔬 Research Applications

This implementation enables:

  • Operational benchmarking: Compare airport approach procedures
  • Safety analysis: Identify unusual trajectory patterns for investigation
  • Efficiency optimization: Extract representative trajectories for scenario planning
  • Simulation: Generate realistic trajectory sets from learned patterns
  • Monitoring: Real-time detection of operational anomalies

📚 Citation

@article{murad2024dual,
  title={Dual-Purpose ATM Embeddings: From Synthetic Data Generation to Operational Analytics},
  author={Murad, Abdulmajid and Ruocco, Massimiliano},
  journal={[arXiv]},
  year={2024}
}

🏗️ Repository Structure

├── src/trajcluster/
│   ├── models/tcvae.py     # TCVAE implementation
│   ├── networks/           # TCN and neural network components
│   ├── vae/                # VAE base classes and latent regularization
│   └── utils/              # Embedding extraction and analysis utilities
├── scripts/
│   ├── train.py            # Train TCVAE embedding models
│   └── cluster.py          # Extract embeddings and perform analysis
└── configs/                # Model and training configurations

🙏 Acknowledgments

This research was conducted within the SynthAIr project, funded by the SESAR Joint Undertaking under the European Union's Horizon Europe research and innovation program (grant agreement No. 101114847).
