This repository implements a Gaussian Mixture Model (GMM)–based framework for the segmentation of particle trajectories using stepwise displacement statistics. The method performs a statistical segmentation of displacement time series into distinct regimes.
The framework is fully unsupervised and does not rely on labeled training data or predefined segmentation thresholds. Any physical interpretation of the resulting segments (e.g., as diffusive states) must be performed a posteriori.
The project supports:
- Synthetic trajectory generation with known ground truth, enabling quantitative benchmarking.
- Segmentation of experimental trajectory data, where no ground truth is assumed.
A central feature of the approach is the automatic optimization of a Gaussian temporal filter, selected by minimizing overlap between mixture components.
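The filter selection described above can be sketched as a grid search over candidate filter widths. This is a minimal illustration, not the repository's implementation: the function names `component_overlap` and `select_filter_width` are hypothetical, the overlap is measured here with the Bhattacharyya coefficient between two one-dimensional Gaussian components (the overlap metric actually used is not stated in this README), and two mixture components are assumed.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from sklearn.mixture import GaussianMixture


def component_overlap(gmm):
    """Bhattacharyya coefficient of the two 1D components (assumed overlap metric).

    Returns a value in (0, 1]: 1 means identical components, values near 0
    mean well-separated components.
    """
    mu = gmm.means_.ravel()
    var = gmm.covariances_.ravel()
    distance = (
        0.25 * (mu[0] - mu[1]) ** 2 / (var[0] + var[1])
        + 0.5 * np.log((var[0] + var[1]) / (2.0 * np.sqrt(var[0] * var[1])))
    )
    return float(np.exp(-distance))


def select_filter_width(step_sizes, sigmas):
    """Return the candidate sigma whose smoothed data gives the least overlap."""
    step_sizes = np.asarray(step_sizes, dtype=float)
    best_sigma, best_overlap = None, np.inf
    for sigma in sigmas:
        # Smooth the displacement series in time with a Gaussian kernel.
        smoothed = gaussian_filter1d(step_sizes, sigma=sigma)
        # Fit a two-component mixture to the smoothed values.
        gmm = GaussianMixture(n_components=2, random_state=0)
        gmm.fit(smoothed.reshape(-1, 1))
        overlap = component_overlap(gmm)
        if overlap < best_overlap:
            best_sigma, best_overlap = sigma, overlap
    return best_sigma, best_overlap
```

In this sketch the candidate grid bounds the amount of smoothing, which matters because wide filters also blur the transitions between states.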
Many physical and biological systems exhibit state-dependent diffusion, where particles alternate between regimes characterized by distinct diffusion coefficients. Accurately identifying these states from noisy trajectory data is a challenging inference problem.
This framework addresses the problem by:
- Representing trajectories through displacement statistics
- Modeling these statistics using Gaussian mixture models
- Optimizing temporal smoothing to maximize state separability
- Producing a discrete state barcode for each trajectory
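The four steps above can be sketched end to end as follows. This is an illustrative outline under stated assumptions, not the code in this repository: `segment_trajectory` is a hypothetical name, the displacement statistic is taken to be the step length between consecutive frames, the filter width `sigma` is fixed rather than optimized, and states are relabeled so that 0 is the component with the smallest mean.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from sklearn.mixture import GaussianMixture


def segment_trajectory(x, y, sigma=2.0, n_states=2):
    """Return a per-step state barcode for one 2D trajectory (sketch)."""
    # 1. Displacement statistics: step length between consecutive frames.
    steps = np.hypot(np.diff(x), np.diff(y))
    # 2. Temporal smoothing with a Gaussian filter of fixed width.
    smoothed = gaussian_filter1d(steps, sigma=sigma).reshape(-1, 1)
    # 3. Gaussian mixture model fitted to the smoothed displacements.
    gmm = GaussianMixture(n_components=n_states, random_state=0).fit(smoothed)
    labels = gmm.predict(smoothed)
    # 4. Barcode: relabel so that state 0 has the smallest component mean.
    order = np.argsort(gmm.means_.ravel())
    rank = np.empty_like(order)
    rank[order] = np.arange(n_states)
    return rank[labels]
```

A trajectory with N positions yields a barcode of length N - 1, one label per step.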
- Synthetic two-state diffusion simulator: data_gen.py
- GMM segmentation and accuracy evaluation on synthetic data: GMM_Seg_synthetic.py
- GMM segmentation of experimental trajectory data: GMM_seg_exp.py
- raw-data/: Generated synthetic datasets
Simulates two-state diffusive trajectories with exponentially distributed state dwell times.
Key features:
- Alternating diffusion coefficients
- Optional temporal blurring (segment averaging)
- Optional localization noise
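A simulator with these features might look like the sketch below. It is not the contents of data_gen.py: the function name `simulate_two_state` and all parameter names are hypothetical, temporal blurring is interpreted here as averaging `n_sub` fine-grained positions per frame, and localization noise is modeled as independent Gaussian noise added to each coordinate.

```python
import numpy as np


def simulate_two_state(n_frames, d_slow, d_fast, mean_dwell, dt=1.0,
                       loc_noise=0.0, n_sub=1, seed=None):
    """Simulate a 2D two-state diffusive trajectory (illustrative sketch).

    mean_dwell is the mean state dwell time in frames. Returns the positions,
    shape (n_frames, 2), and the ground-truth state per frame (0 = slow).
    """
    rng = np.random.default_rng(seed)
    n_fine = n_frames * n_sub
    dt_fine = dt / n_sub

    # Alternate between the two states with exponentially distributed dwells.
    states = np.empty(n_fine, dtype=int)
    pos, state = 0, int(rng.integers(2))
    while pos < n_fine:
        dwell = max(1, int(round(rng.exponential(mean_dwell) * n_sub)))
        states[pos:pos + dwell] = state
        pos += dwell
        state = 1 - state

    # Brownian increments: variance 2 * D * dt per axis.
    d = np.where(states == 0, d_slow, d_fast)
    increments = rng.normal(size=(n_fine, 2)) * np.sqrt(2.0 * d * dt_fine)[:, None]
    fine_xy = np.cumsum(increments, axis=0)

    # Optional temporal blurring: average the n_sub sub-positions of each frame.
    xy = fine_xy.reshape(n_frames, n_sub, 2).mean(axis=1)
    # Ground truth per frame: the majority state among its sub-steps.
    frame_states = (states.reshape(n_frames, n_sub).mean(axis=1) >= 0.5).astype(int)

    # Optional localization noise on each coordinate.
    if loc_noise > 0:
        xy = xy + rng.normal(0.0, loc_noise, size=xy.shape)
    return xy, frame_states
```

With `n_sub=1` and `loc_noise=0.0` the sketch reduces to plain two-state Brownian motion sampled once per frame.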
Applies the GMM-based segmentation pipeline to synthetic trajectories and evaluates accuracy against ground truth.
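One way to score a two-state barcode against ground truth is sketched below. The helper `segmentation_accuracy` is hypothetical, and the label handling is an assumption: because mixture component indices are arbitrary, the sketch reports the better of the direct and the swapped labeling. How GMM_Seg_synthetic.py resolves label order is not stated in this README.

```python
import numpy as np
from sklearn.metrics import accuracy_score


def segmentation_accuracy(true_states, predicted_states):
    """Fraction of correctly labeled steps for a two-state barcode (sketch)."""
    true_states = np.asarray(true_states)
    predicted_states = np.asarray(predicted_states)
    # Accuracy with the labels as given.
    direct = accuracy_score(true_states, predicted_states)
    # Accuracy with predicted labels 0 and 1 exchanged.
    swapped = accuracy_score(true_states, 1 - predicted_states)
    return max(direct, swapped)
```

If the predicted states are already ordered by a physical quantity, such as the component mean, the swap is unnecessary and the direct score alone can be used.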
Applies the same segmentation framework to experimental particle tracking data.
Expected input columns:
- TRACK_ID (Particle label)
- FRAME
- POSITION_X
- POSITION_Y
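A table with these columns can be turned into per-track displacement series as sketched below. The function `stepwise_displacements` is a hypothetical helper, not part of GMM_seg_exp.py, and it assumes the table is already loaded into a pandas DataFrame, since the input file format is not specified here.

```python
import numpy as np
import pandas as pd

REQUIRED_COLUMNS = ["TRACK_ID", "FRAME", "POSITION_X", "POSITION_Y"]


def stepwise_displacements(table):
    """Map each TRACK_ID to its array of frame-to-frame step lengths (sketch)."""
    missing = [col for col in REQUIRED_COLUMNS if col not in table.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    steps = {}
    for track_id, track in table.groupby("TRACK_ID"):
        # Rows are not assumed to be ordered in time, so sort by frame first.
        track = track.sort_values("FRAME")
        dx = np.diff(track["POSITION_X"].to_numpy(dtype=float))
        dy = np.diff(track["POSITION_Y"].to_numpy(dtype=float))
        steps[track_id] = np.hypot(dx, dy)
    return steps
```

The sketch treats consecutive rows of a track as consecutive frames; tracks with missing frames would need gap handling before their steps are comparable.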
Synthetic data generation: python data_gen.py
Segmentation of synthetic data: python GMM_Seg_synthetic.py
Segmentation of experimental data: python GMM_seg_exp.py
- Python ≥ 3.8
- NumPy
- SciPy
- scikit-learn
- pandas