Add DTW, nDTW, and SDTW trajectory metrics #12

Open
nalinraut wants to merge 1 commit into AmeyaWagh:main from nalinraut:feature/dtw-metrics

Conversation

@nalinraut

Add Dynamic Time Warping based metrics for evaluating trajectories that may have different lengths or temporal alignment. These metrics are particularly useful for evaluating VLA models and policies using action chunking (e.g., ACT, Diffusion Policy).

New metrics:

  • DTWDistance: Raw DTW distance using dynamic programming (lower=better)
  • NormalizedDTW: Mapped to [0,1] using exp(-DTW/(|R|*d)) (higher=better)
  • SuccessWeightedDTW: nDTW weighted by task success (SDTW = nDTW * Success)
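As a rough illustration of how the three quantities relate, here is a minimal pure-Python sketch. It is not the code from this PR: the function names, the Euclidean point cost, and the `threshold` argument (standing in for `d`, which the cited paper treats as a success-threshold distance) are assumptions made for the example.

```python
import math
from typing import Sequence

Point = Sequence[float]


def dtw_distance(predicted: Sequence[Point], reference: Sequence[Point]) -> float:
    """Raw DTW distance with a Euclidean point cost (lower is better)."""
    n, m = len(predicted), len(reference)
    inf = float("inf")
    # acc[i][j] = cheapest cost of aligning predicted[:i] with reference[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(predicted[i - 1], reference[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def normalized_dtw(
    predicted: Sequence[Point], reference: Sequence[Point], threshold: float
) -> float:
    """nDTW = exp(-DTW / (|R| * d)); `threshold` plays the role of d (assumed)."""
    dtw = dtw_distance(predicted, reference)
    return math.exp(-dtw / (len(reference) * threshold))


def success_weighted_dtw(
    predicted: Sequence[Point],
    reference: Sequence[Point],
    threshold: float,
    success: bool,
) -> float:
    """SDTW = nDTW * Success, so a failed episode scores 0."""
    return normalized_dtw(predicted, reference, threshold) * float(success)


reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
hesitant = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # pauses at the start
offset = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]  # same shape, shifted by 1

print(dtw_distance(hesitant, reference))
print(normalized_dtw(offset, reference, threshold=1.0))
print(success_weighted_dtw(hesitant, reference, threshold=1.0, success=False))
```

The `hesitant` trajectory has one more step than the reference, yet its repeated first point aligns to the same reference point at zero cost, which is the behavior a per-step metric such as MSE cannot express.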

Key features:

  • Support for trajectories of different lengths (core advantage over MSE/ATE)
  • Tolerates temporal misalignment (hesitation, speed differences)
  • Optional custom normalization factor
  • Full torchmetrics.Metric compatibility with distributed training support
  • Comprehensive test suite and example usage

Reference: Ilharco et al., "General Evaluation for Instruction Conditioned Navigation using Dynamic Time Warping," arXiv:1907.05446, NeurIPS ViGIL Workshop, 2019.

from torchmetrics import Metric


def _compute_dtw(predicted: Tensor, reference: Tensor) -> Tensor:
Owner

Does this need to be an independent function? Can it be part of the metric class?

accumulated[0, 0] = cost_matrix[0, 0]

# Initialize first column
for i in range(1, t_pred):
Owner

Can we avoid loops and use torch.linspace to index instead?
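For context on what a loop-free version of this step could look like: the first row and first column of the accumulated matrix are running sums of the cost matrix, so `torch.cumsum` is one possible replacement for the initialization loops. The sketch below is an illustration, not code from this PR; the helper name `init_borders` is hypothetical, and it assumes `cost_matrix` is a floating-point tensor of shape `(t_pred, t_ref)`.

```python
import torch


def init_borders(cost_matrix: torch.Tensor) -> torch.Tensor:
    """Fill the first row and column of the accumulated-cost matrix without loops."""
    # Interior cells start at +inf; the DTW recurrence would overwrite them later.
    accumulated = torch.full_like(cost_matrix, float("inf"))
    # Along a border there is only one possible predecessor, so the accumulated
    # cost is a plain cumulative sum of the per-cell costs.
    accumulated[:, 0] = torch.cumsum(cost_matrix[:, 0], dim=0)
    accumulated[0, :] = torch.cumsum(cost_matrix[0, :], dim=0)
    return accumulated


cost = torch.tensor([[1.0, 2.0, 3.0],
                     [4.0, 5.0, 6.0]])
borders = init_borders(cost)
print(borders)
```

The interior cells are a different matter: each depends on its left, upper, and upper-left neighbors, so that part of the recurrence cannot be removed by indexing alone. Cells on the same anti-diagonal are independent of one another, so processing one anti-diagonal at a time is a possible way to reduce the Python-level loop count.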
