Export artifacts workflow and decoder export fixes #3

Open
rbavery wants to merge 13 commits into main from export-artifacts

rbavery commented Feb 5, 2026

Summary

  • End-to-end export of the SAM3 model to a single pt2 artifact, including:
    • image encoder, text encoder, encoder feature map fusion, and decoder
  • Add integration tests for each sub-module that confirm output equality with eager mode.
  • Test input dynamism for batch size, number of prompts, and variable image size. Variable image size is supported through an extra padding-mask argument, with inputs padded to a fixed size.
  • Add per-module inference benchmarks and export/load timing benchmarks.
  • Allow configuring num_feature_levels, which dictates how many feature maps the encoder produces. More feature maps give better performance across object scales. The option is exposed as a CLI flag in the export and benchmark scripts, but only for some modules; it is not yet supported for the end-to-end SAM3 pipeline.
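
To make the padding-mask idea concrete, here is a minimal sketch in plain Python (nested lists standing in for tensors). The function name, the bottom/right padding side, and the mask convention (True marks padded pixels) are assumptions for illustration, not necessarily what the PR implements.

```python
def pad_image_to_fixed(image, target_h, target_w, fill=0.0):
    """Pad a 2D image (list of rows) on the bottom/right to a fixed size.

    Returns (padded, mask). In this sketch mask[y][x] is True where the
    pixel is padding; the real convention in the PR may differ.
    """
    h = len(image)
    w = len(image[0]) if h else 0
    if h > target_h or w > target_w:
        raise ValueError(f"image {h}x{w} exceeds target {target_h}x{target_w}")

    # Extend each existing row to the target width, then append full rows.
    padded = [list(row) + [fill] * (target_w - w) for row in image]
    padded += [[fill] * target_w for _ in range(target_h - h)]

    # Build the mask with the same layout: False over real pixels.
    mask = [[False] * w + [True] * (target_w - w) for _ in range(h)]
    mask += [[True] * target_w for _ in range(target_h - h)]
    return padded, mask


padded, mask = pad_image_to_fixed([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], 4, 4)
```

With a fixed padded size, the exported graph can presumably keep static spatial dimensions while the mask tells downstream modules which pixels to ignore.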

Timing Results (CUDA 3090 24GB)

Feature levels = 1
Input image: 1008x1008 (fixed shape that sam3 supports by default)

  • Inference (full pipeline): 0.435s

  • Inference (image encoder): 0.297s

  • Inference (text encoder): 0.011s

  • Inference (encoder fusion): 0.081s

  • Inference (decoder only): 0.050s

  • Load full pipeline to CUDA: 19.848s
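
As a quick sanity check on the numbers above, the per-module timings can be summed and compared against the full-pipeline figure. This is plain arithmetic on the reported values; the variable names are made up for the example.

```python
# Reported inference timings in seconds (feature levels = 1, 1008x1008 input).
full_pipeline_s = 0.435
module_s = {
    "image encoder": 0.297,
    "text encoder": 0.011,
    "encoder fusion": 0.081,
    "decoder only": 0.050,
}

# Sum of the per-module timings.
module_sum_s = sum(module_s.values())

# Each module's time as a fraction of the full-pipeline time.
shares = {name: t / full_pipeline_s for name, t in module_s.items()}

print(f"sum of modules: {module_sum_s:.3f}s vs full pipeline: {full_pipeline_s:.3f}s")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The modules sum to 0.439s, within 0.004s of the 0.435s full-pipeline figure, and the image encoder accounts for roughly 68% of the full-pipeline time.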

Feature levels = 2

CPU export status

  • Attempted SAM3_EXPORT_FORCE_CPU=1 export for test_decoder_export_static.
  • Result: failed with GuardOnDataDependentSymNode in scaled_dot_product_attention (decoder RPB cross-attn). CPU export is not currently reliable.
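
For readers unfamiliar with the flag, a device override like SAM3_EXPORT_FORCE_CPU could be read along these lines. This is a hypothetical sketch: the helper name and the exact parsing ("1" means force CPU) are assumptions, and the PR's tests may handle the variable differently.

```python
import os


def resolve_export_device(cuda_available, env=None):
    """Pick the export device, honoring SAM3_EXPORT_FORCE_CPU=1.

    Hypothetical helper: `env` defaults to os.environ and is injectable
    so the logic can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    force_cpu = env.get("SAM3_EXPORT_FORCE_CPU", "0") == "1"
    if force_cpu or not cuda_available:
        return "cpu"
    return "cuda"
```

Given the GuardOnDataDependentSymNode failure above, such an override is currently useful for reproducing the CPU issue rather than for producing usable artifacts.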

Model/data changes

  • sam3/model/decoder.py: rework presence-token masking and cross-attn path to support RPB masks via scaled_dot_product_attention without MHA guard issues; relax coordinate cache asserts for export stability.
  • sam3/model/encoder.py: fix tensor-dim check (x.dim() vs x.dim).
  • sam3/model/geometry_encoders.py: guard pin_memory and fix ROIAlign input layout to avoid export issues.
  • sam3/model/position_encoding.py: avoid caching with symbolic shapes to prevent export SymInt guard errors.
  • sam3/model_builder.py: thread num_feature_levels into transformer/model construction for dynamic experiments (defaults unchanged).
  • sam3/train/data/*: clean up unused loop indices (no behavior change).
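
The x.dim() vs x.dim fix in encoder.py is easy to miss, so here is a small illustration of why the unparenthesized form is a bug. It uses a stand-in class instead of torch.Tensor, and the comparison against 4 is a made-up example, not the actual check in the encoder.

```python
class FakeTensor:
    """Minimal stand-in for a tensor with a dim() method, for illustration."""

    def __init__(self, shape):
        self.shape = tuple(shape)

    def dim(self):
        # Number of dimensions, mirroring the tensor API.
        return len(self.shape)


x = FakeTensor((1, 256, 72, 72))

# Bug: x.dim is a bound method, so comparing it to an int is always False.
buggy_check = x.dim == 4

# Fix: call the method to get the actual number of dimensions.
fixed_check = x.dim() == 4
```

Because the buggy comparison never raises, the wrong branch is taken silently, which is why this kind of error tends to surface only in tests like the export equality checks.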

Notes

  • Multi‑feature‑level support is limited: the decoder asserts spatial_shapes.shape[0] == 1 when box RPB is enabled, so full pipeline/decoder‑only export and inference fail for num_feature_levels > 1. This likely impacts small‑object performance relative to a true multi‑scale feature stack.
  • Load timing requires torchvision.ops to be imported to register roi_align before torch.export.load.
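
Both notes can be sketched in a few lines. The helper names below are hypothetical; only torch.export.load and the torchvision.ops import come from the notes themselves, and the guard simply mirrors the decoder assertion described above rather than reproducing the PR's code.

```python
def check_feature_levels(num_feature_levels, box_rpb_enabled):
    """Fail early instead of hitting the decoder assert at export time.

    Hypothetical guard mirroring `spatial_shapes.shape[0] == 1` when
    box RPB is enabled.
    """
    if box_rpb_enabled and num_feature_levels != 1:
        raise ValueError(
            f"num_feature_levels={num_feature_levels} is not supported with "
            "box RPB enabled; the decoder expects a single feature level"
        )


def load_exported_program(path):
    """Load a saved exported program with roi_align registered first."""
    # Imports are deferred so the guard above is usable without torch installed.
    import torch
    import torchvision.ops  # noqa: F401  imported for its side effect: registers roi_align

    return torch.export.load(path)
```

Calling check_feature_levels before export keeps the failure message short, and importing torchvision.ops before torch.export.load avoids a load failure on the unregistered roi_align op.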

generatedunixname537391475639613 and others added 11 commits January 27, 2026 04:54
Reviewed By: JuanBesa

Differential Revision: D91383480

fbshipit-source-id: 5b98627fb679c7c704c1a2faba9722e3a6f2ec20
Reviewed By: JuanBesa

Differential Revision: D91210167

fbshipit-source-id: a563232f4bc82f6f3b99e53df1c88cf0f39747bb
Simplify benchmarks/tests to avoid duplicated work and keep export usage consistent.
Introduce minimal export/test scripts for the full pipeline and add dedicated export/load timing benchmarks.
Allow configuring num_feature_levels for image models and add/extend export, load, and inference benchmarks.
Remove redundant checks and keep feature-level guard output concise.
Remove tracked export stderr logs and export progress doc, ignore new logs, and mark export tests with the slow marker.
Declare the slow pytest marker to avoid warnings for export benchmarks.
@rbavery rbavery marked this pull request as ready for review February 7, 2026 02:25
Use 2-input full pipeline export in tests and scripts, log output shapes, and add a standalone 2-input export helper. This keeps prompt count dynamic for export artifacts while preserving benchmarks and artifact validation.
Enable 2-input full pipeline export