
Dynamic 3D Scene Graphs from RGB-D

Daniel Korth (1,2,*), Xavier Anadon (1,3,*), Marc Pollefeys (1,4), Zuria Bauer (1), Daniel Barath (1,5)
(1) ETH Zurich, (2) Technical University of Munich, (3) University of Zaragoza, (4) Microsoft, (5) HUN-REN SZTAKI
*Equal contribution

Project Page | Video

teaser.mp4

Work done during 2-month ETH Summer Research Fellowship (SSRF & RSF). Advised by Zuria Bauer and Daniel Barath.

tl;dr: RGB-D recording + camera poses -> SAM2 Video Tracking -> Lift Mask + Features to 3D -> Scene Graph.
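The "Lift Mask + Features to 3D" step of the pipeline boils down to standard pinhole back-projection of the masked depth pixels into world coordinates. A minimal sketch with hypothetical names and a millimeter depth scale assumed (the repo's scripts are the authoritative implementation):

```python
import numpy as np

def backproject_mask(depth, mask, K, cam_to_world, depth_scale=1000.0):
    """Lift masked depth pixels into 3D world-space points.

    depth:        (H, W) depth image (e.g. uint16 millimeters)
    mask:         (H, W) bool segmentation mask (e.g. from SAM2)
    K:            (3, 3) camera intrinsics
    cam_to_world: (4, 4) camera-to-world pose
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    v, u = np.nonzero(mask)                      # pixel coordinates inside the mask
    z = depth[v, u].astype(np.float64) / depth_scale
    valid = z > 0                                # drop invalid depth readings
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - cx) * z / fx                        # pinhole model
    y = (v - cy) * z / fy
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)  # homogeneous coords
    return (pts_cam @ cam_to_world.T)[:, :3]     # transform into the world frame
```

Per-object point clouds produced this way, together with per-mask image features, are what the scene-graph nodes are built from.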

Please check the project page for more details.

Installation

conda create -n dsg python=3.10
conda activate dsg
pip install -e .

# installing sam2
cd sam2
pip install -e .

# download sam2 checkpoints
sh scripts/download_sam.sh

Configuration

We use Hydra for configuration management.

Before running any scripts, you need to configure the paths and dataset settings:

Set Environment Variable

Set the PROJECT_ROOT environment variable to point to your project directory:

export PROJECT_ROOT=/path/to/your/dsg/project

Or add it to your shell profile (e.g., ~/.bashrc or ~/.zshrc):

echo "export PROJECT_ROOT=/path/to/your/dsg/project" >> ~/.bashrc
source ~/.bashrc

Directory Structure Setup

Our data structure follows the ZED extraction scripts, but you can use your own RGB-D data. If using different formats, adjust the paths in configs/paths/default.yaml and configs/video_tracking.yaml.

Default structure (from ZED extraction):

data/
  zed/
    your_recording_name/
      images/                    # Original RGB images
      poses.txt                  # Camera poses
      images_undistorted_crop/   # Undistorted RGB + depth images
        left000000.png          # Undistorted left camera images
        left000001.png
        ...
        leftXXXXXX.png
        depth000000.png         # Undistorted depth images
        depth000001.png
        ...
        depthXXXXXX.png
        intrinsics.txt           # Camera intrinsics (after undistortion)

Run (starting from existing RGB-D + poses)

If you already have RGB-D images and camera poses:

  1. Run SAM2 multitrack segmentation:

    # Process every 10th frame with max 100 frames
    python dsg/video_tracking.py recording=<recording_name> subsample=10 max_frames=100

    Check configs/video_tracking.yaml for all configurations.
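Reading the comment above, `subsample` and `max_frames` plausibly combine as "take every N-th frame, capped at a maximum count" (a guess at the semantics; the config file is authoritative):

```python
def select_frames(frame_ids, subsample=10, max_frames=100):
    """Take every `subsample`-th frame, capped at `max_frames` frames."""
    return frame_ids[::subsample][:max_frames]
```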

  2. Visualize and build scene graph:

    # Basic visualization
    python dsg/viz_rerun.py recording=<recording_name>
    
    # Advanced visualization with graph updates
    python dsg/viz_rerun_teaser.py recording=<recording_name>
    
    # Text-based feature retrieval
    python dsg/viz_clip_similarity.py recording=<recording_name>
    
    # Object reconstruction
    python dsg/viz_obj_reconstruction.py recording=<recording_name>
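Text-based feature retrieval, as in `viz_clip_similarity.py`, typically ranks scene-graph objects by cosine similarity between a text embedding and each object's aggregated image features. A minimal sketch of that ranking step (the embeddings here are placeholders; in the actual pipeline they come from CLIP):

```python
import numpy as np

def rank_objects(text_feat, object_feats):
    """Rank objects by cosine similarity to a text query feature.

    text_feat:    (D,) query embedding
    object_feats: (N, D) one aggregated embedding per scene-graph object
    Returns object indices, most similar first.
    """
    t = text_feat / np.linalg.norm(text_feat)
    o = object_feats / np.linalg.norm(object_feats, axis=1, keepdims=True)
    sims = o @ t                   # cosine similarity per object
    return np.argsort(-sims)       # descending order
```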

Run (starting from ZED recording)

If you have a raw ZED recording:

  1. Record data with ZED Mini camera and save as .svo2 file
  2. Extract frames and poses:
    bash scripts/extract_zed.sh
  3. Follow steps above.

Acknowledgments

Our work builds heavily on foundation models such as SAM, CLIP, and SALAD. We thank the authors for their work and open-source code.

Citing

@article{korth2025dynamic,
  author    = {Korth, Daniel and Anadon, Xavier and Pollefeys, Marc and Bauer, Zuria and Barath, Daniel},
  title     = {Dynamic 3D Scene Graphs from RGB-D},
  year      = {2025},
}

About

Construct a 3D Scene Graph from RGB-D data and handle dynamics/occlusions - ETH SSRF'25
