Open
Labels
enhancement (New feature or request)
Description
Problem
Currently the signal-derived pose estimation detects at most 1 person from a single ESP32 CSI stream. When 2+ people are in the room, only one is reported. This was confirmed during live testing with ESP32 hardware connected — GET /api/v1/pose/current returns 1 person even with 2 people present.
Root Cause
The current derive_pose_from_sensing() function in sensing-server/src/main.rs generates a single synthetic skeleton from aggregate CSI features (motion score, dominant frequency, spectral centroid). It has no mechanism to:
- Separate individual contributions from the CSI amplitude/phase data
- Estimate person count from the signal
- Generate distinct skeletons with different positions/poses
Proposed Solution (ADR-037)
Phase 1: Person Count Estimation
- Use CSI signal variance, eigenvalue spread, or spectral complexity to estimate the number of people
- Threshold-based approach: motion energy above N sigma suggests multiple occupants
- Frequency-domain decomposition: distinct motion frequencies indicate separate individuals
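The first two Phase 1 ideas can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the `noise_floor` and `ratio` parameters, and the calibration baseline are hypothetical and do not exist in the codebase. The eigenvalues are assumed to be computed elsewhere (for example from the SVD eigenstructure in field_model.rs). The number of significant eigenvalues is at best a proxy for occupant count.

```rust
/// Count eigenvalues of a CSI covariance matrix that stand clearly above
/// the noise floor. `noise_floor` and `ratio` are hypothetical tuning
/// parameters, not values taken from the project.
pub fn estimate_person_count(eigenvalues: &[f64], noise_floor: f64, ratio: f64) -> usize {
    eigenvalues
        .iter()
        .filter(|&&ev| ev > noise_floor * ratio)
        .count()
}

/// N-sigma test: does the current motion energy exceed the mean of a
/// calibration baseline by more than `n_sigma` sample standard deviations?
/// Returns false when the baseline is too short to estimate a variance.
pub fn exceeds_baseline(energy: f64, baseline: &[f64], n_sigma: f64) -> bool {
    if baseline.len() < 2 {
        return false;
    }
    let n = baseline.len() as f64;
    let mean = baseline.iter().sum::<f64>() / n;
    // Unbiased sample variance (divide by n - 1).
    let var = baseline.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    energy > mean + n_sigma * var.sqrt()
}
```

In practice the baseline would probably be recorded with a known occupancy (empty room or one person), so that the threshold separates "one" from "more than one" rather than "empty" from "occupied".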
Phase 2: Signal Decomposition
- ICA (Independent Component Analysis): Decompose CSI subcarrier matrix into independent source signals
- NMF (Non-negative Matrix Factorization): Separate additive contributions from multiple scatterers
- Clustering: Group subcarrier responses by spatial coherence to identify distinct reflectors
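To make the NMF option more concrete, here is a minimal sketch using the standard multiplicative update rules for the squared-error objective. It assumes the CSI amplitudes are arranged as a non-negative matrix V (subcarriers × time frames) and factored as V ≈ W·H, where each column of W would be one scatterer's subcarrier signature and each row of H its activity over time. All names here (`Matrix`, `nmf_step`, `init_factor`) are hypothetical, and the deterministic initialisation is only for reproducibility.

```rust
pub type Matrix = Vec<Vec<f64>>;

fn matmul(a: &Matrix, b: &Matrix) -> Matrix {
    let (n, k, m) = (a.len(), b.len(), b[0].len());
    let mut out = vec![vec![0.0; m]; n];
    for i in 0..n {
        for j in 0..m {
            for p in 0..k {
                out[i][j] += a[i][p] * b[p][j];
            }
        }
    }
    out
}

fn transpose(a: &Matrix) -> Matrix {
    let (n, m) = (a.len(), a[0].len());
    let mut out = vec![vec![0.0; n]; m];
    for i in 0..n {
        for j in 0..m {
            out[j][i] = a[i][j];
        }
    }
    out
}

/// Squared Frobenius norm of V - W*H.
pub fn reconstruction_error(v: &Matrix, w: &Matrix, h: &Matrix) -> f64 {
    let wh = matmul(w, h);
    let mut err = 0.0;
    for i in 0..v.len() {
        for j in 0..v[i].len() {
            err += (v[i][j] - wh[i][j]).powi(2);
        }
    }
    err
}

/// Deterministic, strictly positive initial factor (illustrative only).
pub fn init_factor(rows: usize, cols: usize) -> Matrix {
    (0..rows)
        .map(|i| (0..cols).map(|j| 0.5 + 0.1 * ((i * 7 + j * 3) % 5) as f64).collect())
        .collect()
}

/// One round of multiplicative updates:
///   H <- H .* (W^T V) ./ (W^T W H)
///   W <- W .* (V H^T) ./ (W H H^T)
pub fn nmf_step(v: &Matrix, w: &mut Matrix, h: &mut Matrix) {
    const EPS: f64 = 1e-12; // guards against division by zero
    let wt = transpose(&*w);
    let num = matmul(&wt, v);
    let den = matmul(&matmul(&wt, &*w), &*h);
    for i in 0..h.len() {
        for j in 0..h[i].len() {
            h[i][j] *= num[i][j] / (den[i][j] + EPS);
        }
    }
    let ht = transpose(&*h);
    let num = matmul(v, &ht);
    let den = matmul(&*w, &matmul(&*h, &ht));
    for i in 0..w.len() {
        for j in 0..w[i].len() {
            w[i][j] *= num[i][j] / (den[i][j] + EPS);
        }
    }
}
```

The rank (number of columns in W) would come from the Phase 1 person count estimate. Multiplicative updates keep both factors non-negative but only reach a local minimum, so results depend on initialisation.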
Phase 3: Multi-Skeleton Generation
- Map each decomposed signal component to a separate skeleton
- Use spatial diversity from subcarrier phase to estimate relative positions
- Kalman tracking per person with ID assignment via AETHER re-ID embeddings (ADR-024)
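The per-person tracking step might look something like the sketch below: one constant-velocity Kalman filter per axis, bundled into a track that carries a person ID. This is not the existing pose_tracker.rs implementation. `AxisKalman`, `Track`, and the noise values are hypothetical, the process noise is simplified to a diagonal term, and the association of measurements to tracks (the AETHER re-ID part) is left out entirely.

```rust
/// Constant-velocity Kalman filter for a single coordinate axis.
#[derive(Debug, Clone)]
pub struct AxisKalman {
    pub pos: f64,
    pub vel: f64,
    p: [[f64; 2]; 2], // state covariance
    q: f64,           // process noise added to the diagonal (simplified)
    r: f64,           // measurement noise variance
}

impl AxisKalman {
    pub fn new(initial_pos: f64, q: f64, r: f64) -> Self {
        Self { pos: initial_pos, vel: 0.0, p: [[1.0, 0.0], [0.0, 1.0]], q, r }
    }

    /// Predict step: x = F x, P = F P F^T + Q with F = [[1, dt], [0, 1]].
    pub fn predict(&mut self, dt: f64) {
        self.pos += self.vel * dt;
        let [[p00, p01], [p10, p11]] = self.p;
        self.p = [
            [p00 + dt * (p10 + p01) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ];
    }

    /// Update step with a position-only measurement (H = [1, 0]).
    pub fn update(&mut self, z: f64) {
        let s = self.p[0][0] + self.r; // innovation variance
        let k0 = self.p[0][0] / s;     // Kalman gain for position
        let k1 = self.p[1][0] / s;     // Kalman gain for velocity
        let innovation = z - self.pos;
        self.pos += k0 * innovation;
        self.vel += k1 * innovation;
        // P = (I - K H) P
        let [[p00, p01], [p10, p11]] = self.p;
        self.p = [
            [(1.0 - k0) * p00, (1.0 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ];
    }
}

/// One tracked person: a stable ID plus a filter per axis.
#[derive(Debug, Clone)]
pub struct Track {
    pub id: u32,
    pub x: AxisKalman,
    pub y: AxisKalman,
}

impl Track {
    pub fn new(id: u32, x0: f64, y0: f64) -> Self {
        // Noise values are placeholders, not tuned for real CSI data.
        Self { id, x: AxisKalman::new(x0, 1e-3, 0.1), y: AxisKalman::new(y0, 1e-3, 0.1) }
    }

    /// Advance the track by `dt` seconds and fuse one (x, y) measurement.
    pub fn step(&mut self, dt: f64, measured: (f64, f64)) {
        self.x.predict(dt);
        self.x.update(measured.0);
        self.y.predict(dt);
        self.y.update(measured.1);
    }
}
```

Keeping one `Track` per decomposed signal component means the ID stays attached to the filter state, so a re-ID step only has to decide which measurement goes to which track on each frame.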
Phase 4: Neural Model Enhancement
- Train multi-person model on MM-Fi dataset (ADR-015) which includes multi-person scenarios
- Use the RVF training pipeline (ADR-036) to fine-tune with recorded CSI data
- LoRA profile for multi-person specialization
Affected Components
| Component | Change Required |
|---|---|
| sensing-server/src/main.rs | derive_pose_from_sensing(): multi-person output |
| signal/src/ruvsense/field_model.rs | SVD eigenstructure for person count estimation |
| signal/src/ruvsense/pose_tracker.rs | Multi-target Kalman tracking |
| ruvector/src/viewpoint/fusion.rs | Multi-person fusion from multistatic array |
| nn/ | Multi-person inference head |
| ui/components/PoseDetectionCanvas.js | Already supports multi-person rendering |
| ui/utils/pose-renderer.js | Already iterates over persons[] array |
Constraints
- Single ESP32 node provides 1 TX × 1 RX × 56 subcarriers — limited spatial resolution
- Multi-person separation improves significantly with multiple ESP32 nodes (multistatic mesh, ADR-029)
- Signal-derived approach will have lower accuracy than neural model approach
- Person count estimation ceiling: ~3-4 people for single-node, ~8+ for mesh
Acceptance Criteria
- Person count estimation from CSI features (accuracy > 80% for 1-3 people)
- derive_pose_from_sensing() returns multiple persons when detected
- Each person has distinct position and keypoint coordinates
- Kalman tracking maintains person IDs across frames
- UI renders multiple skeletons simultaneously
- ADR-037 documents the approach and trade-offs
Related
- ADR-024: AETHER contrastive embeddings (re-ID)
- ADR-029: RuvSense multistatic sensing mode
- ADR-036: RVF training pipeline & UI
- PR #96: WebSocket fix, data source indicators, auto-start
- signal/src/ruvsense/field_model.rs: SVD room eigenstructure
- signal/src/ruvsense/pose_tracker.rs: Kalman tracker with re-ID