Skip to content

Refactor valve detection to use OBB for angle prediction#17

Open
jenscaa wants to merge 28 commits intomainfrom
add-yolov26obb-valve-prediction
Open

Refactor valve detection to use OBB for angle prediction#17
jenscaa wants to merge 28 commits intomainfrom
add-yolov26obb-valve-prediction

Conversation

@jenscaa
Copy link

@jenscaa jenscaa commented Mar 18, 2026

Summary

  • Splits the monolithic valve_detection.cpp into three focused files:
    depth_image_processing, pose_estimator, and valve_pose_ros
  • Replaces the old ValveDetector class with PoseEstimator (plane/normal estimation only)
  • Moves depth-to-color pixel projection into depth_image_processing as a free function
    using the full intrinsic + extrinsic pipeline, replacing the previous 1D x-offset approximation
  • Fixes sign error in d555_depth_to_color_extrinsic translation vector
  • Adds YOLO OBB (oriented bounding box) support with vision_msgs/Detection2DArray
  • Camera data (intrinsics, extrinsic) is now owned by the ROS node and passed explicitly

vortexuser and others added 25 commits February 14, 2025 19:09
…d x vector chosen to be the vector 90 degrees on z vector and rotated by angle about z vector
…orks well in good ligting, but when it is covered by a shadow the edge detection does not detect shit roflcopter
@jenscaa jenscaa requested review from jorgenfj and kluge7 March 18, 2026 19:53
@jenscaa jenscaa self-assigned this Mar 18, 2026
@kluge7 kluge7 mentioned this pull request Mar 19, 2026
…s, adjust hook stages, and add static analysis tools
Copy link
Contributor

@kluge7 kluge7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First round of reviews. Mainly improve docs and remove redundant code

detections_sub_topic: "/detections_output"
# ── Inputs ────────────────────────────────────────────────────────────────
depth_image_sub_topic: "/realsense/D555_409122300281_Depth"
color_image_sub_topic: "/realsense/D555_409122300281_Color"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this used for? Doesnt the code only need the depth image and detections?

Comment on lines +12 to +35
# Color camera intrinsics (always used – not overridden by camera_info)
# Source: /camera/camera/color/camera_info
color_fx: 452.121521
color_fy: 451.737976
color_cx: 438.254059
color_cy: 248.703217
color_image_width: 896
color_image_height: 504

# Distortion coefficients [k1, k2, p1, p2, k3] (plumb_bob model)
color_d1: -0.0548105500638485
color_d2: 0.0597584918141365
color_d3: 0.000486430013552308
color_d4: 0.00031599000794813
color_d5: -0.0192314591258764

# Depth camera intrinsics (fallback when depth camera_info is not published)
# Source: /camera/camera/depth/camera_info
depth_fx: 449.449707
depth_fy: 449.449707
depth_cx: 444.819946
depth_cy: 248.226578
depth_image_width: 896
depth_image_height: 504
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally the code should subscribe to a publisher sending out a CameraInfo topic, instead of this being hardcoded here. Currently we dont have any rosbags for testing this yet.

Comment on lines +38 to +39
valve_pose_stamped_pub_topic: "/valve_pose"
valve_poses_pub_topic: "/valve_poses"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are mainly for debugging and visualization in Foxglove, ideally we should be able to toggle this. This would require adding another param for if we want to visualize, and then in the code you use IF-statements to only publish if we are debugging

valve_pose_stamped_pub_topic: "/valve_pose"
valve_poses_pub_topic: "/valve_poses"
landmarks_pub_topic: "/valve_landmarks"
processed_image_pub_topic: "/valve_detection_image"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

valve_poses_pub_topic: "/valve_poses"
landmarks_pub_topic: "/valve_landmarks"
processed_image_pub_topic: "/valve_detection_image"
depth_colormap_pub_topic: "/valve_detection_depth_colormap"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can either be removed or added as a debugging topic (see other comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main feedback is add inline comments explaining the different operations inside the functions

Comment on lines +123 to +124
if (!coefficients || coefficients->values.size() < 4)
return Eigen::Vector3f::Zero();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 4? Add comment explaining

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again mainly add inline comments explaining

find_package(PCL REQUIRED)
find_package(pcl_conversions REQUIRED)

# <<< IMPORTANT >>>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gpt

<maintainer email="oleagaton@gmail.com">ole</maintainer>
<version>0.1.0</version>
<description>Valve pose estimation from YOLOv26 OBB detections + depth image.</description>
<maintainer email="you@your.org">you</maintainer>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add email

Comment on lines +342 to +343
publish_pose_array({}, depth->header);
publish_landmarks({}, depth->header);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this for?

Comment on lines +176 to +190
BoundingBox ValvePoseNode::undistort_bbox(const BoundingBox& bbox) const
{
if (cv_camera_matrix_.empty() || cv_dist_coeffs_.empty()) return bbox;

std::vector<cv::Point2f> pts{{bbox.center_x, bbox.center_y}};
std::vector<cv::Point2f> undistorted;
cv::undistortPoints(pts, undistorted,
cv_camera_matrix_, cv_dist_coeffs_,
cv::noArray(), cv_camera_matrix_);

BoundingBox result = bbox;
result.center_x = undistorted[0].x;
result.center_y = undistorted[0].y;
return result;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should undistort entire bbox, not just the center.

Comment on lines +386 to +387
if (detector_->compute_pose_from_depth(
depth_img, yolo_box, pose, ann_dbg, pln_dbg, true)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes more sense to create a struct containing:

result
result_valid

instead of having the function return a bool

Comment on lines +470 to +496
// Publish valve center positions as a PointCloud2 for Foxglove 3D view.
if (valve_points_pub_->get_subscription_count() > 0) {
pcl::PointCloud<pcl::PointXYZ> pts;
for (const auto& p : poses) {
pts.points.push_back({p.position.x(), p.position.y(), p.position.z()});
}
pts.width = static_cast<uint32_t>(pts.points.size());
pts.height = 1;
pts.is_dense = true;
sensor_msgs::msg::PointCloud2 pc_msg;
pcl::toROSMsg(pts, pc_msg);
pc_msg.header = pose_header;
valve_points_pub_->publish(pc_msg);
}

// Publish best (first) detection as PoseStamped for easy Foxglove 3D view.
if (!poses.empty()) {
geometry_msgs::msg::PoseStamped ps;
ps.header = pose_header;
ps.pose.position.x = poses[0].position.x();
ps.pose.position.y = poses[0].position.y();
ps.pose.position.z = poses[0].position.z();
ps.pose.orientation.x = poses[0].orientation.x();
ps.pose.orientation.y = poses[0].orientation.y();
ps.pose.orientation.z = poses[0].orientation.z();
ps.pose.orientation.w = poses[0].orientation.w();
pose_stamped_pub_->publish(ps);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need both of these?
Wont a PoseStampedArray give you both 3d position and orientation in the foxglove 3d view

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now this file does too much and its functions are too large.
split param decleration and node setup into different functions.

Also some functions in this file are not related to ros and can be moved out.

Also consider creating a small ros utils file for most of the message conversions and debug publishing so that the main file becomes less cluttered and easier to follow

Comment on lines +86 to +99
DepthColorExtrinsic d555_depth_to_color_extrinsic() {
DepthColorExtrinsic extr;
// Both depth and color optical frames share the same body-to-optical
// rotation, so R is identity.
extr.R = Eigen::Matrix3f::Identity();
// t = depth origin expressed in color optical frame.
// Color is 0.059 m in -y (rightward) of depth in body frame, so depth is
// +0.059 m in y (leftward) of color. Converting to optical frame:
// t_optical = R_body_to_optical * [0, +0.059, 0]
// = [[0,-1,0],[0,0,-1],[1,0,0]] * [0, 0.059, 0]
// = [-0.059, 0, 0]
extr.t = Eigen::Vector3f(-0.059f, 0.0f, 0.0f);
return extr;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoding is fine in this case but an atleast set these in the config

@kluge7
Copy link
Contributor

kluge7 commented Mar 19, 2026

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants