UX Enhancements: ROI box, visual preview, multi-prompting, and custom mask naming by psewdgb · Pull Request #8 · AyedaOk/sam3-tools

psewdgb · 2026-03-11T01:17:57Z

Hi again! As discussed, here is the PR containing the workflow optimizations for both segmentation scripts.

These changes give the user much more visual control and precision before the .pfm files are saved for Darktable, which prevents "blind" generations and saves a lot of time.

Here is the detailed list of the added features:

point_segmentation.py (Box + Points workflow)

Added a 2-step process: the script now calls cv2.selectROI at launch, so the user can draw a bounding box that isolates an object (e.g., separating skin from clothes) before placing positive/negative points.

Pressing Space validates the box (or skips the step if no box was drawn). The coordinates are then passed to SAM3's input_boxes parameter so the segmentation focuses on that region.
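For reviewers, here is a minimal sketch of the box step, not the exact code in the PR. cv2.selectROI returns an (x, y, w, h) tuple, while box prompts are usually given as corner coordinates, so the sketch converts between the two. The helper names roi_to_xyxy and select_box are hypothetical, and treating a zero-area rectangle as "skipped" is an assumption about how the empty case is handled.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]


def roi_to_xyxy(roi: Sequence[int]) -> Optional[Box]:
    """Convert an (x, y, w, h) rectangle to (x1, y1, x2, y2).

    Returns None when the rectangle has no area, which is how an empty or
    skipped selection is treated in this sketch (an assumption).
    """
    x, y, w, h = (int(v) for v in roi)
    if w <= 0 or h <= 0:
        return None
    return (x, y, x + w, y + h)


def select_box(image, window_name: str = "Draw box, then press Space") -> Optional[Box]:
    """Let the user draw a box on `image` and return it as (x1, y1, x2, y2)."""
    import cv2  # imported lazily so roi_to_xyxy stays usable without OpenCV

    roi = cv2.selectROI(window_name, image, showCrosshair=False, fromCenter=False)
    cv2.destroyWindow(window_name)
    return roi_to_xyxy(roi)
```

When select_box returns None, the script can simply omit input_boxes and fall back to points only. The exact nesting that input_boxes expects depends on the SAM3 wrapper in use, so it is left out here.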

text_segmentation.py (Preview & Multi-targets)

Visual Preview: Added a cv2 window that displays the generated mask in red over the image. The user can check the result visually and press Enter to approve it, or Esc to cancel the process entirely (so no bad .pfm file is generated for Darktable).
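The preview step could look roughly like the sketch below. It is an illustration under assumptions, not the code from the commit: the function names, the 50% blend factor, and accepting key code 10 as Enter are all choices made for this example. It assumes the image is a BGR uint8 array (as returned by cv2.imread) and the mask is a 2D array of the same height and width.

```python
import numpy as np

ENTER_KEYS = {13, 10}  # Enter is reported as 13 on most platforms, 10 on some
ESC_KEY = 27


def overlay_mask(image_bgr: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Return a copy of image_bgr with the masked pixels blended toward red."""
    out = image_bgr.astype(np.float32)
    red = np.array([0, 0, 255], dtype=np.float32)  # OpenCV stores colours as BGR
    selected = mask.astype(bool)
    out[selected] = (1.0 - alpha) * out[selected] + alpha * red
    return out.round().astype(np.uint8)


def key_to_decision(key: int):
    """Map a cv2.waitKey code to True (approve), False (cancel) or None (ignore)."""
    if key in ENTER_KEYS:
        return True
    if key == ESC_KEY:
        return False
    return None


def preview_and_confirm(image_bgr, mask, window_name: str = "Mask preview") -> bool:
    """Show the overlay and block until the user presses Enter or Esc."""
    import cv2  # imported lazily; only needed for the interactive part

    cv2.imshow(window_name, overlay_mask(image_bgr, mask))
    try:
        while True:
            decision = key_to_decision(cv2.waitKey(0) & 0xFF)
            if decision is not None:
                return decision
    finally:
        cv2.destroyWindow(window_name)
```

Keeping the blending and the key mapping separate from the window code makes both easy to check without opening a GUI.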

Multi-Prompting (Comma separated): The script now splits the text input by commas (e.g., skin, armor, sword). It queries SAM3 for each word individually and merges the results into a single, unified mask. This is extremely useful for creating a global subject mask (to apply background blur/Orton effects in Darktable).
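As a rough sketch of the multi-prompt flow: split the text on commas, run one query per term, and combine the per-term masks. Here segment_one is a hypothetical callable standing in for the actual SAM3 query, and combining the masks with a pixel-wise OR (union) is an assumption about what "merging" means in the script.

```python
from typing import Callable, Iterable, List, Optional

import numpy as np


def split_prompts(text: str) -> List[str]:
    """Split 'skin, armor, sword' into clean terms, dropping empty entries."""
    return [part.strip() for part in text.split(",") if part.strip()]


def merge_masks(masks: Iterable[np.ndarray]) -> Optional[np.ndarray]:
    """Union of all masks as a boolean array, or None if there are no masks."""
    merged = None
    for mask in masks:
        current = np.asarray(mask).astype(bool)
        merged = current if merged is None else np.logical_or(merged, current)
    return merged


def segment_prompts(text: str, segment_one: Callable[[str], np.ndarray]) -> Optional[np.ndarray]:
    """Query the model once per term and return the merged mask."""
    return merge_masks(segment_one(term) for term in split_prompts(text))
```

A None result means no usable term was entered, which the caller can treat the same way as a cancelled run.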

Global QoL / File Management (Both scripts)

Custom Naming via Tkinter: Upon validation, a native Tkinter pop-up asks the user for a custom tag (e.g., face, sword).

Smart Filenames: If a tag is provided, the filename is simplified and readable (e.g., image_face_193951_mask.pfm). If left blank, it defaults to the original full datetime format.
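A possible shape for the naming logic is sketched below. Only the tagged pattern (image_face_193951_mask.pfm) comes from the description above; reading 193951 as an HHMMSS time, the date-plus-time fallback format, and the tag sanitizing are assumptions, and build_mask_filename is a hypothetical name.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional


def build_mask_filename(image_path: str, tag: str = "", now: Optional[datetime] = None) -> str:
    """Build the .pfm mask filename for an image, with or without a custom tag."""
    now = now or datetime.now()
    stem = Path(image_path).stem
    # Replace anything that is awkward in a filename with underscores (assumption).
    clean_tag = re.sub(r"[^A-Za-z0-9_-]+", "_", tag.strip()).strip("_")
    if clean_tag:
        # Short, readable form: <image>_<tag>_<HHMMSS>_mask.pfm
        return f"{stem}_{clean_tag}_{now:%H%M%S}_mask.pfm"
    # Fallback with a full date and time; the original format is assumed here.
    return f"{stem}_{now:%Y%m%d_%H%M%S}_mask.pfm"
```

Passing now explicitly keeps the function deterministic, which helps when several masks are saved for the same image in one session.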

Technical note: Added a cv2.waitKey(1) call before the Tkinter pop-up as a workaround for a known Windows issue. It gives OpenCV a chance to process its pending window events, which prevents the UI from freezing when the pop-up opens.
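The ordering described in the note can be sketched as follows. This is an illustration, not the code from the commit: ask_tag and normalize_tag are hypothetical names, and the dialog title and prompt text are made up. simpledialog.askstring returns None when the dialog is cancelled, so the sketch maps that case to an empty tag.

```python
from typing import Optional


def normalize_tag(raw: Optional[str]) -> str:
    """Turn the dialog result into a tag; None (dialog cancelled) becomes ''."""
    return (raw or "").strip()


def ask_tag(prompt: str = "Optional tag for the mask file:") -> str:
    """Ask for a custom tag in a Tkinter dialog after letting OpenCV catch up."""
    import cv2  # GUI imports are kept local so normalize_tag works headless
    import tkinter as tk
    from tkinter import simpledialog

    cv2.waitKey(1)  # let OpenCV handle its pending window events first
    root = tk.Tk()
    root.withdraw()  # hide the empty root window; only the dialog is shown
    try:
        return normalize_tag(simpledialog.askstring("Mask name", prompt, parent=root))
    finally:
        root.destroy()
```

An empty return value means the user left the field blank or cancelled, so the caller can fall back to the default filename.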

I developed and tested all of this on Windows. Since it only relies on the standard-library tkinter module and cv2, it should be relatively cross-platform, but let me know if it requires any adjustments for Mac/Linux!

Thanks again for the amazing base code!

Optimisation scripts text and point

cf829a9

psewdgb wants to merge 1 commit into AyedaOk:main from psewdgb:main