SAM2Translator only returns one mask instead of the full list of candidates #3794

@garthhenning

Description

ai.djl.modality.cv.translator.SAM2Translator only returns one mask instead of the full list of candidates.

Expected Behavior

When the input argument multimask_output is set to true, the returned DetectedObjects object should contain every candidate mask, not just one. Returning only the "best" mask is the correct behavior when multimask_output is set to false.

Details

Line 160 of the SAM2Translator source code retrieves all masks, but lines 166 to 168 build each result list as a singleton holding only the best candidate, and line 170 returns those singleton lists to the user. There is no code path that returns the full set of masks.

160 NDArray masks = logits.gt(0f).squeeze(0);
161
162 float[][] dist = Mask.toMask(masks.get(best).toType(DataType.FLOAT32, true));
163 Mask mask = new Mask(0, 0, width, height, dist, true);
164 double probability = scores.getFloat(best);
165
166 List<String> classes = Collections.singletonList("");
167 List<Double> probabilities = Collections.singletonList(probability);
168 List<BoundingBox> boxes = Collections.singletonList(mask);
169
170 return new DetectedObjects(classes, probabilities, boxes);
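
To make the requested behavior concrete, here is a minimal sketch of the selection logic in plain Java. It is not the DJL implementation and not a proposed patch. `MultiMaskSketch`, `Candidate`, `select`, and `threshold` are hypothetical names, plain arrays stand in for the NDArray values, and it assumes `best` is the index of the highest score (the snippet above does not show how `best` is computed).

```java
import java.util.ArrayList;
import java.util.List;

public class MultiMaskSketch {

    /** Hypothetical stand-in for one mask plus its score; not a DJL class. */
    public static final class Candidate {
        public final boolean[][] mask;
        public final double probability;

        Candidate(boolean[][] mask, double probability) {
            this.mask = mask;
            this.probability = probability;
        }
    }

    /**
     * Returns every candidate when multimaskOutput is true, otherwise only the
     * highest-scoring one. logits is [numMasks][height][width], scores is [numMasks].
     */
    public static List<Candidate> select(
            float[][][] logits, float[] scores, boolean multimaskOutput) {
        if (logits.length != scores.length) {
            throw new IllegalArgumentException("one score is required per mask");
        }
        // Assumption: "best" means the index of the highest score.
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        List<Candidate> result = new ArrayList<>();
        for (int i = 0; i < scores.length; i++) {
            // Keep all candidates in multimask mode, only the best one otherwise.
            if (multimaskOutput || i == best) {
                result.add(new Candidate(threshold(logits[i]), scores[i]));
            }
        }
        return result;
    }

    /** Mirrors the logits.gt(0f) step: a pixel belongs to the mask when its logit is positive. */
    static boolean[][] threshold(float[][] logits) {
        boolean[][] mask = new boolean[logits.length][];
        for (int y = 0; y < logits.length; y++) {
            mask[y] = new boolean[logits[y].length];
            for (int x = 0; x < logits[y].length; x++) {
                mask[y][x] = logits[y][x] > 0f;
            }
        }
        return mask;
    }
}
```

In the translator itself, each returned candidate would presumably become one Mask, with the classes, probabilities, and boxes lists built in parallel at the same length before being passed to DetectedObjects.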
