How to get convex masks from sam2 predictor using bounding box prompts? #455

Open
ovalerio opened this issue Nov 19, 2024 · 1 comment

@ovalerio

Hello SAM2 Team,

Thank you for making SAM2 available. It is an amazing piece of software. I am currently using the model to track a worm head, and SAM2 is helping me seed the masks for my custom segmentation model. Unfortunately, my images are slightly blurry, so the masks I get are not convex, and I need convex masks to train a custom network. I think sharing an image explains it better.

image

Do you have any suggestions on getting convex binary masks from SAM2 that I can use for my pipeline?

Thanks again!

@heyoeyo
heyoeyo commented Nov 19, 2024

I don't know of any way to get SAM to output convex polygons directly. However, it would be fairly straightforward to do this using more conventional (i.e. non-AI) image processing. OpenCV has built-in functions that make this easy. The steps would be something like:

  1. Convert the SAM prediction to a binary mask (in numpy)
  2. Use cv2.findContours to get polygons from the mask
  3. Use cv2.convexHull to generate a convex hull from each polygon
  4. Use cv2.fillConvexPoly to draw convex hulls onto a blank image to produce the final mask

Based on the code snippet you posted, this might look like:

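import cv2  # Requires opencv to be installed!
import numpy as np

# Threshold the logits at 0 and convert to a uint8 numpy mask (0 or 255).
# squeeze() drops singleton dimensions, in case the mask has shape 1xHxW.
mask_uint8 = ((out_mask_logits[0] > 0.0).byte() * 255).cpu().numpy().squeeze()

# Outer contours only (return signature as in OpenCV 4.x)
contours, _ = cv2.findContours(mask_uint8, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Draw the filled convex hull of each contour onto a blank image
final_mask_uint8 = np.zeros_like(mask_uint8)
for c in contours:
    hull = cv2.convexHull(c)
    cv2.fillConvexPoly(final_mask_uint8, hull, 255)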

This assumes out_mask_logits[0] is a single-channel mask (i.e. has shape HxW). If it has multiple channels (i.e. the multi-mask predictions), you may need to process each mask separately, since the OpenCV functions probably won't handle a multi-channel mask properly.
