Fine grained SAM #9

Open
dlangenk opened this issue Aug 1, 2023 · 6 comments

@dlangenk
Member

dlangenk commented Aug 1, 2023

Big images and small objects don't mix well. SAM uses an input size of 1024x1024 pixels, so a 6000x8000 px image is downscaled by a factor of almost 8 and small objects shrink to just a few pixels, making them hard to detect. A way around this is to compute an embedding for only part of the image. However, I think this would call for computing the embedding on the client side on demand, or we need to think about an embedding computation service that is faster.
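For illustration, a rough, untested sketch of the cropped-embedding idea with the segment-anything package (the checkpoint path and crop size are placeholders):

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Assumption: a ViT-H checkpoint downloaded from the segment-anything repo.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

def crop_embedding(image: np.ndarray, x: int, y: int, size: int = 1024):
    """Embed a size x size crop centered on (x, y) instead of the full image."""
    h, w = image.shape[:2]
    # Clamp the crop window to the image bounds (falls back to the whole
    # image if it is smaller than the crop size).
    x0 = max(0, min(x - size // 2, w - size))
    y0 = max(0, min(y - size // 2, h - size))
    crop = image[y0:y0 + size, x0:x0 + size]
    predictor.set_image(crop)  # runs the heavy ViT image encoder
    return predictor.get_image_embedding(), (x0, y0)
```

The returned offset would be needed to translate point prompts from image coordinates into crop coordinates.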

dlangenk added the enhancement label Aug 1, 2023
@dlangenk
Member Author

dlangenk commented Aug 5, 2023

In the torchserve branch there is already an implementation for getting an embedding of a crop around a point (x, y).

@mzur
Member

mzur commented Aug 7, 2023

I'm very reluctant to introduce an architectural change (i.e. a torchserve service) for this. BIIGLE/Laravel is designed to use queued jobs for this kind of task, and these also scale much better (e.g. to multiple GPU machines). Cropped embeddings could also be implemented with a queued job and storage of the embedding file, but since the embedding can (probably) only be used a single time, that would be a waste of storage space. This needs more thinking.
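For the sake of argument, the queued-job variant would look roughly like this. Our real implementation would be a Laravel queued job, so this Python/RQ version is only meant to show the shape; it reuses the hypothetical crop_embedding helper from the sketch above:

```python
import numpy as np
from PIL import Image
from redis import Redis
from rq import Queue

def compute_and_store(image_path: str, x: int, y: int, out_path: str):
    """Worker-side job: compute the cropped embedding and persist it once."""
    image = np.array(Image.open(image_path).convert("RGB"))
    embedding, _ = crop_embedding(image, x, y)  # helper from the sketch above
    # This stored file is the wasted storage space if it is used only once.
    np.save(out_path, embedding.cpu().numpy())

# Note: RQ workers import the job function, so it must live in a module.
queue = Queue(connection=Redis())
job = queue.enqueue(compute_and_store, "image.jpg", 3000, 4000, "emb_123.npy")
```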

@dlangenk
Member Author

dlangenk commented Aug 7, 2023

The problem is that you usually do not want to wait more than about a second for a cropped embedding. This is only possible if

  • we precompute the embeddings -> a waste of storage, and there is a very large number of possible embeddings
  • or we have a service running all the time that already has the model loaded (whatever that service is: Flask server, torchserve, NVIDIA Triton, ...), as sketched below
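A rough, untested sketch of the second option; the endpoint and parameter names are made up, and it reuses the hypothetical crop_embedding helper from above, which keeps the model loaded in the process:

```python
import io
import numpy as np
from flask import Flask, request, send_file
from PIL import Image

app = Flask(__name__)
# crop_embedding (and the SamPredictor behind it) is created once at import
# time, so each request only pays for the encoder forward pass.

@app.post("/embedding")
def embedding():
    x, y = int(request.args["x"]), int(request.args["y"])
    image = np.array(Image.open(request.files["image"].stream).convert("RGB"))
    emb, _ = crop_embedding(image, x, y)
    buf = io.BytesIO()
    np.save(buf, emb.cpu().numpy())
    buf.seek(0)
    return send_file(buf, mimetype="application/octet-stream")
```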

Do we really need to scale to multiple GPU machines for an inference job that usually takes no more than a few seconds? Other people must have this issue too, so we can probably look for a solution there.

@mzur
Member

mzur commented Aug 7, 2023

I'm not against it per se. It just raises the complexity of the issue from "I might implement this if I have half a day of free time" to "I might think about it a little more if I have half a day of free time" 😉

I want to avoid a solution that is too specific. For example, now we have "slow" GPU workers and "fast" GPU workers. The slow ones are used by MAIA and the fast ones by SAM. These could also be used by any other module that needs GPU processing. A potential torchserve service should also be generic enough that it is not limited to SAM but could also run other stuff. Otherwise, we need one GPU for each new algorithm that we want to support.

mzur added the discuss label and removed the enhancement label Aug 7, 2023
@dlangenk
Member Author

dlangenk commented Aug 9, 2023

The fine-grained SAM could also make SAM available for Mosaics.

mzur added the MI3 label Oct 18, 2023
@mzur
Member

mzur commented Mar 20, 2024

While this would only be a workaround and no solution for tiled images, FeatUp could (maybe) improve the segmentation resolution without requiring many changes to the existing code.
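If FeatUp works the way its README suggests, it would slot in with only a few lines; the torch.hub entry point and backbone name here are from memory and would need to be verified:

```python
import torch

# Assumed hub path and backbone name from the FeatUp README; verify before use.
device = "cuda" if torch.cuda.is_available() else "cpu"
upsampler = torch.hub.load("mhamilton723/FeatUp", "dino16", use_norm=True).to(device)
image = torch.rand(1, 3, 224, 224, device=device)  # placeholder input batch
hr_feats = upsampler(image)        # upsampled high-resolution features
lr_feats = upsampler.model(image)  # original low-resolution backbone features
```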

mzur moved this to Medium Priority in BIIGLE Roadmap Jun 11, 2024
mzur mentioned this issue Jun 11, 2024
mzur removed the MI3 label Jan 29, 2025