Issues with Cellpose on macOS using the MPS Backend #1063

Vijayishwerj · 2024-11-23T17:54:36Z

Hello Cellpose Team,

I am experiencing several issues while using Cellpose for segmentation and training models on my Mac Studio M2 Ultra. I would appreciate your guidance to resolve these problems. Below are the details:

Environment Information

•	Operating System: macOS
•	Device: Mac Studio M2 Ultra
•	Python Version: 3.11.10
•	Cellpose Version: 3.1.0
•	Torch Version: 2.5.1
•	Backend: MPS (Metal Performance Shaders)

Issues Faced

1.	GPU Incompatibility with Sparse Tensor Operations:
•	Error: NotImplementedError: Could not run 'aten::_sparse_coo_tensor_with_dims_and_tensors' with arguments from the 'SparseMPS' backend....
•	This suggests that the MPS backend lacks support for sparse tensor operations, leading to failures during segmentation.
2.	Fallback to CPU:
•	When the MPS backend fails, computation falls back to the CPU.
•	However, the CPU path is significantly slower, and Cellpose warns that MKL optimizations are missing: WARNING: MKL version on torch not working/installed - CPU version will be slightly slower.
3.	GPU Training Issues:
•	The latest version of Cellpose mandates GPU use for training. However, the MPS backend does not complete tasks, and training without a GPU seems impossible.
4.	Performance Bottleneck:
•	The fallback to CPU leads to extremely long training times, making it impractical for large datasets.

Run Logs

Attached below is the terminal output with verbose mode enabled, showing the errors and relevant information:
• MPS backend available: torch.backends.mps.is_available() returns True.
• The operation fails during sparse tensor computation (aten::_sparse_coo_tensor_with_dims_and_tensors).

NotImplementedError: Could not run 'aten::_sparse_coo_tensor_with_dims_and_tensors' with arguments from the 'SparseMPS' backend.

Steps Taken

1.	Verified that the MPS backend is correctly installed and available.
2.	Updated to the latest versions of Cellpose, Python, and PyTorch.
3.	Attempted fallback to the CPU but encountered performance issues due to lack of MKL support.
4.	Explored replacing sparse operations with dense ones but encountered compatibility constraints.

Request for Assistance

1.	GPU Support:
•	Are there plans to improve sparse tensor compatibility for the MPS backend in future versions?

sophiamaedler · 2024-12-05T10:46:12Z

I've looked into this a little and there seem to be two breaking changes that prevent running cellpose >= 3.1 on an MPS backend:

1.	PyTorch currently has no support for sparse operations on MPS (see MPS Sparse Support, pytorch/pytorch#129842).
2.	Apple GPUs are single-precision only, so they do not, and probably never will, support torch.double/torch.float64 operations.

Regarding 1:
I hope that PyTorch will eventually release MPS support for sparse operations. Until then, I am not sure how much work it would be to implement a workaround for MPS, or whether the solution would be to limit MPS use to Cellpose 3.0.
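One possible shape for such a workaround, sketched under assumptions: probe whether MPS can build a tiny sparse COO tensor and fall back to the CPU if it cannot. The helper name choose_device is hypothetical and not part of Cellpose, and the guarded import only keeps the sketch runnable where PyTorch is not installed.

```python
try:
    import torch
except ImportError:  # keeps the sketch importable without PyTorch
    torch = None


def choose_device(prefer_mps: bool = True) -> str:
    """Hypothetical helper (not a Cellpose API).

    Returns "mps" only if a small sparse COO tensor can be created there,
    otherwise "cpu".
    """
    if torch is None or not prefer_mps:
        return "cpu"
    if not torch.backends.mps.is_available():
        return "cpu"
    try:
        indices = torch.tensor([[0, 1], [0, 1]], device="mps")
        values = torch.tensor([1.0, 2.0], device="mps")
        # Assumed to exercise the sparse COO constructor named in the
        # NotImplementedError reported in this issue.
        torch.sparse_coo_tensor(indices, values, (2, 2))
    except (NotImplementedError, RuntimeError):
        return "cpu"
    return "mps"


print(choose_device())
```

A probe like this would let newer PyTorch releases use MPS automatically once sparse support lands, without pinning to a Cellpose version.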

Regarding 2:
I found a few occurrences of torch.float64 and torch.double:
torch.double: cellpose/dynamics.py
torch.float64: cellpose/dynamics.py

I guess it would be fairly straightforward to add checks here for the MPS backend and ensure that at most torch.float32 is used. I am not sure whether this would have any impact on the generated results, though. Maybe @carsen-stringer could comment on whether replacing occurrences of torch.double/torch.float64 would have any negative consequences. If not, I'd be happy to make a PR.
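A minimal sketch of such a check, assuming a hypothetical helper dtype_name_for that is not part of Cellpose. It works on dtype names so that it runs without PyTorch; in real code the returned name could be resolved with getattr(torch, name).

```python
def dtype_name_for(device_type: str, requested: str = "float64") -> str:
    """Hypothetical helper: downgrade double precision to float32 on MPS."""
    if device_type == "mps" and requested in ("float64", "double"):
        return "float32"
    return requested


for dev in ("cpu", "cuda", "mps"):
    print(dev, dtype_name_for(dev))

# With PyTorch installed, the name could be used like this:
#   dtype = getattr(torch, dtype_name_for(device.type))
#   torch.zeros(3, dtype=dtype, device=device)
```

The sketch only shows where a check could sit. Whether float32 changes the computed results remains the open question.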

In addition, the switch from
mu /= (1e-20 + (mu**2).sum(axis=0)**0.5)
to
mu /= (1e-60 + (mu**2).sum(axis=0)**0.5)
results in RuntimeWarnings on macOS:

"RuntimeWarning: invalid value encountered in divide
mu /= (1e-60 + (mu**2).sum(axis=0)**0.5)"

My guess is that this is a direct result of processes on an MPS backend using float32 rather than float64, so that 1e-60 underflows to zero. Was there a concrete rationale for this switch? What would be the effect of leaving it at 1e-20 versus ignoring the warning?
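A small standard-library sketch of this guess, assuming mu is stored as float32 so that the epsilon is rounded to float32 before the division. The helper to_float32 is hypothetical and exists only for illustration.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


norm = to_float32(0.0)  # norm of an all-zero flow vector

for eps in (1e-20, 1e-60):
    # Denominator as it would be evaluated in single precision.
    denom = to_float32(to_float32(eps) + norm)
    print(eps, "->", denom)
```

The smallest positive float32 value is about 1.4e-45, so 1e-60 rounds to zero while 1e-20 does not. With a zero denominator, a zero numerator gives 0/0, which NumPy reports as "invalid value encountered in divide".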

OratHelm · 2024-12-06T12:42:39Z

I think this problem has already been mentioned here: #1034.
A quick workaround for single-precision has been added in version 3.0.11 of Cellpose, which allows compatibility with the MPS backend. But it can certainly be improved!
Sparse operations have been added in subsequent versions of Cellpose. While waiting for a fix to avoid using them with Apple GPUs or for pytorch to implement these functions for MPS, you can go back to version 3.0.11: pip install git+https://github.com/mouseland/[email protected]

Vijayishwerj added the install install help label Nov 23, 2024
