An overhaul of the CellStitch algorithm, which was originally developed by Yining Liu and Yinuo Jin (original repository); the publication is Liu et al. (2023).
Some major adjustments:
- Replaced NumPy with CuPy for GPU-accelerated calculations.
- Replaced nested for-loops with vectorized calculations for dramatic speedups (~100x).
- Included the novel segmentation method InstanSeg, which enables multichannel inputs (Goldsborough et al., 2024).
- An all-in-one method that takes a ZCYX-formatted .tif file, performs the correct transposes, and writes stitched labels.
- Included a histogram-based bleach correction to adjust for signal degradation over the Z-axis (originally developed for ImageJ by Miura (2020) and released for Python by marx-alex in napari-bleach-correct).
- Completely rewrote the interpolation pipeline to obtain the same results with less RAM usage and at a much higher speed.
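To illustrate what replacing nested for-loops with vectorized calculations can look like, the sketch below counts the pixel overlap between every pair of labels in two adjacent slices, once with loops and once with a single np.bincount call. This is an illustrative example under assumptions, not the package's actual code: the function names and the toy arrays are hypothetical, and NumPy is used so the snippet runs without a GPU. CuPy mirrors much of the NumPy API, so a similar expression could likely be moved to the GPU by swapping the import.

```python
import numpy as np


def overlap_loops(a, b):
    """Reference version: count shared pixels per label pair with nested loops."""
    n_a, n_b = int(a.max()) + 1, int(b.max()) + 1
    counts = np.zeros((n_a, n_b), dtype=np.int64)
    for y in range(a.shape[0]):
        for x in range(a.shape[1]):
            counts[a[y, x], b[y, x]] += 1
    return counts


def overlap_vectorized(a, b):
    """Same result without Python loops: encode each label pair as one integer."""
    n_a, n_b = int(a.max()) + 1, int(b.max()) + 1
    pair_ids = a.ravel().astype(np.int64) * n_b + b.ravel().astype(np.int64)
    counts = np.bincount(pair_ids, minlength=n_a * n_b)
    return counts.reshape(n_a, n_b)


# Two tiny label images (0 = background) standing in for adjacent Z-slices.
slice_a = np.array([[0, 1, 1],
                    [0, 1, 2],
                    [2, 2, 2]])
slice_b = np.array([[0, 0, 1],
                    [3, 1, 1],
                    [3, 3, 2]])

# overlap[i, j] is the number of pixels labelled i in slice_a and j in slice_b.
overlap = overlap_vectorized(slice_a, slice_b)
```

Both functions return the same matrix; the vectorized one does the counting in compiled code, which is where speedups of this kind usually come from.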
The calculations were run on the same machine (GPU: NVIDIA Quadro RTX 6000 24 GB; CPU: Intel Xeon Gold 6252 (48/96 cores); RAM: 1024 GB), the core count of which gave it a clear parallel-processing advantage. This particularly affects the fill_holes_and_remove_small_masks function, which has been rewritten to utilize parallel processing.
In image A, GPU VRAM load was ~200 MB at its peak.
In image B, GPU VRAM load was ~2442 MB at its peak.
The revised interpolation method leverages a more efficient alternative to SciPy's binary_fill_holes, speeding up the process tremendously (>100x).
Image: 10x1024x1024 px containing 4117 stitched masks
Metric | Original | Revised |
---|---|---|
Time (s) | 3356.18 | 26.31 |
RAM (GB) | ~60 | ~16 |
This setup has so far only been verified on Windows-based, CUDA-accelerated machines. Testing has only been performed on CUDA 12.x. There is no known reason why 11.x should not work (see the instructions below), but your mileage may vary.
conda create -n cellstitch-cuda python=3.11
conda activate cellstitch-cuda
pip install cellstitch-cuda
pip uninstall torch
conda install pytorch pytorch-cuda=12.4 -c conda-forge -c pytorch -c nvidia
You may replace the version number for pytorch-cuda with whatever is applicable for you. If you use CUDA 11.x, also swap the CuPy build:
pip uninstall cupy-cuda12x
pip install cupy-cuda11x
For more detail, see examples/.
This assumes a multichannel grayscale image in the order ZCYX. Single-channel images are currently not supported, but will be in the future.
from cellstitch_cuda.pipeline import cellstitch_cuda
img = "path/to/image.tif" # ZCYX
# or feed img as a numpy ndarray
volumetric_masks = cellstitch_cuda(img)
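Stitching relies on masks from three orthogonal planes, so a ZCYX stack has to be re-sliced before 2D segmentation and the resulting masks brought back to ZYX. The sketch below shows one way to do this with plain transposes on a synthetic array. The axis orders used for the intermediate planes are assumptions chosen for illustration and may differ from what the package does internally.

```python
import numpy as np

# Synthetic ZCYX stack: 4 slices, 2 channels, 5 x 6 pixels.
z, c, y, x = 4, 2, 5, 6
img = np.arange(z * c * y * x).reshape(z, c, y, x)

# Re-slice so the leading axis iterates over the planes a 2D segmenter would see.
yx_planes = img                          # one (C, Y, X) plane per Z position
yz_planes = img.transpose(3, 1, 2, 0)    # one (C, Y, Z) plane per X position
xz_planes = img.transpose(2, 1, 3, 0)    # one (C, X, Z) plane per Y position

# Per-plane masks have no channel axis; transpose them back to ZYX.
# Zero-filled arrays stand in for real segmentation output here.
yz_masks = np.zeros((x, y, z), dtype=np.uint32).transpose(2, 1, 0)  # (X, Y, Z) -> (Z, Y, X)
xz_masks = np.zeros((y, x, z), dtype=np.uint32).transpose(2, 0, 1)  # (Y, X, Z) -> (Z, Y, X)
```

Transposes return views rather than copies, so re-slicing in this way costs no extra memory until a segmenter requires contiguous input.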
The orthogonal masks are label images over the Z-, X-, and Y-axes. They are assumed to be in the order ZYX. If you set output_masks=True in the cellstitch_cuda() function, these masks will be written to disk (either in the input folder, or in the folder set in output_path).
from cellstitch_cuda.pipeline import full_stitch
import tifffile
# Load the YX, YZ, and XZ masks
yx_masks = tifffile.imread("path/to/yx_masks.tif") # ZYX
yz_masks = tifffile.imread("path/to/yz_masks.tif") # ZYX
xz_masks = tifffile.imread("path/to/xz_masks.tif") # ZYX
volumetric_masks = full_stitch(yx_masks, yz_masks, xz_masks)
cellstitch_cuda() takes the following arguments:
- img: Either a path pointing to an existing image, or a numpy.ndarray. Must be 4D (ZCYX).
- output_masks: True to write all masks to the output path, or False to only return the final stitched mask. Default False
- output_path: Set to None to write to the input file location (if provided). Ignored if output_masks is False. N.B.: If output_masks is True while no path has been provided (e.g., by loading a numpy.ndarray directly), the output masks will be written to the folder where the script is run from. Default None
- seg_mode: InstanSeg segmentation mode: "nuclei" to only return nuclear masks, "cells" to return all the cell masks (including those without nuclei), or "nuclei_cells", which returns only cells with detected nuclei. Default "nuclei_cells"
- pixel_size: XY pixel size in microns per pixel. When set to None, will be read from img metadata if possible. Default None
- z_step: Z pixel size (z step) in microns per step. When set to None, will be read from img metadata if possible. Default None
- bleach_correct: Whether histogram-based signal degradation correction should be applied to img. Default True
- filtering: Whether the optimized fill_holes_and_remove_small_masks function should be executed. Default True
- interpolation: If set to True, the function returns a tuple of the array of stitched masks and an array with interpolated volumetric masks. CellStitch provides an interpolation method to turn anisotropic masks into pseudo-isotropic masks. The algorithm, adapted from the original codebase, has been completely rewritten for efficient parallel processing. Outputs a separate mask in the output folder if output_masks = True. Default False
- n_jobs: Set the number of threads to be used in parallel processing tasks. Use 1 for debugging. Generally, best left at the default value. Default -1
- verbose: Verbosity. Default False
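The interaction between img, output_masks, and output_path described above can be summarized as a small decision function. The helper below is hypothetical and only restates the documented rules; it is not part of the package.

```python
from pathlib import Path


def resolve_output_dir(img, output_masks=False, output_path=None):
    """Hypothetical helper mirroring the documented output_path rules."""
    if not output_masks:
        return None                  # nothing is written; output_path is ignored
    if output_path is not None:
        return Path(output_path)     # an explicit folder takes precedence
    if isinstance(img, (str, Path)):
        return Path(img).parent      # default: next to the input file
    return Path.cwd()                # array input: folder the script is run from
```

For example, resolve_output_dir("data/image.tif", output_masks=True) yields the data folder, while passing an array with output_masks=True falls back to the current working directory.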
full_stitch() takes the following arguments:
- xy_masks_prior: numpy.ndarray with XY masks, order ZYX
- yz_masks: numpy.ndarray with YZ masks, order ZYX
- xz_masks: numpy.ndarray with XZ masks, order ZYX
- nuclei: numpy.ndarray with XY masks of nuclei, order ZYX. If provided, it will run the function filter_nuclei_cells() to filter volumetric masks by the presence of a 2D nucleus mask. Default None
- filter: Use Cellpose-based fill_holes_and_remove_small_masks() function. Default True
- n_jobs: Number of threads used. Set n_jobs to 1 for debugging parallel processing tasks. Default -1
- verbose: Verbosity. Default False
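Filtering volumetric masks by the presence of a nucleus can be pictured as follows: collect every cell label that overlaps at least one nucleus pixel, then zero out all other labels. The function below is an illustrative stand-in with a hypothetical name; it makes no claim about how filter_nuclei_cells() is implemented.

```python
import numpy as np


def keep_cells_with_nuclei(volumetric_masks, nuclei):
    """Illustrative stand-in: zero out labels that never overlap a nucleus pixel."""
    labels_with_nucleus = np.unique(volumetric_masks[nuclei > 0])
    labels_with_nucleus = labels_with_nucleus[labels_with_nucleus != 0]
    keep = np.isin(volumetric_masks, labels_with_nucleus)
    return np.where(keep, volumetric_masks, 0)


# Two Z-slices of 1 x 4 pixels: cell 1 contains a nucleus, cell 2 does not.
cells = np.array([[[1, 1, 2, 2]],
                  [[1, 1, 2, 0]]])
nuclei = np.array([[[0, 5, 0, 0]],
                   [[0, 0, 0, 0]]])

filtered = keep_cells_with_nuclei(cells, nuclei)
```

In this toy case cell 1 is kept in both slices, because a nucleus in any slice is enough, and cell 2 is removed entirely.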
full_interpolate() takes the following arguments:
- masks: numpy.ndarray with stitched XY masks
- anisotropy: The ratio (or mismatch) between the Z and XY sampling rate, calculated as anisotropy = z_step/pixel_size. Default 2
- dist: The distance metric used to calculate the Optimal Transport between two masks. Default "sqeuclidean"
- n_jobs: Number of threads used. Set n_jobs to 1 for debugging parallel processing tasks. Default -1
- verbose: Verbosity. Default False
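As a worked example of the anisotropy formula above, the snippet below computes the ratio for a hypothetical acquisition with a 1.0 micron Z step and 0.5 micron XY pixels. The helper name and the example values are assumptions for illustration only.

```python
def anisotropy_ratio(z_step, pixel_size):
    """Ratio between Z and XY sampling, with both values in microns."""
    if z_step <= 0 or pixel_size <= 0:
        raise ValueError("z_step and pixel_size must be positive")
    return z_step / pixel_size


# Hypothetical acquisition: 1.0 micron between slices, 0.5 micron per XY pixel.
ratio = anisotropy_ratio(z_step=1.0, pixel_size=0.5)
# The result could then be passed on as full_interpolate(masks, anisotropy=ratio).
```

With these values the ratio equals 2, which matches the documented default.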
Goldsborough, T., O’Callaghan, A., Inglis, F., Leplat, L., Filbey, A., Bilen, H., & Bankhead, P. (2024) A novel channel invariant architecture for the segmentation of cells and nuclei in multiplexed images using InstanSeg. bioRxiv, 2024.09.04.611150. doi: 10.1101/2024.09.04.611150
Liu, Y., Jin, Y., Azizi, E., & Blumberg, A. J. (2023) Cellstitch: 3D cellular anisotropic image segmentation via optimal transport. BMC Bioinformatics, 24(480). doi: 10.1186/s12859-023-05608-2
Miura, K. (2020) Bleach correction ImageJ plugin for compensating the photobleaching of time-lapse sequences. F1000Res, 9:1494. doi: 10.12688/f1000research.27171.1
Stringer, C., Wang, T., Michaelos, M., & Pachitariu, M. (2021) Cellpose: a generalist algorithm for cellular segmentation. Nature Methods, 18(1), 100-106. doi: 10.1038/s41592-020-01018-x