Commit
update to v2.0
braysia committed Mar 22, 2019
2 parents 3d12a2f + 4cfa844 commit 6834cda
Showing 43 changed files with 1,310 additions and 126 deletions.
4 changes: 4 additions & 0 deletions Dockerfile
@@ -13,6 +13,10 @@ RUN conda install matplotlib==1.5.1 \
pypng==0.0.18 mahotas==1.4.1 opencv-python==3.2.0.7 \
git+https://github.com/jfrelinger/cython-munkres-wrapper \
jupyter
RUN pip install numba notebook==5.4.1
RUN pip install fast-histogram
RUN pip install keras==2.0.0 tensorflow==1.8.0


EXPOSE 8888
WORKDIR /home
55 changes: 39 additions & 16 deletions README.md
@@ -1,9 +1,11 @@
# CellTK

Live-cell analysis toolkit.
For the active development version with more functionality, please visit [here](https://github.com/braysia/CellTK).
Live-cell analysis toolkit.
#### For the active development version, please visit [here](https://github.com/braysia/CellTK).
#### v0.4 is used in [Nature Protocols](https://www.ncbi.nlm.nih.gov/pubmed/29266096) paper.

Image processing is simply an image conversion/transformation process.

Image processing is simply an image conversion/transformation process.
CellTK has the following five major processes which all implement conversion between img and labels.

1. preprocessing: img -> img
@@ -12,22 +14,21 @@ CellTK has the following five major processes which all implement conversion bet
4. tracking: labels -> labels*
5. postprocessing: labels -> labels*

where
where
- img: np.ndarray[np.float32] (e.g. a raw image from a microscope)
- labels: np.ndarray[np.int16] (e.g. nuclear objects)
- labels: np.ndarray[np.int16] (e.g. nuclear objects)
\* tracked objects have consistent values over frames

For each process, you can find a module named ___\*\_operation.py___ (e.g. _celltk/preprocess_operations.py_).

For each process, you can find a module named ___\*\_operation.py___ (e.g. _celltk/preprocess_operations.py_).

These files are the "repositories" of functions.
These files are the "repositories" of functions.
They simply contain a list of functions that take an input and convert images. If you need a new function, simply add it here.
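
As a rough illustration of what such a function might look like, here is a minimal sketch of two operations that follow the img -> img and img -> labels conventions listed above. The function names, the `OFFSET` and `THRES` parameters, and the use of `scipy.ndimage.label` are assumptions made for illustration; they are not taken from the CellTK modules.

```python
import numpy as np
from scipy import ndimage


def subtract_offset(img, OFFSET=100.0):
    """Hypothetical preprocessing operation: img -> img."""
    out = img.astype(np.float32) - OFFSET
    out[out < 0] = 0  # clip negative intensities to zero
    return out


def threshold_segment(img, THRES=500.0):
    """Hypothetical segmentation operation: img -> labels."""
    # Connected pixels above THRES receive the same positive integer label.
    labels, _ = ndimage.label(img > THRES)
    return labels.astype(np.int16)


img = np.zeros((6, 6), np.float32)
img[1:3, 1:3] = 1000  # first bright object
img[4:6, 4:6] = 1000  # second bright object
labels = threshold_segment(subtract_offset(img))
```

Here `labels` is an int16 array in which each of the two bright squares carries its own label and the background is 0.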


Raw input images can be TIFF or PNG files of various data types.

### Command line Example:
The simplest way to apply a function is to use ___command.py___.
The simplest way to apply a function is to use ___command.py___.
This option is convenient to play with functions and parameters.


@@ -36,8 +37,8 @@ python celltk/command.py -i data/testimages0/CFP/img* -f constant_thres -p THRES
python celltk/command.py -i data/testimages0/CFP/img* -l output/c1/img* -f run_lap track_neck_cut -o output/nuc
```

___-i___ for images path, ___-l___ for labels path, ___-o___ for an output directory, ___-f___ for a function name from ___*operation.py___ modules, ___-p___ for arguments to the function.
___-i___ for images path, ___-l___ for labels path, ___-o___ for an output directory, ___-f___ for a function name from ___*operation.py___ modules, ___-p___ for arguments to the function.

Note that time-lapse files must be named so that their file names sort in time order.


@@ -60,14 +61,14 @@ This configuration file contains operations defined like this:
You can find how to set up a configuration file [here](doc/CONFIGURE_YML.md).

### Apply to extract single-cell properties
After segmenting and tracking cells, we want to extract single-cell properties as a table.
After segmenting and tracking cells, we want to extract single-cell properties as a table.

Unlike the five major processes, ___celltk/apply.py___ produces __csv__ and __npz__ files as output.

```
python celltk/apply.py -i data/testimages0/CFP/img* -l output/nuc/img* -o output/array.npz
```
By default, it will use a folder name as a table key.
By default, it will use a folder name as a table key.
To specify table keys, use ___-p___ and ___-s___ in a command line.
```
python celltk/apply.py -i data/testimages0/YFP/img* -l output/nuc/img* -o output/array.npz -p nuc -s YFP
@@ -92,7 +93,7 @@ Or use ___obj\_names___ and ___ch\_names___ in a caller.
```


The output can be loaded with LabeledArray class.
The output can be loaded with LabeledArray class.
e.g.
```
python -c "from celltk.labeledarray import LabeledArray;arr = LabeledArray().load('output/array.npz');print arr.labels;print arr['CFP', 'nuc', 'x']"
@@ -101,7 +102,29 @@ python -c "from celltk.labeledarray import LabeledArray;arr = LabeledArray().loa
For visualization and manipulation of these arrays, I recommend taking a look at [covertrace](https://github.com/braysia/covertrace).


## Running Docker Container
## Install dependencies


If you do not need a dev version, simply
```
pip install celltk
```
This will register the `celltk` command, to which you can pass an input file, e.g. `celltk input_file/input_tests1.yml`.
___________________________

It is compatible with [poetry](https://github.com/sdispater/poetry).
```
git clone https://github.com/braysia/CellTK.git && cd CellTK
pip install poetry
poetry install
```
Installing this additional package may speed up computation.
```
pip install git+https://github.com/jfrelinger/cython-munkres-wrapper
```
_________

The other option is to use Docker container.
```
docker pull braysia/celltk
docker run -it -v /$FOLDER_TO_MOUNT:/home/ braysia/celltk
2 changes: 1 addition & 1 deletion setup.py → _setup.py
@@ -2,7 +2,7 @@

setup(
name="celltk",
version="0.4",
version="2.0",
packages=find_packages(),
author='Takamasa Kudo',
author_email='[email protected]',
58 changes: 39 additions & 19 deletions celltk/apply.py
@@ -22,27 +22,33 @@
from scipy.ndimage.morphology import binary_fill_holes
from scipy.ndimage.morphology import binary_dilation

import warnings
warnings.filterwarnings('ignore', category=np.VisibleDeprecationWarning)
warnings.filterwarnings('ignore', category=pd.io.pytables.PerformanceWarning)

logger = logging.getLogger(__name__)


PROP_SAVE = ['area', 'cell_id', 'convex_area', 'cv_intensity',
'eccentricity', 'major_axis_length', 'minor_axis_length', 'max_intensity',
'mean_intensity', 'median_intensity', 'min_intensity', 'orientation',
'perimeter', 'solidity', 'std_intensity', 'total_intensity', 'x', 'y', 'parent', 'num_seg']
MAX_NUMCELL = 100000


def find_all_children(labels):

mask = binary_fill_holes(labels < 0)
mask[labels < 0] = False
return np.unique(labels[mask]).tolist()
clabelnums = np.unique(labels[mask]).tolist()
if 0 in clabelnums:
clabelnums.remove(0)
return clabelnums


def find_parent_label(labels, child_label):
mask = binary_dilation(labels == child_label)
mask[labels == child_label] = False
assert len(np.unique(labels[mask])) == 1
return labels[mask][0]
return max(set(labels[mask].tolist()), key=labels[mask].tolist().count)


def add_parent(cells, labels):
@@ -55,9 +61,6 @@ def add_parent(cells, labels):
return cells



# def add_parent_id(labels, img, cells):
# return cells
def apply():
pass
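
The `find_parent_label` change above replaces the assertion that a child touches exactly one label with a pick of the most frequent label in the ring of pixels around the child. A minimal sketch of that idea using `np.unique(..., return_counts=True)` follows; the function name `most_common_neighbor` is hypothetical and this is an alternative formulation, not the code in the commit. On ties it returns the smallest label value, which may differ from the tie-breaking of the `max(set(...), key=...)` version.

```python
import numpy as np
from scipy.ndimage import binary_dilation


def most_common_neighbor(labels, child_label):
    """Return the label that occurs most often in the ring around child_label."""
    ring = binary_dilation(labels == child_label)
    ring[labels == child_label] = False  # keep only the surrounding pixels
    values, counts = np.unique(labels[ring], return_counts=True)
    return values[np.argmax(counts)]


labels = np.full((5, 5), 7, dtype=np.int16)
labels[2, 2] = 1   # the child object
labels[1, 2] = 3   # a single stray neighbouring pixel
parent = most_common_neighbor(labels, 1)
```

With the default cross-shaped structuring element the ring holds four pixels, three labelled 7 and one labelled 3, so `parent` is 7 where the old assertion would have failed.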

@@ -85,15 +88,22 @@ def df2larr(df):
return larr


def multi_index(cells, obj_name, ch_name):
frames = np.unique([i.frame for i in cells])
index = pd.MultiIndex.from_product([obj_name, ch_name, PROP_SAVE, frames], names=['object', 'ch', 'prop', 'frame'])
column_idx = pd.MultiIndex.from_product([np.unique([i.cell_id for i in cells])])
df = pd.DataFrame(index=index, columns=column_idx, dtype=np.float32)
for cell in cells:
for k in PROP_SAVE:
df[cell.cell_id].loc[obj_name, ch_name, k, cell.frame] = np.float32(getattr(cell, k))
return df
def _cells2array(cells):
arr = np.zeros((len(cells), len(PROP_SAVE)), np.float32)
for cnum, cell in enumerate(cells):
arr[cnum, :] = [getattr(cell, k) for k in PROP_SAVE]
return arr


# def multi_index(cells, obj_name, ch_name):
# frames = np.unique([i.frame for i in cells])
# index = pd.MultiIndex.from_product([obj_name, ch_name, PROP_SAVE, frames], names=['object', 'ch', 'prop', 'frame'])
# column_idx = pd.MultiIndex.from_product([np.unique([i.cell_id for i in cells])])
# df = pd.DataFrame(index=index, columns=column_idx, dtype=np.float32)
# for cell in cells:
# for k in PROP_SAVE:
# df[cell.cell_id].loc[obj_name, ch_name, k, cell.frame] = np.float32(getattr(cell, k))
# return df


def caller(inputs_list, inputs_labels_list, output, primary, secondary):
@@ -105,21 +115,31 @@ def caller(inputs_list, inputs_labels_list, output, primary, secondary):
obj_names = [basename(dirname(i[0])) for i in inputs_labels_list] if primary is None else primary
ch_names = [basename(dirname(i[0])) for i in inputs_list] if secondary is None else secondary

store = []
for inputs, ch in zip(inputs_list, ch_names):
for inputs_labels, obj in zip(inputs_labels_list, obj_names):
logger.info("Channel {0}: {1} applied...".format(ch, obj))
arr = np.ones((MAX_NUMCELL, len(PROP_SAVE), len(inputs)), np.float32) * np.nan
for frame, (path, pathl) in enumerate(zip(inputs, inputs_labels)):
img, labels = imread(path), lbread(pathl, nonneg=False)
cells = regionprops(labels, img)
if (labels < 0).any():
cells = add_parent(cells, labels)
[setattr(cell, 'frame', frame) for cell in cells]
cells = [Cell(cell) for cell in cells]
store.append(cells)
tarr = _cells2array(cells)
index = tarr[:, 1].astype(np.int32)
arr[index, :, frame] = tarr

logger.info("\tmaking dataframe...")
df = multi_index([i for ii in store for i in ii], obj, ch)
cellids = np.where(~np.isnan(arr[:, 0, :]).all(axis=1))[0]
marr = np.zeros((len(cellids), arr.shape[1], arr.shape[2]))
for pn, i in enumerate(cellids):
marr[pn] = arr[i]
sarr = np.swapaxes(marr, 0, 2)
narr = sarr.reshape((sarr.shape[0]*sarr.shape[1], sarr.shape[2]), order='F')
index = pd.MultiIndex.from_product([obj, ch, PROP_SAVE, range(arr.shape[-1])], names=['object', 'ch', 'prop', 'frame'])
df = pd.DataFrame(narr, index=index, columns=cellids)

if exists(join(output, 'df.csv')):
ex_df = pd.read_csv(join(output, 'df.csv'), index_col=['object', 'ch', 'prop', 'frame'])
ex_df.columns = pd.to_numeric(ex_df.columns)
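
The new `caller` body builds the table by reshaping a dense (cell, property, frame) array instead of filling a DataFrame cell by cell. The sketch below checks, on a small synthetic array, that the `swapaxes` plus Fortran-order `reshape` yields rows in the same order as `pd.MultiIndex.from_product` (property outermost, frame fastest). The property names, the `'nuc'`/`'CFP'` keys, and the array sizes are placeholders chosen for illustration, and the code is written for Python 3 although the repository targets Python 2.

```python
import numpy as np
import pandas as pd

props = ['area', 'x']  # stand-in for PROP_SAVE
n_cells, n_frames = 3, 4

# marr[cell, prop, frame]; each entry encodes its own indices as 100*c + 10*p + f
marr = np.zeros((n_cells, len(props), n_frames))
for c in range(n_cells):
    for p in range(len(props)):
        for f in range(n_frames):
            marr[c, p, f] = 100 * c + 10 * p + f

sarr = np.swapaxes(marr, 0, 2)  # -> (frame, prop, cell)
narr = sarr.reshape((sarr.shape[0] * sarr.shape[1], sarr.shape[2]), order='F')

index = pd.MultiIndex.from_product(
    [['nuc'], ['CFP'], props, range(n_frames)],
    names=['object', 'ch', 'prop', 'frame'])
df = pd.DataFrame(narr, index=index, columns=range(n_cells))

# Property 'x' (p=1) at frame 2 for cell 1 should be 100*1 + 10*1 + 2
value = df.loc[('nuc', 'CFP', 'x', 2), 1]

# An equivalent C-order formulation of the same reshape
alt = marr.transpose(1, 2, 0).reshape(-1, n_cells)
```

The last line shows that `marr.transpose(1, 2, 0).reshape(-1, n_cells)` produces the same matrix, which some readers may find easier to follow than the Fortran-order reshape.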
51 changes: 49 additions & 2 deletions celltk/caller.py
@@ -9,8 +9,37 @@
from utils.file_io import make_dirs
import sys



logger = logging.getLogger(__name__)

import os
from os.path import join, basename, exists, dirname
import collections

def multi_call(inputs):
contents = load_yaml(inputs)
pin = contents['PARENT_INPUT']
pin = pin[:-1] if pin.endswith('/') or pin.endswith('\\') else pin
input_dirs = [join(pin, i) for i in os.listdir(pin) if os.path.isdir(join(pin, i))]
contents_list = []
for subfolder in input_dirs:
conts = eval(str(contents).replace('$INPUT', subfolder))
conts['OUTPUT_DIR'] = join(conts['OUTPUT_DIR'], basename(subfolder))
contents_list.append(conts)
return contents_list


def convert(data):
if isinstance(data, basestring):
return str(data)
elif isinstance(data, collections.Mapping):
return dict(map(convert, data.iteritems()))
elif isinstance(data, collections.Iterable):
return type(data)(map(convert, data))
else:
return data


def extract_path(path):
f = glob(path)
@@ -82,7 +111,7 @@ def run_operation(output_dir, operation):
functions, params, images, labels, output = parse_operation(operation)
inputs = prepare_path_list(images, output_dir)
logger.info(inputs)

inputs_labels = prepare_path_list(labels, output_dir)
output = join(output_dir, output) if output else output_dir
caller = _retrieve_caller_based_on_function(functions[0])
@@ -119,6 +148,19 @@ def call_operations(contents):
logging.getLogger("PIL").setLevel(logging.WARNING)
run_operations(contents['OUTPUT_DIR'], contents['operations'])
logger.info("Caller finished.")
return


def _parallel(args):
'''
Use this function if you want to multiprocess using PARENT_INPUT argument
(see input_fireworks.yml).
'''
contents_list = multi_call(args.input[0])
contents_list = [convert(i) for i in contents_list]
pool = multiprocessing.Pool(args.cores, maxtasksperchild=1)
pool.map(call_operations, contents_list, chunksize=1)
pool.close()


def parse_args():
@@ -133,7 +175,12 @@ def parse_args():
def main():
args = parse_args()
if len(args.input) == 1:
single_call(args.input[0])
contents = load_yaml(args.input[0])
if "PARENT_INPUT" in contents:
_parallel(args)
else:
call_operations(contents)
# single_call(args.input[0])
if len(args.input) > 1:
num_cores = args.cores
print str(num_cores) + ' started parallel'
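
`multi_call` above fills in the `$INPUT` placeholder by converting the loaded YAML contents to a string, replacing the token, and passing the result to `eval`. A possible alternative, sketched below under the assumption that the contents consist only of nested dicts, lists, tuples and scalars, walks the structure recursively and avoids `eval`. The helper name `substitute` and the sample configuration are hypothetical, and the sketch is written for Python 3.

```python
def substitute(data, token, value):
    """Recursively replace token with value in every string of a nested structure."""
    if isinstance(data, str):
        return data.replace(token, value)
    if isinstance(data, dict):
        return {substitute(k, token, value): substitute(v, token, value)
                for k, v in data.items()}
    if isinstance(data, (list, tuple)):
        return type(data)(substitute(item, token, value) for item in data)
    return data  # numbers, None, booleans are returned unchanged


# Hypothetical configuration, loosely shaped like a loaded YAML file
contents = {
    'OUTPUT_DIR': 'output',
    'operations': [{'images': '$INPUT/CFP/img*', 'THRES': 2000}],
}
conts = substitute(contents, '$INPUT', '/data/Pos1')
```

Unlike the string round trip, this does not break when a path contains quote characters or backslashes, and it leaves the original `contents` untouched.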
4 changes: 2 additions & 2 deletions celltk/command.py
@@ -12,7 +12,7 @@ def main():
parser.add_argument("-l", "--inputs_labels", help="images", nargs="*", default=None)
parser.add_argument("-o", "--output", help="output directory", type=str, default='temp')
parser.add_argument("-f", "--functions", help="functions", nargs="*")
parser.add_argument("-p", "--param", nargs="*", help="parameters", default=[])
parser.add_argument("-p", "--param", nargs="+", help="parameters", action='append')
args = parser.parse_args()

params = ParamParser(args.param).run()
@@ -23,7 +23,7 @@ def main():

if len(args.functions) == 1 and args.functions[0] == 'apply':
pass
# ch_names = operation['ch_names'] if 'ch_names' in operation else images
# ch_names = operation['ch_names'] if 'ch_names' in operation else images
# obj_names = operation['obj_names'] if 'obj_names' in operation else labels
# caller(zip(*inputs), zip(*args.inputs_labels), args.output, obj_names, ch_names)
elif args.inputs_labels is None:
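
The `-p` change in the first hunk of this file switches from `nargs="*"` with `default=[]` to `nargs="+"` with `action='append'`. With standard `argparse` behaviour this means each `-p` occurrence yields its own list, and the attribute is `None` when `-p` is omitted. The sketch below demonstrates only that parsing behaviour; the `KEY=VALUE` strings are placeholder values, and how `ParamParser` consumes the result is not shown in this diff.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-p", "--param", nargs="+", help="parameters", action='append')

# Two separate -p groups give a list of lists, one inner list per -p
args = parser.parse_args(["-p", "A=1", "-p", "B=2", "C=3"])
grouped = args.param

# Without any -p the attribute is None, since no default is set
omitted = parser.parse_args([]).param
```

Here `grouped` is `[['A=1'], ['B=2', 'C=3']]` and `omitted` is `None`, so downstream code has to accept `None` where it previously received an empty list.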
Empty file modified celltk/labeledarray/LICENSE
100755 → 100644
Empty file.
Empty file modified celltk/labeledarray/README.md
100755 → 100644
Empty file.
Empty file modified celltk/labeledarray/__init__.py
100755 → 100644
Empty file.
Empty file modified celltk/labeledarray/labeledarray/__init__.py
100755 → 100644
Empty file.
2 changes: 1 addition & 1 deletion celltk/labeledarray/labeledarray/labeledarray.py
100755 → 100644
@@ -83,7 +83,7 @@ def _label2idx(self, item):
if boolarr.all():
return (slice(None, None, None), ) + (slice(None, None, None), ) * (self.ndim - 1)
minidx = min(tidx) if min(tidx) > 0 else None
maxidx = max(tidx) if max(tidx) < self.shape[0] - 1 else None
maxidx = max(tidx)+1 if max(tidx)+1 < self.shape[0] else None
if boolarr.sum() > 1:
return (slice(minidx, maxidx, None), ) + (slice(None, None, None), ) * (self.ndim - 1)
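
The one-line change to `maxidx` above looks like an off-by-one fix: a Python slice excludes its stop index, so using `max(tidx)` as the stop drops the last matching row. The sketch below reproduces both expressions on a plain NumPy array; the array and the `tidx` values are made up for illustration, and the `minidx` handling is simplified.

```python
import numpy as np

arr = np.arange(10)
tidx = [2, 3, 4]  # positions that matched the requested label

# Old expression: stop at max(tidx), which excludes index 4
old_max = max(tidx) if max(tidx) < arr.shape[0] - 1 else None
# New expression: stop at max(tidx) + 1, which includes index 4
new_max = max(tidx) + 1 if max(tidx) + 1 < arr.shape[0] else None

old_rows = arr[slice(min(tidx), old_max)]
new_rows = arr[slice(min(tidx), new_max)]
```

`old_rows` contains only indices 2 and 3, while `new_rows` contains 2, 3 and 4.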

Empty file modified celltk/labeledarray/labeledarray/utils.py
100755 → 100644
Empty file.