New cyto tool: create cell locations file (#257)
* Init

* cmdline tool

* Cleanup

* use fire

* output is optional

* fix sqlite fixture

* refactor

* files are optional

* Typo

* cleanup

* drop comments

* create test fixtures

* checks

* fix paths

* Use fixtures

* yield

* download file if needed

* typo

* Update req

* use boto3 session

* Test s3 locations

* skip s3 test

* dtypes

* Add mike's snipper

* use mike's format

* tests pass

* use method

* add alternatives

* refactor mike's code

* cleanup

* better tests

* cleanup

* overwrite is an option, other cleanup

* Update pycytominer/cyto_utils/cell_locations.py

Co-authored-by: Gregory Way <[email protected]>

* add docs

* Add deps

* Move to setup

* add fire to deps

* use pathlib

* Add pip install .[cell_locations] (+formatting)

* Update docs + fix typo in actions

* Use as module

* Formatting

* add  cell_locations

* Merge in SQL

* Update README.md

Co-authored-by: Dave Bunten <[email protected]>

* Update .github/workflows/codecov.yml

Co-authored-by: Dave Bunten <[email protected]>

* Update .github/workflows/python-app.yml

Co-authored-by: Dave Bunten <[email protected]>

* Update pycytominer/tests/test_data/cell_locations_example_data/test_cell_locations.sh

Co-authored-by: Dave Bunten <[email protected]>

* Update pycytominer/tests/test_data/cell_locations_example_data/test_cell_locations.sh

Co-authored-by: Dave Bunten <[email protected]>

* Update pycytominer/tests/test_cyto_utils/test_cell_locations.py

Co-authored-by: Dave Bunten <[email protected]>

* Update pycytominer/cyto_utils/cell_locations.py

Co-authored-by: Dave Bunten <[email protected]>

* Update pycytominer/cyto_utils/cell_locations.py

Co-authored-by: Dave Bunten <[email protected]>

* Update pycytominer/tests/test_cyto_utils/test_cell_locations.py

Co-authored-by: Dave Bunten <[email protected]>

* Update pycytominer/tests/test_cyto_utils/test_cell_locations.py

Co-authored-by: Dave Bunten <[email protected]>

* Address various comment

* To address this warning below:

pycytominer/tests/test_cyto_utils/test_cell_locations.py::test_shape_and_columns[cell_loc1]
  /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas/io/sql.py:1405: RemovedIn20Warning: Deprecated API features detected! These feature(s) are not compatible with SQLAlchemy 2.0. To prevent incompatible upgrades prior to updating applications, ensure requirements files are pinned to "sqlalchemy<2.0". Set environment variable SQLALCHEMY_WARN_20=1 to show all deprecation warnings.  Set environment variable SQLALCHEMY_SILENCE_UBER_WARNING=1 to silence this message. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
    return self.connectable.execution_options().execute(*args, **kwargs)

* use sqlalchemy

* More comments and switch to boto3.client

* Be explicit about anon; fix indentation bug

* explicit types

* Address various comment

* Move gitignore entries to the top level

* rename files, add docs

* Upgrade to python 3.10

* refactor _load_single_cell

* fix type

* test on highest build version

* explain warning

* s3 is an attribute

* compact check

* Update README.md

Co-authored-by: Dave Bunten <[email protected]>

* fix typo + more docs

* black cells.py

* trim code

* black

* Skip test

* docs

* dtypes

* skip test

* Fix test

* Add docs

---------

Co-authored-by: Gregory Way <[email protected]>
Co-authored-by: Dave Bunten <[email protected]>
3 people authored Apr 7, 2023
1 parent c90438f commit a5ae6c8
Showing 15 changed files with 855 additions and 80 deletions.
47 changes: 23 additions & 24 deletions .github/workflows/codecov.yml
@@ -2,33 +2,32 @@ name: Code coverage

 on:
 push:
-branches: [ master ]
+branches: [master]
 pull_request:
-branches: [ master ]
+branches: [master]

 jobs:
 run:
 runs-on: ubuntu-latest
 steps:
-- uses: actions/checkout@master
-- name: Setup Python
-uses: actions/setup-python@master
-with:
-python-version: 3.7
-- name: Generate coverage report
-run: |
-pip install pytest
-pip install pytest-cov
-pip install -r requirements.txt
-pip install .[collate]
-pytest --cov=./ --cov-report=xml
-- name: Upload coverage to Codecov
-uses: codecov/codecov-action@v1
-with:
-file: ./coverage.xml
-files: ./coverage1.xml,./coverage2.xml
-directory: ./coverage/reports/
-flags: unittests
-name: codecov-umbrella
-fail_ci_if_error: false
-path_to_write_report: ./coverage/codecov_report.gz
+- uses: actions/checkout@master
+- name: Setup Python
+uses: actions/setup-python@master
+with:
+python-version: 3.9
+- name: Generate coverage report
+run: |
+pip install pytest
+pip install pytest-cov
+pip install .[collate,cell_locations]
+pytest --cov=./ --cov-report=xml
+- name: Upload coverage to Codecov
+uses: codecov/codecov-action@v1
+with:
+file: ./coverage.xml
+files: ./coverage1.xml,./coverage2.xml
+directory: ./coverage/reports/
+flags: unittests
+name: codecov-umbrella
+fail_ci_if_error: false
+path_to_write_report: ./coverage/codecov_report.gz
38 changes: 20 additions & 18 deletions .github/workflows/python-app.yml
@@ -5,33 +5,35 @@ name: Python build

 on:
 push:
-branches: [ master ]
+branches: [master]
 pull_request:
-branches: [ master ]
+branches: [master]

 jobs:
 build:
-
 runs-on: ${{ matrix.os }}
 strategy:
 matrix:
 python-version: [3.7, 3.8, 3.9]
 os: [ubuntu-latest, macos-latest]
 env:
-OS: ${{ matrix.os }}
+OS: ${{ matrix.os }}
+# This is needed to avoid a warning from SQLAlchemy
+# https://sqlalche.me/e/b8d9
+# We can remove this once we upgrade to SQLAlchemy >= 2.0
+SQLALCHEMY_SILENCE_UBER_WARNING: "1"
 steps:
-- uses: actions/checkout@v2
-- name: Set up Python ${{ matrix.python-version }}
-uses: actions/setup-python@v2
-with:
-python-version: ${{ matrix.python-version }}
+- uses: actions/checkout@v2
+- name: Set up Python ${{ matrix.python-version }}
+uses: actions/setup-python@v2
+with:
+python-version: ${{ matrix.python-version }}

-- name: Install dependencies
-run: |
-python -m pip install --upgrade pip
-pip install pytest
-if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
-pip install .[collate]
-- name: Test with pytest
-run: |
-pytest
+- name: Install dependencies
+run: |
+python -m pip install --upgrade pip
+pip install pytest
+pip install .[collate,cell_locations]
+- name: Test with pytest
+run: |
+pytest
1 change: 1 addition & 0 deletions .gitignore
@@ -9,4 +9,5 @@ build
*.sqlite
pycytominer/tests/test_data/collate/backend/**/*.csv
!pycytominer/tests/test_data/collate/backend/**/*master.csv
!pycytominer/tests/test_data/cell_locations_example_data/*.sqlite

40 changes: 39 additions & 1 deletion README.md
@@ -45,11 +45,12 @@ Since the project is actively being developed, with new features added regularly
# Example:
pip install git+git://github.com/cytomining/pycytominer@2aa8638d7e505ab510f1d5282098dd59bb2cb470
```

### CSV collation

If you process your images on a cluster and do not have MySQL or a similar large database set up, you will likely end up with many folders from the different cluster runs (often one per well or one per site), each containing an `Image.csv`, `Nuclei.csv`, etc.
To look at full plates, we therefore first need to collate all of these CSVs into a single file (currently SQLite) per plate.
We currently do this with a library called [cytominer-database](https://github.com/cytomining/cytominer-database).
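
To make the idea of collation concrete, here is a minimal sketch using only the Python standard library. It is not how cytominer-database is implemented; the helper name `collate_csvs`, the one-folder-per-site layout, and the extra `source_folder` column are assumptions made purely for illustration. It also assumes every CSV is non-empty and shares the same header.

```python
import csv
import sqlite3
from pathlib import Path


def collate_csvs(run_root: Path, sqlite_path: Path, table: str = "Nuclei") -> int:
    """Append every <run_root>/<folder>/<table>.csv into one SQLite table.

    Hypothetical helper for illustration only. Returns the number of rows written.
    """
    n_rows = 0
    con = sqlite3.connect(str(sqlite_path))
    try:
        for csv_path in sorted(run_root.glob(f"*/{table}.csv")):
            with csv_path.open(newline="") as handle:
                reader = csv.reader(handle)
                header = next(reader)  # first row holds the column names
                columns = ", ".join(f'"{name}"' for name in header)
                # Create the table on first use; record which folder each row came from.
                con.execute(
                    f'CREATE TABLE IF NOT EXISTS "{table}" ("source_folder", {columns})'
                )
                placeholders = ", ".join("?" for _ in range(len(header) + 1))
                rows = [[csv_path.parent.name, *row] for row in reader]
                con.executemany(
                    f'INSERT INTO "{table}" VALUES ({placeholders})', rows
                )
                n_rows += len(rows)
        con.commit()
    finally:
        con.close()
    return n_rows
```

Values are stored as text here because `csv.reader` yields strings; a real collation tool would also handle types, multiple tables, and schema checks.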

If you want to perform this data collation inside pycytominer using the `cyto_utils` function `collate` (and/or you want to be able to run the tests and have them all pass!), you will need `cytominer-database==0.3.4`; this will change your installation commands slightly:

@@ -62,6 +63,43 @@ pip install "pycytominer[collate] @ git+git://github.com/cytomining/pycytominer@

If using `pycytominer` in a conda environment, in order to run `collate.py`, you will also want to make sure to add `cytominer-database=0.3.4` to your list of dependencies.

## Creating a cell locations lookup table

The `CellLocation` class offers a convenient way to augment a [LoadData](https://cellprofiler-manual.s3.amazonaws.com/CPmanual/LoadData.html) file with the X,Y locations of the cells in each image.
The location information is read from a single-cell SQLite file.
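
Conceptually, the augmentation groups the per-cell coordinates by image and attaches them to the matching LoadData row. The sketch below illustrates that idea with plain Python dictionaries; it is not the `CellLocation` implementation. The function name `add_cell_centers` is hypothetical, and the key `Nuclei_Location_Center_Y` is an assumption based on the "X,Y" wording.

```python
from collections import defaultdict


def add_cell_centers(load_data_rows, nuclei_rows):
    """Return copies of the LoadData rows with a CellCenters list attached.

    Hypothetical helper: both inputs are lists of dicts keyed by column name.
    """
    centers_by_image = defaultdict(list)
    for cell in nuclei_rows:
        centers_by_image[cell["ImageNumber"]].append(
            {
                "Nuclei_Location_Center_X": cell["Nuclei_Location_Center_X"],
                # Assumed column name for the Y coordinate.
                "Nuclei_Location_Center_Y": cell["Nuclei_Location_Center_Y"],
            }
        )
    # Images without any detected cell get an empty list.
    return [
        {**row, "CellCenters": centers_by_image.get(row["ImageNumber"], [])}
        for row in load_data_rows
    ]
```

Keeping one row per image, with a nested list of cells, preserves the shape of the LoadData file.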

To use this functionality, you will need to modify your installation command, similar to above:

```bash
# Example for general case commit:
pip install "pycytominer[cell_locations] @ git+git://github.com/cytomining/pycytominer"
```

Example using this functionality:

```bash
metadata_input="s3://cellpainting-gallery/test-cpg0016-jump/source_4/workspace/load_data_csv/2021_08_23_Batch12/BR00126114/test_BR00126114_load_data_with_illum.parquet"
single_cell_input="s3://cellpainting-gallery/test-cpg0016-jump/source_4/workspace/backend/2021_08_23_Batch12/BR00126114/test_BR00126114.sqlite"
augmented_metadata_output="${HOME}/Desktop/load_data_with_illum_and_cell_location_subset.parquet"

python \
-m pycytominer.cyto_utils.cell_locations_cmd \
--metadata_input ${metadata_input} \
--single_cell_input ${single_cell_input} \
--augmented_metadata_output ${augmented_metadata_output} \
add_cell_location

# Check the output

python -c "import pandas as pd; print(pd.read_parquet('${augmented_metadata_output}').head())"

# It should look something like this (depends on the width of your terminal):

# Metadata_Plate Metadata_Well Metadata_Site ... PathName_OrigRNA ImageNumber CellCenters
# 0 BR00126114 A01 1 ... s3://cellpainting-gallery/cpg0016-jump/source_... 1 [{'Nuclei_Location_Center_X': 943.512129380054...
# 1 BR00126114 A01 2 ... s3://cellpainting-gallery/cpg0016-jump/source_... 2 [{'Nuclei_Location_Center_X': 29.9516027655562...
```
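
Each `CellCenters` entry in the output above is a list of per-cell dictionaries. The following minimal sketch turns one such entry into plain coordinate pairs. The helper name `centers_to_xy` is hypothetical, and the key `Nuclei_Location_Center_Y` is an assumption, since only the X key is visible in the truncated output.

```python
def centers_to_xy(cell_centers):
    """Convert one CellCenters entry (a sequence of dicts) to a list of (x, y) tuples."""
    return [
        (
            float(cell["Nuclei_Location_Center_X"]),
            # Assumed key name; only the X key appears in the truncated output.
            float(cell["Nuclei_Location_Center_Y"]),
        )
        for cell in cell_centers
    ]


example = [
    {"Nuclei_Location_Center_X": 943.5, "Nuclei_Location_Center_Y": 182.25},
    {"Nuclei_Location_Center_X": 29.0, "Nuclei_Location_Center_Y": 77.0},
]
print(centers_to_xy(example))  # [(943.5, 182.25), (29.0, 77.0)]
```

The function only iterates over its argument, so it should work whether the entry arrives as a Python list or as another sequence type.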

## Usage

Using pycytominer is simple and fun.