
Commit

Merge pull request datajoint-company#60 from vathes/master
Ingestion routine updates, histology tables and qc tables.
shenshan authored Dec 16, 2020
2 parents ce6d79f + 9f8c5f2 commit 2240654
Showing 104 changed files with 3,750 additions and 304 deletions.
1 change: 1 addition & 0 deletions .dockerignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
data/*
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,4 +1,7 @@

# some unused files
rasters/*
notebooks/.*
notebooks/notebooks_plotting/*
scripts/update_entries.py
scripts/compare_tables.py
22 changes: 11 additions & 11 deletions Dockerfile
100755 → 100644
@@ -1,17 +1,17 @@
FROM datajoint/jupyter:python3.6
FROM datajoint/djlab:py3.7-debian

RUN pip install --upgrade pip
RUN pip install --upgrade datajoint

ADD . /src/IBL-pipeline

USER root
RUN pip install -e /src/IBL-pipeline

RUN pip install globus_sdk
RUN pip install plotly
RUN pip install statsmodels
RUN pip install scikits.bootstrap

RUN pip install ibllib
RUN pip install "git+https://github.com/ixcat/djwip.git#egg=djwip"

ADD ./allen_structure_tree.csv /usr/local/lib/python3.6/dist-packages/ibllib/atlas
RUN pip uninstall opencv-python -y
RUN conda install -c conda-forge opencv -y
COPY --chown=dja:anaconda ./apt_requirements.txt /tmp/apt_requirements.txt
RUN apt update
USER dja:anaconda
RUN \
/entrypoint.sh echo "Requirements updated..." && \
rm "${APT_REQUIREMENTS}"
183 changes: 122 additions & 61 deletions README.md
@@ -1,6 +1,89 @@
# Getting started with DataJoint for IBL #

1. Email [email protected] for a database username.
# Identify your role

Identifying your role determines how you will work with the IBL-pipeline. There are two typical roles:

1. User:
>* Internal user: an IBL user who would like to use the IBL-pipeline to query the IBL database for research and to create downstream tables for their own analyses in the IBL database, but who will not contribute to the development of the main IBL pipeline.
>* External user: similar to an internal user, but an external user does not access data through the IBL database; instead, they adopt the database schemas and tables from the IBL pipeline for their own use.
2. Developer: in addition to everything a user does, a developer contributes to the daily ingestion, computation, and plotting of the IBL-pipeline.


# Instructions for users

1. Get credentials to the database server
> For an IBL internal user, contact Shan Shen via [email protected] or Slack for a username and an initial password to the IBL database. You can change your password with
```
import datajoint as dj
dj.set_password()
```
> For an external user, set up your own database server; see the [instructions](https://docs.datajoint.io/python/admin/1-hosting.html).
2. Install IBL-pipeline python package

> Install the package with pip; this gives the latest version. Use pip3 instead if pip does not work properly.
```
pip install ibl-pipeline
```
> To upgrade to the latest version,
```
pip install --upgrade ibl-pipeline
```
> After the installation, `datajoint` and `ibl_pipeline` can be imported as regular modules.
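> A quick sanity check (a generic Python sketch, not part of the IBL docs) is to confirm that both modules can be found without yet connecting to any database:
```
import importlib.util

# Check that the installed packages are importable (run after
# `pip install ibl-pipeline`); prints "missing" if a package is absent.
for name in ('datajoint', 'ibl_pipeline'):
    found = importlib.util.find_spec(name) is not None
    print(name, 'found' if found else 'missing')
```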
3. Set up the configuration of DataJoint.
> Now that you have successfully installed the datajoint and ibl_pipeline packages, set up the configuration by specifying dj.config so that you can properly connect to the database server.
```
shanshen@Shans-MacBook-Pro:~$ ipython
In [1]: import datajoint as dj
In [2]: dj.config
Out[2]:
{ 'connection.charset': '',
'connection.init_function': None,
'database.host': 'localhost',
'database.password': None,
'database.port': 3306,
'database.reconnect': True,
'database.user': None,
'display.limit': 12,
'display.show_tuple_count': True,
'display.width': 14,
'fetch_format': 'array',
'loglevel': 'INFO',
'safemode': True}
```
> The default values of dj.config are shown above. You will need to change these fields:
```
dj.config['database.host'] = 'datajoint.internationalbrainlab.org'
dj.config['database.user'] = 'YOUR_USERNAME'
dj.config['database.password'] = 'YOUR_PASSWORD'
```

> Then save the configuration as a json file with either dj.config.save_local() or dj.config.save_global(). If saved globally, the configuration applies in all directories; if saved locally, it applies only when you work in the current directory, where it is saved as the file dj_local_conf.json. You will not need to set up the configuration again next time.
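> For reference, the saved file is plain JSON. A minimal stdlib sketch of the round trip (the file name `dj_local_conf.json` comes from the step above; writing into a temporary directory here only to keep the example self-contained):
```
import json
import os
import tempfile

# Sketch of what dj.config.save_local() effectively writes:
# the settings serialized as JSON into dj_local_conf.json.
config = {
    'database.host': 'datajoint.internationalbrainlab.org',
    'database.user': 'YOUR_USERNAME',
    'database.password': 'YOUR_PASSWORD',
}
path = os.path.join(tempfile.mkdtemp(), 'dj_local_conf.json')
with open(path, 'w') as f:
    json.dump(config, f, indent=2)

# Reading the file back restores the same settings.
with open(path) as f:
    restored = json.load(f)
print(restored['database.host'])  # → datajoint.internationalbrainlab.org
```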
> You can start using ibl_pipeline by importing modules, such as:
```
from ibl_pipeline import reference, subject, action, acquisition, data, behavior, ephys, histology
```

4. Special note: the IBL-pipeline is under active development, and tables of interest may already exist in the database before the latest version of ibl-pipeline is released. To get access to the latest tables, we recommend using `dj.create_virtual_module`. The syntax to create a virtual module is as follows:
```
behavior = dj.create_virtual_module('behavior', 'ibl_behavior')
```

> Then `behavior` can be used to access any table:
```
behavior.TrialSet()
```
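> If you are new to DataJoint, a restriction such as `behavior.TrialSet & 'n_trials > 200'` conceptually keeps only the matching rows of a table. A plain-Python analogy (the column names below are made up for illustration, not the actual `TrialSet` attributes):
```
# Toy rows standing in for table entries; a DataJoint restriction
# (table & condition) conceptually filters rows like this.
trials = [
    {'subject_nickname': 'mouse_a', 'n_trials': 521},
    {'subject_nickname': 'mouse_b', 'n_trials': 130},
    {'subject_nickname': 'mouse_c', 'n_trials': 347},
]
restricted = [row for row in trials if row['n_trials'] > 200]
print([row['subject_nickname'] for row in restricted])  # → ['mouse_a', 'mouse_c']
```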

# Instructions for developers

1. Email [email protected] for a database username and initial password.

2. Install Docker (https://www.docker.com/). Linux users also need to install Docker Compose separately. For Mac: https://docs.docker.com/docker-for-mac/.

@@ -19,27 +102,26 @@ If you don't have SSH setup, use `git clone https://github.com/YourUserName/IBL-
DJ_PASS=password
```
6. Now let's set up the docker container that has the entire environment.
> Copy `docker-compose-template.yml` as `docker-compose.yml` - this is your own file you can customize.
> Note: There is a similar file called `docker-compose-local_template.yml`. You will not need it unless you would like to perform ingestion from scratch in the database hosted on your own machine.
> There are two properties that you may want to customize.
> First, to save figures in a folder outside your `IBL-pipeline` docker folder (which is good practice so you don't clutter up the GitHub repo), you can tell Docker to create an alias folder which points to your preferred place for storing figures.
a. `open docker-compose.yml`
b. add any folder you would like to access within the docker container into the `volumes:` section, for example `~/Documents/ephys_data:/ephys_data`, or `myFullPath:/Figures_DataJoint_shortcuts`, where `myFullPath` could for example be `~/Google Drive/Rig building WG/DataFigures/BehaviourData_Weekly/Snapshot_DataJoint/`
c. close the file
Then save the plots from Python into `/Figures_DataJoint_shortcuts` inside the docker container, and you'll see the plots in the folder you want.
> Second, Set up your `.one_params`.
> If you have your `.one_params` in your root directory `~/.one_params`, you can directly go to Step 7. If you have your `.one_params` in another directory, please change the mapping `docker-compose.yml`
in the `volumes:` section `your-directory-to-one_params/.one_params: /root/.one_params`.
After your are done with these customization, you are ready to start the docker container, by running:
@@ -52,9 +134,9 @@ Note: Anytime you would like to change the mapping from an outside folder to a d
## To run your own Python scripts ##
7. After running the docker container, you may want to enter the container to run your own scripts. The command is `docker exec -it ibl-pipeline_datajoint_1 /bin/bash`. You will then enter the container with the current directory `/notebooks`. You can use `cd` to navigate inside the docker container.
> Note: If you would like to go to a specific folder, for example `prelim_analyses/behavioral_snapshots`, at the same time as you run `docker exec`, you can use this command line: `docker exec -it ibl-pipeline_datajoint_1 bash -c "cd /src/IBL-pipeline/prelim_analyses/behavioral_snapshots; exec /bin/bash"`
8. To simplify the process of setting up the docker environment, we prepared a bash script `ibl_docker_setup-template.sh`. You may first want to copy this template with `cp ibl_docker_setup-template.sh ibl_docker_setup.sh`, then customize your own `ibl_docker_setup.sh`. In the file, you can change the directory you want to go to in the last line. The default command in the last line is: `docker exec -it ibl-pipeline_datajoint_1 bash -c "cd /src/IBL-pipeline/prelim_analyses/; exec /bin/bash"`, which goes to the folder `IBL-pipeline/prelim_analyses`. You can replace this with the directory you would like to go to.
@@ -75,73 +157,52 @@ python behavioral_snapshot.py
10. Go to http://localhost:8888/tree in your favorite browser to open Jupyter Notebook.
11. Open the directory `notebooks_tutorial` and feel free to go to through the tutorials.
12. Run through the notebook and feel free to experiment.
### Staying up-to date ###
To stay up-to-date with the latest code from DataJoint, you might first want to check your remotes with `git remote -v`.
If there is no upstream pointing to the int-brain-lab repository, then do `git remote add upstream https://github.com/int-brain-lab/IBL-pipeline`.
Then `git pull upstream master` will make sure that your local fork stays up to date with the original repo.
#### Contributing code ####
If you feel happy with the changes you've made, you can add, commit and push them to your own branch. Then go to https://github.com/int-brain-lab/IBL-pipeline, click 'Pull requests', 'New pull request', 'compare across forks', and select your fork of `IBL-pipeline`. If there are no merge conflicts, you can click 'Create pull request', explain what changes/contributions you've made, and submit it to the DataJoint team for approval.
---
# Instructions to ingest Alyx data into a local database #
To run a local instance of the database in the background, run the docker-compose command as follows:
```bash
docker-compose -f docker-compose-local.yml up -d
```

This will create a docker container with a local database inside. To access the container from the terminal, first get the container ID with `docker ps`, then run:
```bash
docker exec -it CONTAINER_ID /bin/bash
```
Now that we are inside the docker container, run the bash script for the ingestion:
```
bash /src/ibl-pipeline/scripts/ingest_alyx.sh ../data/alyx_dump/2018-10-30_alyxfull.json
```

Make sure that the json file is in the correct directory, as shown above.
To stop the containers, run:
```bash
docker-compose -f docker-compose-local.yml down
```

# Instructions to ingest Alyx data into Amazon RDS
To insert Alyx data into the remote Amazon RDS, create a .env file in the same directory as your `docker-compose.yml`, as instructed in Step 4 above.
Now run docker-compose as follows; by default it runs through the file `docker-compose.yml`:
```bash
docker-compose up -d
```

This will create a docker container linked to the remote Amazon RDS. Then follow the same ingestion instructions as for the local database.

# IBL pipeline schemas #
Schema of `reference`:
![Reference Diagram](images/ephys.png)
Schema of `subject`:
![Subject Diagram](images/subject.png)
Schema of `action`:
![Action Diagram](images/action.png)
Schema of `acquisition`:
![Acquisition Diagram](images/acquisition.png)
Schema of `data`:
![Data Diagram](images/data.png)
Schema of `behavior`:
![Behavior Diagram](images/behavior.png)
Schema of `behavior_analyses`:
![Behavior analyses Diagram](images/behavior_analyses.png)
Schema of `ephys`:
![Ephys Diagram](images/ephys.png)
Schema of `histology`:
![Histology Diagram](images/histology.png)
Schema of `qc`:
![Quality check Diagram](images/qc.png)
1 change: 1 addition & 0 deletions apt_requirements.txt
@@ -0,0 +1 @@
libgl1-mesa-glx
9 changes: 7 additions & 2 deletions docker-compose-db-test.yml
@@ -5,10 +5,15 @@ services:
container_name: ibl_datajoint_dbtest
env_file: .env_dbtest
volumes:
- ./notebooks:/notebooks
- ./notebooks:/home/dja
- ./images:/images
- .:/src/IBL-pipeline
- ./data:/data
- ./root/.one_params:/root/.one_params
user: 1000:anaconda
ports:
- "8400:8888"
- "8920:8888"
networks:
- ibl_dbtest
networks:
ibl_dbtest:
9 changes: 7 additions & 2 deletions docker-compose-local-template.yml
@@ -7,19 +7,24 @@ services:
- DJ_USER=root
- DJ_PASS=simple
volumes:
- ./notebooks:/notebooks
- ./notebooks:/home/dja
- ./images:/images
- .:/src/IBL-pipeline
- ./data:/data
- ./root/.one_params:/root/.one_params
- ./snapshots:/Snapshot_DataJoint_shortcut
links:
- db
user: 1000:anaconda
ports:
- "8888:8888"
networks:
- ibl_local

db:
image: datajoint/mysql
environment:
- MYSQL_ROOT_PASSWORD=simple
ports:
- "4306:3306"
networks:
ibl_local
8 changes: 6 additions & 2 deletions docker-compose-public.yml
@@ -5,11 +5,15 @@ services:
container_name: ibl_datajoint_public
env_file: .env_public
volumes:
- ./notebooks:/notebooks
- ./notebooks:/home/dja
- ./images:/images
- .:/src/IBL-pipeline
- ./data:/data
- ./root/.one_params:/root/.one_params
- ./snapshots:/Figures_DataJoint_shortcuts
user: 1000:anaconda
ports:
- "8300:8888"
networks:
- ibl_public
networks:
ibl_public:
10 changes: 7 additions & 3 deletions docker-compose-template.yml
@@ -4,11 +4,15 @@ services:
build: .
env_file: .env
volumes:
- ./notebooks:/notebooks
- ./notebooks:/home/dja
- ./images:/images
- .:/src/IBL-pipeline
- ./data:/data
- ~/.one_params:/root/.one_params
- ./snapshots:/Figures_DataJoint_shortcuts
- ./root/.one_params:/home/dja/.one_params
user: 1000:anaconda
ports:
- "8888:8888"
networks:
- ibl
networks:
ibl:
17 changes: 17 additions & 0 deletions docker-compose-test-new.yml
@@ -0,0 +1,17 @@
version: '3'
services:
datajoint_test_new:
build:
context: .
dockerfile: Dockerfile_new
container_name: ibl_datajoint_test_new
env_file: .env_test
volumes:
- ./notebooks:/home/dja
- ./images:/images
- .:/src/IBL-pipeline
- ./data:/data
- ./root/.one_params:/home/dja/.one_params
user: 1000:anaconda
ports:
- "9999:8888"
9 changes: 7 additions & 2 deletions docker-compose-test.yml
@@ -5,10 +5,15 @@ services:
container_name: ibl_datajoint_test
env_file: .env_test
volumes:
- ./notebooks:/notebooks
- ./notebooks:/home/dja
- ./images:/images
- .:/src/IBL-pipeline
- ./data:/data
- ./root/.one_params:/root/.one_params
user: 1000:anaconda
ports:
- "8900:8888"
- "9999:8888"
networks:
- ibl_test
networks:
ibl_test: