Use the Allen Brain Observatory – Visual Coding on AWS
The Allen Brain Observatory – Visual Coding is a large-scale, standardized survey of physiological activity across the mouse visual cortex, hippocampus, and thalamus. It includes datasets collected with both two-photon imaging and Neuropixels probes, two complementary techniques for measuring the activity of neurons in vivo.
The two-photon imaging dataset features visually evoked calcium responses from GCaMP6-expressing neurons in a range of cortical layers, visual areas, and Cre lines. The Neuropixels dataset features spiking activity from distributed cortical and subcortical brain regions, collected under analogous conditions to the two-photon imaging experiments. We hope that experimentalists and modelers will use these comprehensive, open datasets as a testbed for theories of visual information processing.
In addition to the Neurodata Without Borders (NWB) files containing fluorescence traces or spike times, we are excited to make the (mostly) raw data available as well. Our S3 bucket includes the motion-corrected calcium fluorescence videos for all two-photon imaging sessions and continuous voltage traces for every Neuropixels experiment. This will allow users to test their own analysis algorithms on the complete, unabridged data files without having to ship physical hard disks.
The two-photon imaging dataset is organized into "experiment containers," each holding the data for one area/layer/Cre-line combination from one mouse. Each container includes recordings from a single population of neurons while the mouse passively observes a battery of visual stimuli in three separate ~90 minute sessions. The three sessions consist of interleaved presentations of the following stimuli:
- drifting gratings, natural movies (clip 1, clip 3)
- static gratings, natural movies (clip 1), natural scenes/images
- locally sparse noise, natural movies (clip 1, clip 2)
Each session has a separate NWB file with cell-level response data, for a total of 3 NWB files in each experiment container. Learn more about the design of the experiment, the stimulus protocol of each session, and the organization of the data on our web site and in our technical whitepapers.
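For example, you can browse and filter these experiment containers with the AllenSDK. The sketch below assumes the SageMaker setup described later in this README, where the bucket is mounted at /data/allen-brain-observatory; the specific area and Cre line are just illustrative choices:
from allensdk.core.brain_observatory_cache import BrainObservatoryCache

# Point the cache at the manifest in the mounted bucket (path assumes the
# SageMaker setup described below).
boc = BrainObservatoryCache(manifest_file='/data/allen-brain-observatory/visual-coding-2p/manifest.json')

# Each container holds one area/layer/Cre-line combination from one mouse.
containers = boc.get_experiment_containers(
    targeted_structures=['VISp'],   # primary visual cortex
    cre_lines=['Cux2-CreERT2'])     # one of the Cre lines in the survey
print(len(containers), 'matching containers')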
The Neuropixels dataset includes one session per mouse, with spikes recorded simultaneously from up to six cortical visual areas, hippocampus, thalamus, and other adjacent structures. Mice passively view either the same stimuli as in the two-photon imaging experiments or a subset of those stimuli shown with a higher number of repeats. Each session has one NWB file containing spike times, spike waveforms, experiment metadata, and information about the visual stimulus. Local field potential data is stored in one NWB file per probe. You can learn more about our Neuropixels experiments in this technical whitepaper.
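As a rough sketch of loading a Neuropixels session with the AllenSDK (assuming an AllenSDK version that provides EcephysProjectCache.fixed, and the mounted bucket path used later in this README):
from allensdk.brain_observatory.ecephys.ecephys_project_cache import EcephysProjectCache

# Read the packaged cache directly from the mounted bucket without
# downloading anything over the network.
manifest = '/data/allen-brain-observatory/visual-coding-neuropixels/ecephys-cache/manifest.json'
cache = EcephysProjectCache.fixed(manifest=manifest)

sessions = cache.get_session_table()                 # one row per experiment session
session = cache.get_session_data(sessions.index[0])  # load the first session's NWB file
print(session.units.head())                          # spike-sorted units in this session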
The Allen Brain Observatory dataset is hosted on Amazon Web Services (AWS) in an S3 bucket. To use the dataset, you need an AWS account. You can create an AWS account by following these instructions.
The S3 bucket is located here: arn:aws:s3:::allen-brain-observatory. It is organized as follows:
visual-coding-2p
+-- manifest.json # used by AllenSDK to look up file paths
+-- experiment_containers.json # metadata for each container (area, imaging depth, etc)
+-- ophys_experiments.json # metadata for each experiment session
+-- cell_specimens.json # metadata for each recorded cell
+-- stimulus_mappings.json # metadata for each computed metric or summary image
+-- ophys_experiment_data # traces, running speed, etc per experiment session
| +-- <experiment_id>.nwb
| +-- ...
+-- ophys_experiment_analysis # analysis files per experiment session
| +-- <experiment_id>_<session_name>.h5
| +-- ...
+-- ophys_movies # motion-corrected video per experiment session
| +-- ophys_experiment_<experiment_id>.h5
| +-- ...
+-- ophys_experiment_events # neuron activity modeled as discrete events
| +-- <experiment_id>_events.npz
| +-- ...
+-- ophys_eye_gaze_mapping # subject eye position over the course of the experiment
| +-- <experiment_id>_<session_id>_eyetracking_dlc_to_screen_mapping.h5
| +-- ...
visual-coding-neuropixels
+-- ecephys-cache # packaged, processed extracellular electrophysiology (ecephys) data
| +-- manifest.json # used by AllenSDK to look up file paths
| +-- sessions.csv # metadata for each experiment session
| +-- probes.csv # metadata for each experiment probe
| +-- channels.csv # metadata for each location on a probe
| +-- units.csv # metadata for each recorded unit (putative neuron)
| +-- brain_observatory_1.1_analysis_metrics.csv # pre-computed metrics for brain observatory stimulus set
| +-- functional_connectivity_analysis_metrics.csv # pre-computed metrics for functional connectivity stimulus set
| +-- session_<experiment_id>
| | +-- session_<experiment_id>.nwb # experiment session nwb
| | +-- probe_<probe_id>_lfp.nwb # probe lfp nwb
| | +-- session_<experiment_id>_analysis_metrics.csv # pre-computed metrics for experiment
| +-- ...
| +-- natural_movie_templates
| | +-- natural_movie_1.h5 # stimulus movie
| | +-- natural_movie_3.h5 # stimulus movie
| +-- natural_scene_templates
| | +-- natural_scene_<image_id>.tiff # stimulus image
| | +-- ...
+-- raw-data # sorted spike recordings and unprocessed data streams
| +-- <experiment_id>
| | +-- sync.h5 # information describing the synchronization of experiment data streams
| | +-- <probe_id>
| | | +-- channel_states.npy # state (e.g. rising/falling edge) of each recorded event
| | | +-- event_timestamps.npy # sample timestamps of recorded events
| | | +-- lfp_band.dat # local field potential data
| | | +-- spike_band.dat # spike-band data
| | +-- ...
| +-- ...
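You can also browse the bucket directly from Python. Here is a minimal boto3 sketch (it assumes your AWS credentials are already configured, as described above):
import boto3

# List the top-level "directories" under the two-photon prefix.
s3 = boto3.client('s3', region_name='us-west-2')  # the bucket lives in us-west-2
resp = s3.list_objects_v2(Bucket='allen-brain-observatory',
                          Prefix='visual-coding-2p/',
                          Delimiter='/')
for prefix in resp.get('CommonPrefixes', []):
    print(prefix['Prefix'])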
The instructions below walk through the steps necessary for creating a Jupyter notebook instance and using this dataset. A couple of important points:
- The template below uses a customized AWS SageMaker template that comes preconfigured with a large number of environments. We pre-installed allensdk into the conda_python3 environment.
- Make sure you create your instance in the us-west-2 region -- that's where our bucket lives.
Option 1: Create a Jupyter Notebook Instance via a Launch Button
- Click on the launch button.
- Enter values for Stack name (default is allen-brain-observatory) and Username
- Continue clicking next until you get to the review page.
- On the review page, check the box acknowledging that AWS CloudFormation may create IAM roles, then click Create. You will be redirected to the CloudFormation page. Wait for the stack to be created.
You can check the status of the notebook instance in the SageMaker console. The URL of the notebook instance is the following: https://allen-brain-observatory-[USERNAME].notebook.us-west-2.sagemaker.aws/tree.
Option 2: Create a Jupyter Notebook Instance via the AWS CLI
- Install the AWS CLI by following these instructions.
- Configure your machine to use your AWS account by following these instructions.
- Download the template from here.
- Run the following command in the directory where you downloaded the template and wait for the instance to be created.
aws cloudformation create-stack --stack-name allen-brain-observatory --parameters ParameterKey=Username,ParameterValue={username} --template-body file://./allen-brain-observatory-sagemaker.yml --capabilities CAPABILITY_IAM
You can check the status of the notebook instance here. The URL of the notebook instance is the following: https://allen-brain-observatory-[USERNAME].notebook.us-west-2.sagemaker.aws/tree.
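If you prefer to check the stack status from Python instead of the console, here is a small boto3 sketch (again assuming your credentials are configured):
import boto3

# Look up the CloudFormation stack created by the command above.
cf = boto3.client('cloudformation', region_name='us-west-2')
stack = cf.describe_stacks(StackName='allen-brain-observatory')['Stacks'][0]
print(stack['StackStatus'])  # e.g. CREATE_IN_PROGRESS or CREATE_COMPLETE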
Once your notebook is running (remember to use the conda_python3 environment!), you can access frames of a video like this:
# imports
import h5py
import matplotlib.pyplot as plt
from allensdk.core.brain_observatory_cache import BrainObservatoryCache
%matplotlib inline
# find an ophys experiment
boc = BrainObservatoryCache(manifest_file='/data/allen-brain-observatory/visual-coding-2p/manifest.json')
exps = boc.get_ophys_experiments()
exp = exps[0]
# pull some frames out of the movie
movie_path = '/data/allen-brain-observatory/visual-coding-2p/ophys_movies/ophys_experiment_%d.h5' % exp['id']
with h5py.File(movie_path, 'r') as f:  # close the file automatically
    frames = f["data"][:10, :, :]
plt.imshow(frames[0])
plt.show()
You can also access the dF/F traces of an experiment like this:
ds = boc.get_ophys_experiment_data(exp['id'])
t, dff = ds.get_dff_traces()
plt.plot(t, dff[0])
plt.show()
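The detected-event files under ophys_experiment_events can be opened with NumPy. The array names inside each .npz are not documented in this README, so the sketch below just inspects them:
import numpy as np

# Path follows the bucket layout shown above; exp comes from the earlier snippet.
events_path = ('/data/allen-brain-observatory/visual-coding-2p/'
               'ophys_experiment_events/%d_events.npz' % exp['id'])
npz = np.load(events_path, allow_pickle=True)
print(npz.files)  # names of the arrays stored in this file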
For more detailed examples, take a look at the Allen SDK documentation page.
If you have questions about the data or the Allen SDK, open an issue on the Allen SDK GitHub issue tracker. You can also ask a question on the Allen Brain Map Community Forum or on Stack Overflow with the ‘allen-sdk’ tag.
The data in this dataset is provided under a non-commercial use policy. See our terms of use for more details: http://www.alleninstitute.org/legal/terms-use/.