Add example describing the structure of the JSON/dict input format for Dataset object #911

Merged (3 commits) on Jan 15, 2025
3 changes: 2 additions & 1 deletion examples/01_datasets/01_plot_dataset_io.py
@@ -23,7 +23,8 @@
# Datasets are stored as json or pkl[.gz] files
# -----------------------------------------------------------------------------
# Json files are used to create Datasets, while generated Datasets are saved
# to, and loaded from, pkl[.gz] files.
# to, and loaded from, pkl[.gz] files. For more information on the JSON file
# format, see the :ref:`datasets_json` section.
# We use jsons because they are easy to edit, and thus build by hand, if
# necessary.
# We then store the generated Datasets as pkl.gz files because an initialized
164 changes: 164 additions & 0 deletions examples/01_datasets/06_plot_dataset_json.py
@@ -0,0 +1,164 @@
"""

.. _datasets_json:

===============================================
Create a NiMARE Dataset object from a JSON file
===============================================

Here, we present the structure of the JSON/dict input format used to create a
:class:`~nimare.dataset.Dataset` object from scratch.
"""

###############################################################################
# Data Structure
# -----------------------------------------------------------------------------
# A dataset JSON file is a nested dictionary of neuroimaging study data.
# Each study contains one or more contrasts, and each contrast can have
# coordinates, associated images, metadata, and text information. These data are
# mapped to five core DataFrames: annotations, coordinates, images, metadata, and texts.

###############################################################################
# Study Dictionary
# -----------------------------------------------------------------------------
# - Every study is assigned a unique identifier (study_id), which is defined by
#   the user (e.g., "pain_01.nidm").
# - Every study contains a ``contrasts`` dictionary holding all the contrasts for that
#   study. Each contrast is assigned a unique identifier (contrast_id), which is
#   also defined by the user (e.g., "1").
#
# .. code-block:: python
#
#     {
#         "<study_id>": {
#             "contrasts": {
#                 "<contrast_id>": {
#                     "annotations": {...},
#                     "coords": {...},
#                     "images": {...},
#                     "metadata": {...},
#                     "text": {...},
#                 }
#             }
#         }
#     }
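
###############################################################################
# As a minimal sketch of the layout above, the nested dictionary can be built
# by hand and round-tripped through the standard ``json`` module. The study
# identifier ``"study_01"`` and all values are hypothetical, and this sketch
# only exercises ``json``; it does not check which keys NiMARE requires.

```python
import json

# Hypothetical identifiers and values, chosen only to illustrate the layout.
dataset_dict = {
    "study_01": {
        "contrasts": {
            "1": {
                "coords": {
                    "space": "MNI",
                    "x": [-42.0, 38.0],
                    "y": [-22.0, -18.0],
                    "z": [52.0, 48.0],
                },
                "metadata": {"sample_sizes": [20]},
            }
        }
    }
}

# Serialize to a JSON string and read it back to confirm the round trip.
json_text = json.dumps(dataset_dict, indent=4)
reloaded = json.loads(json_text)
print(sorted(reloaded["study_01"]["contrasts"]["1"]))
```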

###############################################################################
# Contrast Dictionary
# -----------------------------------------------------------------------------
# Each contrast contains five main dictionaries:

###############################################################################
# 1. **Annotations Dictionary** (``annotations``)
# `````````````````````````````````````````````````````````````````````````````
# - Contains labels and annotations.
# - Optional for studies.
#
# .. code-block:: python
#
#     "annotations": {
#         "label": str,  # Label for the contrast
#         "description": str  # Description of the contrast
#     }

###############################################################################
# 2. **Coordinates Dictionary** (``coords``)
# `````````````````````````````````````````````````````````````````````````````
# - Includes space information and x, y, and z coordinates.
#
# .. code-block:: python
#
#     "coords": {
#         "space": str,  # e.g., "MNI"
#         "x": List[float],  # x-coordinates
#         "y": List[float],  # y-coordinates
#         "z": List[float]  # z-coordinates
#     }
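
###############################################################################
# A possible sanity check before building a Dataset, assuming that the i-th
# entries of ``x``, ``y``, and ``z`` together describe one focus. The helper
# ``check_coords`` is hypothetical and is not part of NiMARE.

```python
def check_coords(coords):
    """Return the number of foci, raising ValueError if x, y, and z differ in length."""
    lengths = {axis: len(coords[axis]) for axis in ("x", "y", "z")}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"Coordinate lists differ in length: {lengths}")
    return lengths["x"]


# Hypothetical coordinates with two foci.
example_coords = {
    "space": "MNI",
    "x": [-42.0, 38.0],
    "y": [-22.0, -18.0],
    "z": [52.0, 48.0],
}
n_foci = check_coords(example_coords)
print(n_foci)
```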


###############################################################################
# 3. **Images Dictionary** (``images``)
# `````````````````````````````````````````````````````````````````````````````
# - Contains paths to statistical maps. Possible keys are "beta", "se", "t", and "z".
#
# .. code-block:: python
#
#     "images": {
#         "beta": str,  # Path to contrast image
#         "se": str,  # Path to standard error image
#         "t": str,  # Path to t-statistic image
#         "z": str  # Path to z-statistic image
#     }

###############################################################################
# 4. **Metadata Dictionary** (``metadata``)
# `````````````````````````````````````````````````````````````````````````````
# - Contains study-specific metadata.
# - Flexible schema for user-defined metadata.
#
# .. code-block:: python
#
#     "metadata": {
#         "sample_sizes": List[int]
#     }

###############################################################################
# 5. **Text Dictionary** (``text``)
# `````````````````````````````````````````````````````````````````````````````
# - Contains study/contrast text information.
# - Contains keys associated with the linked publication.
#
# .. code-block:: python
#
#     "text": {
#         "title": str,  # Study title
#         "keywords": str,  # Study keywords
#         "abstract": str,  # Study abstract
#         "body": str  # Main study text/content
#     }

###############################################################################
# Example JSON
# -----------------------------------------------------------------------------
# Load the example dataset JSON file

import json
import os

from nimare.utils import get_resource_path

dset_file = os.path.join(get_resource_path(), "nidm_pain_dset.json")

with open(dset_file, "r") as f_obj:
    data = json.load(f_obj)

###############################################################################
# Example of accessing coordinates for a study
study_coords = data["pain_01.nidm"]["contrasts"]["1"]["coords"]
x_coords = study_coords["x"]
y_coords = study_coords["y"]
z_coords = study_coords["z"]
print(x_coords[:5], y_coords[:5], z_coords[:5])

###############################################################################
# Example of accessing image paths
study_images = data["pain_01.nidm"]["contrasts"]["1"]["images"]
beta_image_path = study_images["beta"]
t_stat_path = study_images["t"]
print(beta_image_path, t_stat_path)

###############################################################################
# Example of accessing metadata
study_metadata = data["pain_01.nidm"]["contrasts"]["1"]["metadata"]
sample_size = study_metadata["sample_sizes"][0]
print(sample_size)
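
###############################################################################
# The same access pattern extends to every study and contrast. The sketch
# below walks a small, hypothetical dictionary (not the example file) and
# collects one row per contrast. It assumes ``metadata`` or ``sample_sizes``
# may be missing and treats them as empty in that case.

```python
# Hypothetical data with two studies and three contrasts in total.
example = {
    "study_01": {
        "contrasts": {
            "1": {"metadata": {"sample_sizes": [20]}},
            "2": {"metadata": {"sample_sizes": [18]}},
        }
    },
    "study_02": {"contrasts": {"1": {"metadata": {"sample_sizes": [25]}}}},
}

rows = []
for study_id, study in example.items():
    for contrast_id, contrast in study["contrasts"].items():
        # Fall back to an empty list when no sample sizes are recorded.
        sizes = contrast.get("metadata", {}).get("sample_sizes", [])
        rows.append((study_id, contrast_id, sum(sizes)))

print(rows)
```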

###############################################################################
# .. note::
#     Find more information about the Dataset class that can be created from this JSON file
#     in :ref:`datasets_object`.

###############################################################################
# Example JSON Structure
# -----------------------------------------------------------------------------
print(json.dumps(data, indent=4))