Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Stim_annot #1

Open
wants to merge 15 commits into
base: master
Choose a base branch
from
Open

[ENH] Stim_annot #1

wants to merge 15 commits into from

Conversation

neuromechanist
Copy link
Owner

Following bids-standard#153, and the proposed workflow in the Google Doc under the issue, I inserted the suggestions here to provide an appaoch for simulus annotation within the stimuli directory.

I appreciate if you could review and edit as you see fit.

@neuromechanist
Copy link
Owner Author

@VisLab, @smakeig, @dorahermes, @arnodelorme, and @monique2208,
Following today's conversation, I invited you all to collaborate on this repo. Please accept the invitation so I can add you as reviewers.

Best,
Yahya

| ------ | -------- | -------- | --------------------------------- | ----------------------------------------------------------- |
| 25.033 | 1.02 | nsd03050 | {nsd03050, first_COCO_description} | Sensory-event, Visual-presentation, (Image, {nsd03050.png}) |

In this example, the `{nsd03050.jpg, first_COCO_description}` indicates to append or add the descriptions of the image under the *first_COCO_description* in the `stimuli.tsv` file to a column in the `events.tsv` file with the same name. Also, following the HED-specifications 3.2.0, the HED annotation `{nsd03050.jpg}` indicates inserting the HED annotation from the `HED` column in the `stimuli.tsv` file to the `events.tsv` file's `HED` column for that particular event row.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two questions.

First, would it make sense to simply require a description column in the stimuli.tsv file? Instead of adding the curly brackets mechanism, this description is just always taken from there. This would avoid adding additional syntax people have to learn.

Second, for inserting the HED annotations into the events file, does this only work if the HED annotations are in the column? I think it should work with the json HED annotations as well. Only I can think of two possible ways it could work, either the compiled HED annotation is taken to the event file based on reference to the file name, or we could simply extend the 'namespace' of columns and also include the column names in the stimulus files. Basically, once the reference to the stimulus is made, if there is column 'color' in the stimulus file we can use {color} in the HED tag. I can forsee use cases for this if particular properties of the image are relevant for a condition is a specific experiment. We might run into issues if column names are not unique, so this either needs to be specified or we need a prefix.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for your fantastic comments. If I understood them correctly, it seems that the first and second comments are (in a way) opposing each other. In the first comment you suggest to design a predtermined column that would be automatically pulled from stimuli.tsv to events.tsv. But the second comment provide a great way to use HED curly brace method to pull multiple columns into the HED annotations of each row.

Please correct me if I am worng. I believe that the solution should be as inclusive as possible.

@VisLab
Copy link
Collaborator

VisLab commented Apr 9, 2024

I'm trying to find the part about putting curly braces in the stim_file column of the events.tsv file, but I can't find it to comment on. I don't think any curly braces are needed in the events file.

still images: If you allow a JSON sidecar for the stimuli.tsv with a HED column, tools can automatically assemble a complete annotation for the image and substitute it in if desired.

movies/audio: This is more complicated. I don't think these should ever be directly substituted into the event file. The events file might have a "movie-start" event. Tools can then create a separate timeline for the movie and merge if needed. This is tool specific and should not be part of BIDS. BIDS should provide the an infrastructure for enabling the annotation to be used by tools.

@neuromechanist
Copy link
Owner Author

I'm trying to find the part about putting curly braces in the stim_file column of the events.tsv file, but I can't find it to comment on. I don't think any curly braces are needed in the events file.

still images: If you allow a JSON sidecar for the stimuli.tsv with a HED column, tools can automatically assemble a complete annotation for the image and substitute it in if desired.

movies/audio: This is more complicated. I don't think these should ever be directly substituted into the event file. The events file might have a "movie-start" event. Tools can then create a separate timeline for the movie and merge if needed. This is tool specific and should not be part of BIDS. BIDS should provide the an infrastructure for enabling the annotation to be used by tools.

With the HED column and HED description in JSON, curly braces would not be needed (Line 192), similar to @monique2208's proposal. However, we also need to accommodate for non-HED event replacements. I wonder if there is any ways to avoid curly braces (or any other symbol) to determine non-HED columns.

Agree that the time-varying events should not be automatically substituted, yet the annotations should make enough room for the tools to disambiguously extend the events.tsv.

@VisLab
Copy link
Collaborator

VisLab commented Apr 9, 2024

With the HED column and HED description in JSON, curly braces would not be needed (Line 192), similar to @monique2208's proposal. However, we also need to accommodate for non-HED event replacements. I wonder if there is any ways to avoid curly braces (or any other symbol) to determine non-HED columns.

Agree that the time-varying events should not be automatically substituted, yet the annotations should make enough room for the tools to disambiguously extend the events.tsv.

Maybe I'm not following.... Which columns should be substituted for stim_file should not be part of BIDS. This is up to the analyst, and different people will want different (non-HED-annotated) columns. This can be easily be done with a remodeling operation or if a tool wants to do it, it can easily do the remapping. The responsibility of BIDS is to make the needed information available. The stimuli.tsv does that. The rest is just a simple lookup/substitute.

@neuromechanist
Copy link
Owner Author

Maybe I'm not following.... Which columns should be substituted for stim_file should not be part of BIDS. This is up to the analyst, and different people will want different (non-HED-annotated) columns. This can be easily be done with a remodeling operation or if a tool wants to do it, it can easily do the remapping. The responsibility of BIDS is to make the needed information available. The stimuli.tsv does that. The rest is just a simple lookup/substitute.

Understood. I think it makes sense. I will remove the curly braces argument.

For the sake of completeness, assume the annotation to be expanded in the above example is the first_COCO_description. Following your argument, there is no way to convey this intention, and each analyst can use their columns (This is perfectly fine, IMHO). However, if the first_COCO_description has a HED key in its JSON, then the HED description in the events.tsv can precisely be expanded to include the first_COCO_description. This is, what I think, the discrepancy between HED and non-HED annotations.

@VisLab
Copy link
Collaborator

VisLab commented Apr 10, 2024

It should be emphasized that using HED is not required for this to be useful. For example, the Wakeman-Hanson (openNeuro ds003645 -- originally ds000117) shows faces which they have classified as "famous", "familiar" and "scrambled".

We have discussed classifying each face by sex (male, female) and by expression (smile, frown, neutral). Some combination of these columns could be added to or replace the famous, familiar, and scrambled in the event file. If the data creators think that is a good idea, they could suggest this as a possibility in the dataset's readme.

People can analyze based just on the column values with no HED (and often do). This should be a tool-based decision. Having the information in columns of a stimuli.tsv with appropriate documentation of column meaning in the accompanying JSON is all that is needed from BIDS.

@VisLab
Copy link
Collaborator

VisLab commented Apr 11, 2024 via email

@neuromechanist
Copy link
Owner Author

Each stimulus.ts file corresponds to annotations for a particular "movie". There might be multiple different movies in the stimuli directory, so it has to know which one it pertains to. Just like an events.tsv file is associated with a particular recording, a stimulus.tsv file can only be associated with a particular movie. I would suggest that instead of stimulus.tsv,* these be named _annot.tsv*.

I might be misreading this but it seems here that the annot-*_stimulus.tsv file is not required to be associated with a specific file in the stimuli directory (for example "the-present_movie.mp4").

Yes, I agree with both suggestions. However, we also need to accommodate multiple annotations per movie. So, as a suggestion, we can have it as stimulus-file-name_trace-XXX_annot.tsv, which accommodates different annotation traces to be added to the directory.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants