[Discussion] Adding support for writing to nwb file. #10

h-mayorquin · 2022-08-15T18:38:41Z

h-mayorquin
Aug 15, 2022

Hi,
We would like to help you add NWB support. I am opening this issue for some preliminary discussion so that we can align on design questions (I have already read your helpful contributing document).

I have spent the day looking at your code base, both here and in your main repo, and so far my assessment is the following. I think the task of offering NWB writing support can be divided into two big steps:

1. Extract all the data from the labeled frames / instances and organize it in a structure like labels.numpy, which you use in your main repo to write to NWB.
2. Use the data in that structure to write the pose estimation objects, as you do in the appropriate method of the ndx_pose module.

The latter (2) is fairly straightforward for us, as we have done this many times, so I will concentrate on the former (1). I have already built a prototype, using a hierarchy of tidy data frames, that reads all the data using the objects in your models module:

https://gist.github.com/h-mayorquin/2c20eb7c7dbb3849ce5e45bc4a8afc5d

Which produces data like this:

So, design questions that I have:

1. I think that the prototype method can easily be used to fulfill point 1 (I have already tested it with some data), but I wanted to discuss with you whether you had another way of providing similar functionality in mind. The reason is that I would rather help you implement your design vision than use my ad-hoc method (I could not find anything equivalent in the library).
2. Where should the functionality to write to NWB be located? Is the idea to create another module under the io directory, called ndx_pose or nwb, that could be used for this, or do you have other ideas for the organization?
3. Concerning testing, I am using the example that you provide in your main repo (the centered_pair_predictions.slp file), as it is more complex. However, this file is not available in this repo. Could we add it so we can have automated testing with more complex files? The one currently available seems to have only two tracks and one labeled frame, so some parts of the code would be harder to test.
4. I have a question about your data model. I don't understand the set-theoretical relationship between Track and Skeleton (and maybe that's why the data in the data frame above might look confused). An instance object contains both a skeleton and a track. However, is their relationship 1-to-1? Can a track contain multiple skeletons? Can a skeleton be assigned to multiple tracks (this makes less sense to me)?

If you are fine with the prototype above (question 1) I can move forward and implement it quickly so you can see it working on code. All the other questions are of less immediate importance I think.

P.S. A pandas dependency is already implied by the use of ndx_pose.

talmo · 2022-08-16T18:26:49Z

talmo
Aug 16, 2022
Maintainer

Hi @h-mayorquin,

This sounds great! Thanks for the help. Some answers below.

> So, design questions that I have:
>
> 1. I think that the prototype method can easily be used to fulfill point 1 (I have already tested it with some data), but I wanted to discuss with you whether you had another way of providing similar functionality in mind. The reason is that I would rather help you implement your design vision than use my ad-hoc method (I could not find anything equivalent in the library).

The idea with this library is to not need another intermediate representation and instead work directly off of the data model.

For example, why not just build up the pose series directly instead of coercing them into a dataframe first?

Numpy arrays may be a reasonable intermediate for things that strictly need to be series, i.e., for inference data. (See rly/ndx-pose#9 regarding training data.)

> 2. Where should the functionality to write to NWB be located? Is the idea to create another module under the io directory, called ndx_pose or nwb, that could be used for this, or do you have other ideas for the organization?

sleap_io/io/nwb.py sounds great!

> 3. Concerning testing, I am using the example that you provide in your main repo (the centered_pair_predictions.slp file), as it is more complex. However, this file is not available in this repo. Could we add it so we can have automated testing with more complex files? The one currently available seems to have only two tracks and one labeled frame, so some parts of the code would be harder to test.

Yes, we can add some more fixtures that are more complex. We've been doing it on an as-needed basis to prevent the repo from getting too bloated.

> 4. I have a question about your data model. I don't understand the set-theoretical relationship between Track and Skeleton (and maybe that's why the data in the data frame above might look confused). An instance object contains both a skeleton and a track. However, is their relationship 1-to-1? Can a track contain multiple skeletons? Can a skeleton be assigned to multiple tracks (this makes less sense to me)?

A Skeleton just defines a set of nodes corresponding to landmark types (+ connections). It does not contain any positional data.

Instances have a skeleton associated with them to map positions to unique landmark types. An Instance can only have one Skeleton, but different Instances can have different Skeletons.

A Track describes a (semi-)unique identity that associates Instances across frames. It can be co-opted to describe a unique class like "female" or "black_mouse" across multiple videos, but typically refers to the same physical animal (e.g., "animal1") within the same video.

An Instance must have a Skeleton set, but does not need a Track. This is to support the case where we go to a random frame and cannot identify each animal uniquely, but can annotate their poses. Forcing Instances to be contained within a Track is highly inflexible and intractably increases the annotation burden for users in many use cases where it is non-trivial to identify animals uniquely in a random-access fashion.

For final inference results not used in labeling, we will typically want every Instance to have a Track assignment. (Though this is not strictly necessary in some cases.)

Tracks and Skeletons have no relationship to each other.
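The relationships described above can be sketched with a few plain dataclasses. This is only an illustration under stated assumptions: the class and field names below are hypothetical stand-ins, not the actual sleap-io model classes, and the `points` layout is invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Skeleton:
    """Landmark types plus connections; holds no positional data."""

    nodes: tuple[str, ...]
    edges: tuple[tuple[str, str], ...] = ()


@dataclass(frozen=True)
class Track:
    """A (semi-)unique identity that links instances across frames."""

    name: str


@dataclass
class Instance:
    """One pose in one frame (hypothetical layout)."""

    points: dict[str, tuple[float, float]]  # node name -> (x, y)
    skeleton: Skeleton  # required: every instance has exactly one skeleton
    track: Optional[Track] = None  # optional: an instance may be untracked

    def __post_init__(self) -> None:
        # Positions only make sense for nodes defined by the skeleton.
        unknown = set(self.points) - set(self.skeleton.nodes)
        if unknown:
            raise ValueError(f"Points for nodes not in skeleton: {sorted(unknown)}")


fly = Skeleton(nodes=("head", "thorax"), edges=(("head", "thorax"),))
animal1 = Track("animal1")
tracked = Instance({"head": (10.0, 4.5)}, skeleton=fly, track=animal1)
untracked = Instance({"thorax": (3.0, 7.0)}, skeleton=fly)
```

Note that `Track` and `Skeleton` never reference each other here; they are only brought together by an `Instance`.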

> If you are fine with the prototype above (question 1) I can move forward and implement it quickly so you can see it working on code. All the other questions are of less immediate importance I think.

Maybe it's best that you just try it out and send a PR. It looks like it's almost done, so feel free to finish it off with whatever approach is easiest and we'll iterate from there.

> P.S. A pandas dependency is already implied by the use of ndx_pose.

Pandas is no problem, thanks!

2 replies

h-mayorquin Aug 17, 2022
Author

> For example, why not just build up the pose series directly instead of coercing them into a dataframe first?
>
> Numpy arrays may be a reasonable intermediate for things that strictly need to be series, i.e., for inference data. (See rly/ndx-pose#9 regarding training data.)

To me the fundamental problem is that in this model the data is distributed across the LabeledFrame / Instances objects which contain their references to Video, Skeleton and Track. However, for writing to nwb I want the reverse association, that is, I want all the landmark points for a specific node, skeleton, track, video.

For a specific example consider writing a specific PoseEstimationSeries for a particular node. Here, I want all the data associated with that node, in that track, and in that video. If I do that for every PoseEstimationSeries then that implies multiple unnecessary passes through the same instance (to check for skeleton, track, video) which seems rather inefficient to me. So, my first instinct was to do only one pass through the most numerous object (the instances of all the labeled frames) and then organize everything hierarchically for writing. I saw that you are doing the same in Labels.numpy() in the main repo but over there you reference by position instead of by label which I think is harder to maintain and prone to error.

So I was already at the point of using a hierarchy of dictionaries containing numpy arrays, which is essentially the data model of a pandas DataFrame, so I used a DataFrame to take a look at the extracted data and as a temporary solution. Then I reorganized the data frame so it can be accessed with a dict-like notation (moving the index into columns).

So that's the reasoning behind this choice and how I got here. What do you think? Maybe I am missing something?
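The single-pass idea can be sketched as follows. This is a minimal illustration, not the prototype from the gist: it assumes the instances have already been flattened into one record per (frame, instance, node), and the record keys (`video`, `track`, `node`, `frame_idx`, `x`, `y`) are hypothetical names chosen for the example.

```python
from collections import defaultdict


def group_points(records):
    """Group (frame_idx, x, y) by video -> track -> node in a single pass."""
    grouped = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for rec in records:
        # Each record is visited exactly once and filed under its labels,
        # so no later lookup has to rescan the instances.
        grouped[rec["video"]][rec["track"]][rec["node"]].append(
            (rec["frame_idx"], rec["x"], rec["y"])
        )
    return grouped


records = [
    {"video": "v.mp4", "track": "track_0", "node": "head", "frame_idx": 0, "x": 1.0, "y": 2.0},
    {"video": "v.mp4", "track": "track_0", "node": "head", "frame_idx": 1, "x": 1.5, "y": 2.5},
    {"video": "v.mp4", "track": "track_1", "node": "head", "frame_idx": 0, "x": 9.0, "y": 8.0},
]
grouped = group_points(records)
head_of_track_0 = grouped["v.mp4"]["track_0"]["head"]
```

Because the grouping is keyed by label rather than by array position, each per-node series can be looked up by name when it is time to write it.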

talmo Aug 18, 2022
Maintainer

Hi @h-mayorquin,

This makes sense and I think it should work fine to build it out from the dataframe for PoseEstimationSeries creation.

h-mayorquin · 2022-08-17T10:40:01Z

h-mayorquin
Aug 17, 2022
Author

@talmo
Concerning the data model: thanks for the explanation. My confusion is the following. To keep this simple, let's consider the case of an slp file with a single video where all instances are tracked. For writing to NWB, the main repo (sleap) builds a PoseEstimation from all the instances that share a common track object. That is, every PoseEstimation object is associated with a specific track. Now, as you mentioned, every instance has a Skeleton as well, which means that in this example every instance is associated with both a Track and a Skeleton object. I am wondering whether all the cases illustrated in the following diagram can occur:

It seems to me that the only way that the current organization of the nwbfile that you have in the main repo works is if only A can occur. In fact, your current implementation assumes that there is only one skeleton in the whole labels object (so all the instances share the same skeleton):
https://github.com/talmolab/sleap/blob/73e3db02a98d741f3efc9f3acdd12b17d4d88ba7/sleap/io/format/ndx_pose.py#L262

Do your files usually work with only one skeleton? Can examples B or C in the diagram occur?

If so, what should we do? The PoseEstimation container naturally maps to one skeleton object, as it takes edges and nodes as arguments. For C, it appears to me that the intent of your function was to build a PoseEstimation container for each track, which makes sense to me. I am less sure about the B case, which does not make a lot of sense to me (why have two skeletons mapping to a single animal, for example?).

Related is the issue in:
rly/ndx-pose#3

As a reminder for myself, the organization that we have in the nwb file is:

For every video we have a processing module.
We attach a PoseEstimation container to the processing module for every track.
Every PoseEstimation object contains a list of PoseEstimationSeries, each corresponding to the trajectory of one skeleton node.
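The organization listed above can be sketched as a nested mapping. This uses plain dictionaries as a stand-in for the real NWB objects, and the `module_` and `pose_` name prefixes are hypothetical, chosen only to make the three levels visible.

```python
def plan_nwb_layout(series_keys):
    """Map (video, track, node) keys to module -> container -> series names."""
    layout = {}
    for video, track, node in series_keys:
        # One processing module per video.
        containers = layout.setdefault(f"module_{video}", {})
        # One pose estimation container per track, one series per node.
        containers.setdefault(f"pose_{track}", []).append(node)
    return layout


layout = plan_nwb_layout(
    [
        ("video_0", "track_0", "head"),
        ("video_0", "track_0", "thorax"),
        ("video_0", "track_1", "head"),
    ]
)
```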

1 reply

talmo Aug 18, 2022
Maintainer

Hi @h-mayorquin,

Both A and C can occur. In fact, A is trivial as it essentially only allows for single animal tracks.

C is the most common scenario. For example, you'll have two mice (assigned track I and track II) with the same skeleton, but with Instances in multiple frames.

Scenario B is not expected and we shouldn't plan to support it. We have plans to support multi-skeleton scenarios, but not for a while, and regardless, multiple skeletons would not be associated with the same track.
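A writer could guard against the unsupported case with a small check. The sketch below assumes that scenario B means a single track whose instances use more than one skeleton, and it represents instances as hypothetical `(track_name, skeleton_name)` pairs rather than real model objects.

```python
def skeletons_per_track(instances):
    """Return {track: set of skeletons} from (track, skeleton) pairs."""
    seen = {}
    for track, skeleton in instances:
        if track is None:  # untracked instances are allowed and skipped
            continue
        seen.setdefault(track, set()).add(skeleton)
    return seen


def check_supported(instances):
    """Raise if any track is associated with more than one skeleton."""
    offending = {
        track: sorted(skeletons)
        for track, skeletons in skeletons_per_track(instances).items()
        if len(skeletons) > 1
    }
    if offending:
        raise ValueError(f"Tracks with multiple skeletons: {offending}")


# Two tracks sharing one skeleton (the common case) passes silently.
check_supported([("I", "mouse"), ("II", "mouse"), (None, "mouse")])
```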

h-mayorquin · 2022-09-12T13:13:22Z

h-mayorquin
Sep 12, 2022
Author

Hi, I am adding a second set of improvements related to metadata propagation (see #15). For this, it would also be good to have a file that is up to date on its provenance (see #14) so we can discuss how best to add the available information. The design idea I have is that by default sleap-io would extract as much information from the labels object as possible, but a dictionary can be passed to update those default values, if wanted, or to fill in missing information.
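The defaults-plus-overrides idea can be sketched in a few lines. This is an illustration only: `build_metadata` and the metadata keys are hypothetical and do not reflect the interface in #15.

```python
def build_metadata(extracted, overrides=None):
    """Start from extracted defaults, then apply user-supplied overrides."""
    merged = dict(extracted)  # copy so the extracted defaults stay untouched
    merged.update(overrides or {})  # user values win and can add missing keys
    return merged


# Hypothetical values pulled from a labels object, plus user corrections.
extracted = {"source_software": "SLEAP", "session_description": "unknown"}
metadata = build_metadata(
    extracted,
    {"session_description": "two flies", "experimenter": "someone"},
)
```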

I would really like to have an up-to-date dataset that you consider a good example for provenance information. I am also wondering about the example dataset that you have in the repo right now (centered_pair_predictions.slp). Why does it have so many tracks? When I looked at the video it seemed to have only two flies, correct? That led me to think it should have only two tracks, but in fact it has 27. I guess this is simulating multiple post-inference tracking?

Another small question: what is the reference frame for your trajectories? That is, I think you get x and y positions for the coordinates of the nodes (in pixels), but which corner are they referenced to? Bottom-left (0, 0), top-left (1, 0)? Which one?

2 replies

talmo Sep 13, 2022
Maintainer

> Hi, I am adding a second set of improvements related to metadata propagation (see #15). For this, it would also be good to have a file that is up to date on its provenance (see #14) so we can discuss how best to add the available information. The design idea I have is that by default sleap-io would extract as much information from the labels object as possible, but a dictionary can be passed to update those default values, if wanted, or to fill in missing information.

Thanks for putting together #14!

Sure thing, definitely open to something like that. What types of metadata should we be looking to extract specifically?

For reference, this is where we fill in the provenance in our labels: sleap/nn/inference.py#L4659-L4715

> I would really like to have an up-to-date dataset that you consider a good example for provenance information. I am also wondering about the example dataset that you have in the repo right now (centered_pair_predictions.slp). Why does it have so many tracks? When I looked at the video it seemed to have only two flies, correct? That led me to think it should have only two tracks, but in fact it has 27. I guess this is simulating multiple post-inference tracking?

This has a bunch of extraneous tracks -- all tracks past the first two have no instances associated with them. We should probably clean that up.
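Finding such extraneous tracks could look like the sketch below. It assumes tracks can be compared by name and that the track of every instance has been collected into a list; both inputs are hypothetical simplifications of the real labels object.

```python
def unused_tracks(tracks, instance_tracks):
    """Return the tracks that no instance refers to, in their original order."""
    used = set(instance_tracks)
    return [track for track in tracks if track not in used]


all_tracks = ["track_0", "track_1", "track_2", "track_3"]
tracks_on_instances = ["track_0", "track_1", "track_0", None]
extraneous = unused_tracks(all_tracks, tracks_on_instances)
```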

> Another small question: what is the reference frame for your trajectories? That is, I think you get x and y positions for the coordinates of the nodes (in pixels), but which corner are they referenced to? Bottom-left (0, 0), top-left (1, 0)? Which one?

The coordinates are in (x, y) relative to the top-left of the image.

The convention is that the coordinates refer to the midpoint of the pixel. This means that the midpoint of the top-left pixel is at (0, 0), and the top-left corner of that same pixel is at (-0.5, -0.5).

Adapting from here, for a single row image of shape (1, 4) with values: [[a, b, c, d]], the coordinates can be visualized in the diagram below:

                                            Y:
      ┌───────┬───────┬───────┬───────┐  ← -0.5
      │   a   │   b   │   c   │   d   │  ←  0
      └───────┴───────┴───────┴───────┘  ←  0.5
      ↑   ↑   ↑   ↑   ↑   ↑   ↑   ↑   ↑
X:  -0.5  0  0.5  1  1.5  2  2.5  3  3.5
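The convention in the diagram can be expressed with two small helpers. This is a sketch with hypothetical function names; assigning a coordinate that falls exactly on a pixel boundary to the pixel on its right (or below) is a choice made here, not something the convention above specifies.

```python
import math


def pixel_index(coord: float) -> int:
    """Index of the pixel containing coord when pixel centers are integers."""
    # Pixel k covers [k - 0.5, k + 0.5), so shift by half a pixel and floor.
    return math.floor(coord + 0.5)


def to_corner_origin(x: float, y: float) -> tuple[float, float]:
    """Shift (x, y) so the top-left corner of the image becomes (0, 0)."""
    return x + 0.5, y + 0.5


row = ["a", "b", "c", "d"]
value_at_x = row[pixel_index(2.2)]  # x = 2.2 falls inside pixel "c"
```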

h-mayorquin Sep 14, 2022
Author

> The coordinates are in (x, y) relative to the top-left of the image.

Thanks, I added this information now on #17.

Concerning the provenance data, I was checking, and I think the latest fixture I added in #17 has all the data that we could possibly require and more. The only thing I feel is really missing is video-related information. None of the fixtures have data for the videos. For all of them the shape and backend fields are empty (even the ones I generated with the latest version):

labels.videos
[Video(filename='/home/heberto/Downloads/20190128_113421.mp4', shape=None, backend=None)]

While the backend seems to be present in the videos_json field of the slp file when I inspected it, the shape is not available there. I guess it is just not added at creation time (would it be added on inference?). Where is this done?

Along those lines, something I think would be very useful is to store the timestamps of the video in the file. This is important because then the frames can be synchronized with data from other modalities using the timestamps (provided we know when each recording started). Moreover, with shape and timestamps, the trajectories become more useful information in the absence of the actual video (as we know how things happen in time and space and not only form).

I will do a small PR to add the backend information, which does not seem to be extracted at the moment.

h-mayorquin · 2022-09-14T08:27:40Z

h-mayorquin
Sep 14, 2022
Author

I added a small PR to extract the backend information as it is, a dict: #18.
Is that what you intended for this attribute of the video object? The docstring seems unclear to me:

backend: An object that implements the basic methods for reading and manipulating frames of a specific video type.

0 replies
