Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for embedding images in .pkg.slp #91

Merged
merged 30 commits into from
Jun 5, 2024
Merged

Conversation

talmo
Copy link
Contributor

@talmo talmo commented May 5, 2024

Highlights

  • Saving SLP files with embedded images will re-save the embedded images.
  • Embed images into SLP files with: labels.save("labels.pkg.slp", embed="user")

Summary by CodeRabbit

  • New Features

    • Introduced functions for video manipulation and embedding in SLEAP labels files.
    • Added properties to LabeledFrame for checking specific instance types.
    • Enabled embedding frames in saved files with the save_slp function.
  • Enhancements

    • Updated Video class with new attributes and methods for better metadata handling.
    • Improved image shape handling in HDF5Video class.
    • Added user_labeled_frames property to return frames with user instances.
  • Bug Fixes

    • Fixed issues with video object creation and frame embedding.
  • Documentation

    • Enhanced CONTRIBUTING.md with live coverage setup steps.
  • Chores

    • Updated .gitignore to include lcov.info.
    • Adjusted pyproject.toml for dependencies and testing configurations.

Copy link
Contributor

coderabbitai bot commented May 5, 2024

Warning

Rate limit exceeded

@talmo has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 44 minutes and 3 seconds before requesting another review.

How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

Commits

Files that changed from the base of the PR and between 5eee083 and d2a1415.

Walkthrough

The updates introduce significant improvements and enhancements across multiple files in the sleap_io project, focusing on better video handling, embedding capabilities, and more robust testing. Key changes include direct video object usage, new video manipulation functions, enhanced properties for labeled frames, and improved test coverage. Additionally, the project setup and contributing guidelines have been refined to facilitate better development practices and live coverage setup.

Changes

File(s) Change Summary
sleap_io/io/jabs.py Updated LabeledFrame creation to pass a Video object directly.
sleap_io/io/slp.py Added video manipulation functions, updated read_videos, and enhanced write_labels for embedding.
sleap_io/io/video.py Enhanced HDF5Video class with image format handling and updated properties and methods.
sleap_io/model/video.py Modified Video class with new attributes and methods for metadata handling and initialization.
sleap_io/model/labeled_frame.py Added properties to check for user and predicted instances; updated class definition.
sleap_io/model/labels.py Added user_labeled_frames property to return frames with user instances.
sleap_io/io/main.py Updated save_slp function to include an embed parameter for frame embedding options.
tests/io/test_main.py Added assertions for video length comparisons in tests.
tests/io/test_slp.py Added new test functions and enhanced video comparison logic.
tests/model/test_labeled_frame.py Added assertions and new test methods for LabeledFrame instances.
tests/model/test_video.py Added assertions and modified tests for video object states and backend types.
.gitignore Added lcov.info to ignore list.
CONTRIBUTING.md Added steps for live coverage setup and detailed development guidelines.
pyproject.toml Updated author details, dependencies, and testing configurations.

Sequence Diagram(s) (Beta)

sequenceDiagram
    participant User
    participant SLEAP
    participant Video
    participant LabeledFrame

    User->>SLEAP: Call read_labels()
    SLEAP->>Video: Create Video object
    SLEAP->>LabeledFrame: Pass Video object
    LabeledFrame->>User: Return LabeledFrame instance

    User->>SLEAP: Call save_slp(embed="user")
    SLEAP->>Video: Embed user frames
    Video->>SLEAP: Return embedded frames
    SLEAP->>User: Save file with embedded frames
Loading

Poem

In code we weave a tale so grand,
With videos held in steady hand.
Frames now embed with ease and grace,
Tests ensure no bugs we face.
Developers cheer, their paths made clear,
For in this code, there's naught to fear.
🐇✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@talmo talmo linked an issue May 5, 2024 that may be closed by this pull request
Copy link

codecov bot commented May 5, 2024

Codecov Report

Attention: Patch coverage is 90.57592% with 18 lines in your changes missing coverage. Please review.

Project coverage is 95.78%. Comparing base (4af9f38) to head (d2a1415).

Files Patch % Lines
sleap_io/io/slp.py 88.00% 15 Missing ⚠️
sleap_io/model/video.py 90.47% 2 Missing ⚠️
sleap_io/io/video.py 92.85% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #91      +/-   ##
==========================================
+ Coverage   95.71%   95.78%   +0.06%     
==========================================
  Files          15       17       +2     
  Lines        1704     1874     +170     
==========================================
+ Hits         1631     1795     +164     
- Misses         73       79       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@talmo talmo marked this pull request as ready for review May 5, 2024 06:10
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Out of diff range and nitpick comments (1)
sleap_io/io/slp.py (1)

30-31: Consider using a more specific import statement for imageio.v3.

Using import imageio.v3 as iio imports all submodules and functions under imageio.v3, which might not be necessary. Consider importing only the required functions to improve clarity and potentially reduce memory usage.

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits Files that changed from the base of the PR and between e553986 and 31341da.
Files selected for processing (4)
  • sleap_io/io/slp.py (4 hunks)
  • sleap_io/io/video.py (9 hunks)
  • sleap_io/model/video.py (7 hunks)
  • tests/io/test_slp.py (2 hunks)
Additional Context Used
Ruff (18)
sleap_io/io/video.py (4)

22-22: imageio_ffmpeg imported but unused; consider using importlib.util.find_spec to test for availability


27-27: av imported but unused; consider using importlib.util.find_spec to test for availability


87-87: Do not compare types, use isinstance()


90-90: Do not compare types, use isinstance()

sleap_io/model/video.py (2)

9-9: Redefinition of unused Optional from line 9


102-102: Do not use bare except

tests/io/test_slp.py (12)

6-6: sleap_io.Edge imported but unused


9-9: sleap_io.LabeledFrame imported but unused


10-10: sleap_io.Track imported but unused


11-11: sleap_io.Point imported but unused


12-12: sleap_io.PredictedPoint imported but unused


24-24: sleap_io.io.slp.serialize_skeletons imported but unused


28-28: Redefinition of unused read_instances from line 21


28-28: sleap_io.io.slp.read_instances imported but unused


33-33: sleap_io.io.utils.read_hdf5_dataset imported but unused


35-35: simplejson imported but unused


56-56: Do not compare types, use isinstance()


190-190: Do not compare types, use isinstance()

GitHub Check Runs (1)
codecov/patch failure (13)

sleap_io/io/slp.py: [warning] 35-36: sleap_io/io/slp.py#L35-L36
Added lines #L35 - L36 were not covered by tests


sleap_io/io/slp.py: [warning] 73-73: sleap_io/io/slp.py#L73
Added line #L73 was not covered by tests


sleap_io/io/slp.py: [warning] 105-106: sleap_io/io/slp.py#L105-L106
Added lines #L105 - L106 were not covered by tests


sleap_io/io/slp.py: [warning] 145-145: sleap_io/io/slp.py#L145
Added line #L145 was not covered by tests


sleap_io/io/slp.py: [warning] 221-221: sleap_io/io/slp.py#L221
Added line #L221 was not covered by tests


sleap_io/io/slp.py: [warning] 228-228: sleap_io/io/slp.py#L228
Added line #L228 was not covered by tests


sleap_io/io/slp.py: [warning] 240-240: sleap_io/io/slp.py#L240
Added line #L240 was not covered by tests


sleap_io/io/slp.py: [warning] 254-254: sleap_io/io/slp.py#L254
Added line #L254 was not covered by tests


sleap_io/io/slp.py: [warning] 259-260: sleap_io/io/slp.py#L259-L260
Added lines #L259 - L260 were not covered by tests


sleap_io/io/slp.py: [warning] 279-279: sleap_io/io/slp.py#L279
Added line #L279 was not covered by tests


sleap_io/io/video.py: [warning] 606-606: sleap_io/io/video.py#L606
Added line #L606 was not covered by tests


sleap_io/model/video.py: [warning] 223-223: sleap_io/model/video.py#L223
Added line #L223 was not covered by tests


sleap_io/model/video.py: [warning] 225-225: sleap_io/model/video.py#L225
Added line #L225 was not covered by tests

Additional comments not posted (8)
tests/io/test_slp.py (2)

104-109: The addition of the compare_videos function enhances the robustness of video comparison in tests.


110-123: Ensure the compare_videos function is correctly integrated and used in all relevant test cases.

Verification successful

The compare_videos function is correctly integrated and used in all relevant test cases within the test_slp.py file. The function is both defined and called in multiple test scenarios as expected.

  • tests/io/test_slp.py: Definition and multiple usages of compare_videos are confirmed.
Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the usage of `compare_videos` function in test cases.

# Test: Search for the function usage. Expect: Occurrences in all relevant test cases.
rg --type python $'compare_videos'

Length of output: 64



Script:

#!/bin/bash
# Description: Verify the usage of `compare_videos` function in test cases.

# Test: Search for the function usage. Expect: Occurrences in all relevant test cases.
rg --type py 'compare_videos'

Length of output: 290

sleap_io/model/video.py (2)

Line range hint 15-41: The refactoring of the Video class with new attributes enhances its functionality and maintainability.


Line range hint 52-82: Ensure the new source_video attribute is correctly integrated and used in the from_filename method.

sleap_io/io/video.py (3)

Line range hint 482-494: The addition of the image_format attribute in the HDF5Video class supports enhanced handling of embedded images.


Line range hint 586-606: Ensure the decode_embedded method correctly handles the decoding of embedded images using the new image_format attribute.

Verification successful

The verification process has confirmed that the decode_embedded method is indeed being used within the sleap_io/io/video.py file. The method is called directly and also within a list comprehension for processing multiple images. This indicates active usage of the method in the context of handling embedded images.

However, the original review comment was concerned with ensuring that the method correctly handles the decoding of embedded images using the new image_format attribute. The provided code snippet shows that the method uses cv2.imdecode when OpenCV is available, and falls back to imageio.imread otherwise, with the format being dynamically determined by the image_format attribute.

Given this setup, the method appears to be designed to handle different image formats dynamically, which aligns with the requirements mentioned in the review comment. Therefore, it seems that the method is implemented correctly concerning the handling of different image formats based on the image_format attribute.

Conclusion:
The decode_embedded method in sleap_io/io/video.py is implemented to dynamically handle different image formats based on the image_format attribute, and it is actively used within the same file. This aligns with the expectations set in the review comment.

  • sleap_io/io/video.py: The method is used directly and in a list comprehension for processing multiple images.
Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the correct handling of embedded images in the `decode_embedded` method.

# Test: Search for the method usage. Expect: Correct handling of different image formats.
rg --type python $'decode_embedded'

Length of output: 65



Script:

#!/bin/bash
# Description: Verify the correct handling of embedded images in the `decode_embedded` method.

# Test: Search for the method usage. Expect: Correct handling of different image formats.
rg --type py 'decode_embedded'

Length of output: 267


163-164: The img_shape property in the MediaVideo class correctly calculates the shape of a single frame.

sleap_io/io/slp.py (1)

Line range hint 115-132: Ensure comprehensive test coverage for read_videos.

The read_videos function has been modified and it's crucial to ensure that all new logic paths are covered by unit tests to prevent regressions.

sleap_io/io/slp.py Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits Files that changed from the base of the PR and between 31341da and 696ae60.
Files selected for processing (1)
  • sleap_io/io/slp.py (4 hunks)
Additional comments not posted (3)
sleap_io/io/slp.py (3)

Line range hint 115-132: LGTM! The function read_videos correctly processes video datasets from a SLEAP labels file.


306-349: LGTM! The function embed_frames correctly handles the embedding of frames and updates the Videos and Labels objects appropriately.


Line range hint 349-377: LGTM! The function write_videos correctly handles the checking for already embedded videos and writes video metadata appropriately.

Comment on lines 192 to 303

if image_format == "hdf5":
img_data = frame
else:
if "cv2" in sys.modules:
img_data = np.squeeze(
cv2.imencode("." + image_format, frame)[1]
).astype("int8")
else:
img_data = np.frombuffer(
iio.imwrite(
"<bytes>", frame.squeeze(axis=-1), extension="." + image_format
),
dtype="int8",
)

imgs_data.append(img_data)

# Write the image data to the labels file.
with h5py.File(labels_path, "a") as f:
if image_format == "hdf5":
f.create_dataset(
f"{group}/video", data=imgs_data, compression="gzip", chunks=True
)
else:
if fixed_length:
ds = f.create_dataset(
f"{group}/video",
shape=(len(imgs_data), max(len(img) for img in imgs_data)),
dtype="int8",
compression="gzip",
)
for i, img in enumerate(imgs_data):
ds[i, : len(img)] = img
else:
# TODO (TP): Expand capabilities of path resolution to support more
# complex path finding strategies.
pass

video_path = video_path.as_posix()

if "filenames" in backend:
# This is an ImageVideo.
# TODO: Path resolution.
video_path = backend["filenames"]

try:
backend = VideoBackend.from_filename(
video_path,
dataset=backend.get("dataset", None),
grayscale=backend.get("grayscale", None),
input_format=backend.get("input_format", None),
ds = f.create_dataset(
f"{group}/video",
shape=(len(imgs_data),),
dtype=h5py.special_dtype(vlen=np.dtype("int8")),
)
for i, img in enumerate(imgs_data):
ds[i] = img

# Store metadata.
ds.attrs["format"] = image_format
(
ds.attrs["frames"],
ds.attrs["height"],
ds.attrs["width"],
ds.attrs["channels"],
) = video.shape

# Store frame indices.
f.create_dataset(f"{group}/frame_numbers", data=frame_inds)

# Store source video.
if video.source_video is not None:
# If this is already an embedded dataset, retain the previous source video.
source_video = video.source_video
embedded_video = video
else:
source_video = video
embedded_video = Video(
filename=labels_path,
backend=VideoBackend.from_filename(
labels_path,
dataset=f"{group}/video",
grayscale=video.grayscale,
keep_open=False,
),
source_video=source_video,
)
except ValueError:
backend = None
video_objects.append(Video(filename=video_path, backend=backend))
return video_objects

grp = f.require_group(f"{group}/source_video")
grp.attrs["json"] = json.dumps(
video_to_dict(source_video), separators=(",", ":")
)

return embedded_video
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor the embed_video function to improve modularity and error handling.

Consider separating the concerns of image data loading, encoding, and writing to the file into different functions. Additionally, improve error handling for image format validation and HDF5 operations to make the function more robust.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits Files that changed from the base of the PR and between 696ae60 and fd9bae9.
Files selected for processing (1)
  • sleap_io/io/slp.py (4 hunks)

Comment on lines 192 to 302
frame = video[frame_idx]

if image_format == "hdf5":
img_data = frame
else:
if "cv2" in sys.modules:
img_data = np.squeeze(
cv2.imencode("." + image_format, frame)[1]
).astype("int8")
else:
# TODO (TP): Expand capabilities of path resolution to support more
# complex path finding strategies.
pass

video_path = video_path.as_posix()

if "filenames" in backend:
# This is an ImageVideo.
# TODO: Path resolution.
video_path = backend["filenames"]

try:
backend = VideoBackend.from_filename(
video_path,
dataset=backend.get("dataset", None),
grayscale=backend.get("grayscale", None),
input_format=backend.get("input_format", None),
img_data = np.frombuffer(
iio.imwrite(
"<bytes>", frame.squeeze(axis=-1), extension="." + image_format
),
dtype="int8",
)

imgs_data.append(img_data)

# Write the image data to the labels file.
with h5py.File(labels_path, "a") as f:
if image_format == "hdf5":
f.create_dataset(
f"{group}/video", data=imgs_data, compression="gzip", chunks=True
)
except ValueError:
backend = None
video_objects.append(Video(filename=video_path, backend=backend))
return video_objects
else:
if fixed_length:
ds = f.create_dataset(
f"{group}/video",
shape=(len(imgs_data), max(len(img) for img in imgs_data)),
dtype="int8",
compression="gzip",
)
for i, img in enumerate(imgs_data):
ds[i, : len(img)] = img
else:
ds = f.create_dataset(
f"{group}/video",
shape=(len(imgs_data),),
dtype=h5py.special_dtype(vlen=np.dtype("int8")),
)
for i, img in enumerate(imgs_data):
ds[i] = img

# Store metadata.
ds.attrs["format"] = image_format
(
ds.attrs["frames"],
ds.attrs["height"],
ds.attrs["width"],
ds.attrs["channels"],
) = video.shape

# Store frame indices.
f.create_dataset(f"{group}/frame_numbers", data=frame_inds)

# Store source video.
if video.source_video is not None:
# If this is already an embedded dataset, retain the previous source video.
source_video = video.source_video
embedded_video = video
else:
source_video = video
embedded_video = Video(
filename=labels_path,
backend=VideoBackend.from_filename(
labels_path,
dataset=f"{group}/video",
grayscale=video.grayscale,
keep_open=False,
),
source_video=source_video,
)

grp = f.require_group(f"{group}/source_video")
grp.attrs["json"] = json.dumps(
video_to_dict(source_video), separators=(",", ":")
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor embed_video function to improve modularity and error handling.

Consider separating the concerns of image data loading, encoding, and writing to the file into different functions. Additionally, improve error handling for image format validation and HDF5 operations to make the function more robust.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits Files that changed from the base of the PR and between fd9bae9 and b41aa38.
Files selected for processing (1)
  • sleap_io/io/slp.py (5 hunks)
Additional Context Used
GitHub Check Runs (1)
codecov/patch failure (22)

sleap_io/io/slp.py: [warning] 35-36: sleap_io/io/slp.py#L35-L36
Added lines #L35 - L36 were not covered by tests


sleap_io/io/slp.py: [warning] 73-73: sleap_io/io/slp.py#L73
Added line #L73 was not covered by tests


sleap_io/io/slp.py: [warning] 105-106: sleap_io/io/slp.py#L105-L106
Added lines #L105 - L106 were not covered by tests


sleap_io/io/slp.py: [warning] 145-145: sleap_io/io/slp.py#L145
Added line #L145 was not covered by tests


sleap_io/io/slp.py: [warning] 227-227: sleap_io/io/slp.py#L227
Added line #L227 was not covered by tests


sleap_io/io/slp.py: [warning] 234-234: sleap_io/io/slp.py#L234
Added line #L234 was not covered by tests


sleap_io/io/slp.py: [warning] 246-246: sleap_io/io/slp.py#L246
Added line #L246 was not covered by tests


sleap_io/io/slp.py: [warning] 260-260: sleap_io/io/slp.py#L260
Added line #L260 was not covered by tests


sleap_io/io/slp.py: [warning] 265-266: sleap_io/io/slp.py#L265-L266
Added lines #L265 - L266 were not covered by tests


sleap_io/io/slp.py: [warning] 286-287: sleap_io/io/slp.py#L286-L287
Added lines #L286 - L287 were not covered by tests


sleap_io/io/slp.py: [warning] 326-330: sleap_io/io/slp.py#L326-L330
Added lines #L326 - L330 were not covered by tests


sleap_io/io/slp.py: [warning] 332-335: sleap_io/io/slp.py#L332-L335
Added lines #L332 - L335 were not covered by tests


sleap_io/io/slp.py: [warning] 343-344: sleap_io/io/slp.py#L343-L344
Added lines #L343 - L344 were not covered by tests


sleap_io/io/slp.py: [warning] 347-349: sleap_io/io/slp.py#L347-L349
Added lines #L347 - L349 were not covered by tests


sleap_io/io/slp.py: [warning] 364-372: sleap_io/io/slp.py#L364-L372
Added lines #L364 - L372 were not covered by tests


sleap_io/io/slp.py: [warning] 374-374: sleap_io/io/slp.py#L374
Added line #L374 was not covered by tests


sleap_io/io/slp.py: [warning] 376-376: sleap_io/io/slp.py#L376
Added line #L376 was not covered by tests


sleap_io/io/slp.py: [warning] 394-395: sleap_io/io/slp.py#L394-L395
Added lines #L394 - L395 were not covered by tests


sleap_io/io/slp.py: [warning] 1026-1026: sleap_io/io/slp.py#L1026
Added line #L1026 was not covered by tests

Comment on lines 192 to 303

if image_format == "hdf5":
img_data = frame
else:
if "cv2" in sys.modules:
img_data = np.squeeze(
cv2.imencode("." + image_format, frame)[1]
).astype("int8")
else:
img_data = np.frombuffer(
iio.imwrite(
"<bytes>", frame.squeeze(axis=-1), extension="." + image_format
),
dtype="int8",
)

imgs_data.append(img_data)

# Write the image data to the labels file.
with h5py.File(labels_path, "a") as f:
if image_format == "hdf5":
f.create_dataset(
f"{group}/video", data=imgs_data, compression="gzip", chunks=True
)
else:
if fixed_length:
ds = f.create_dataset(
f"{group}/video",
shape=(len(imgs_data), max(len(img) for img in imgs_data)),
dtype="int8",
compression="gzip",
)
for i, img in enumerate(imgs_data):
ds[i, : len(img)] = img
else:
# TODO (TP): Expand capabilities of path resolution to support more
# complex path finding strategies.
pass

video_path = video_path.as_posix()

if "filenames" in backend:
# This is an ImageVideo.
# TODO: Path resolution.
video_path = backend["filenames"]

try:
backend = VideoBackend.from_filename(
video_path,
dataset=backend.get("dataset", None),
grayscale=backend.get("grayscale", None),
input_format=backend.get("input_format", None),
ds = f.create_dataset(
f"{group}/video",
shape=(len(imgs_data),),
dtype=h5py.special_dtype(vlen=np.dtype("int8")),
)
for i, img in enumerate(imgs_data):
ds[i] = img

# Store metadata.
ds.attrs["format"] = image_format
(
ds.attrs["frames"],
ds.attrs["height"],
ds.attrs["width"],
ds.attrs["channels"],
) = video.shape

# Store frame indices.
f.create_dataset(f"{group}/frame_numbers", data=frame_inds)

# Store source video.
if video.source_video is not None:
# If this is already an embedded dataset, retain the previous source video.
source_video = video.source_video
embedded_video = video
else:
source_video = video
embedded_video = Video(
filename=labels_path,
backend=VideoBackend.from_filename(
labels_path,
dataset=f"{group}/video",
grayscale=video.grayscale,
keep_open=False,
),
source_video=source_video,
)
except ValueError:
backend = None
video_objects.append(Video(filename=video_path, backend=backend))
return video_objects

grp = f.require_group(f"{group}/source_video")
grp.attrs["json"] = json.dumps(
video_to_dict(source_video), separators=(",", ":")
)

return embedded_video
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor the embed_video function to improve modularity and error handling.

The embed_video function is complex and handles multiple tasks. Consider refactoring it to separate concerns, such as image data loading, encoding, and writing to the file into different functions. Additionally, improve error handling for image format validation and HDF5 operations to make the function more robust.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits Files that changed from the base of the PR and between b41aa38 and b71646f.
Files selected for processing (8)
  • sleap_io/io/jabs.py (2 hunks)
  • sleap_io/io/slp.py (5 hunks)
  • sleap_io/io/video.py (9 hunks)
  • sleap_io/model/labeled_frame.py (2 hunks)
  • sleap_io/model/labels.py (1 hunks)
  • sleap_io/model/video.py (7 hunks)
  • tests/io/test_main.py (2 hunks)
  • tests/io/test_slp.py (3 hunks)
Files not reviewed due to errors (3)
  • sleap_io/model/video.py (no review received)
  • tests/io/test_slp.py (no review received)
  • sleap_io/model/labels.py (no review received)
Additional Context Used
Ruff (22)
sleap_io/io/video.py (4)

22-22: imageio_ffmpeg imported but unused; consider using importlib.util.find_spec to test for availability


27-27: av imported but unused; consider using importlib.util.find_spec to test for availability


87-87: Do not compare types, use isinstance()


90-90: Do not compare types, use isinstance()

sleap_io/model/labels.py (4)

17-17: sleap_io.Instance imported but unused


73-73: Do not compare types, use isinstance()


77-77: Do not compare types, use isinstance()


158-158: Do not compare types, use isinstance()

sleap_io/model/video.py (2)

9-9: Redefinition of unused Optional from line 9


107-107: Do not use bare except

tests/io/test_slp.py (12)

6-6: sleap_io.Edge imported but unused


9-9: sleap_io.LabeledFrame imported but unused


10-10: sleap_io.Track imported but unused


11-11: sleap_io.Point imported but unused


12-12: sleap_io.PredictedPoint imported but unused


24-24: sleap_io.io.slp.serialize_skeletons imported but unused


28-28: Redefinition of unused read_instances from line 21


28-28: sleap_io.io.slp.read_instances imported but unused


33-33: sleap_io.io.utils.read_hdf5_dataset imported but unused


35-35: simplejson imported but unused


57-57: Do not compare types, use isinstance()


191-191: Do not compare types, use isinstance()

GitHub Check Runs (1)
codecov/patch failure (15)

sleap_io/io/slp.py: [warning] 35-36: sleap_io/io/slp.py#L35-L36
Added lines #L35 - L36 were not covered by tests


sleap_io/io/slp.py: [warning] 73-73: sleap_io/io/slp.py#L73
Added line #L73 was not covered by tests


sleap_io/io/slp.py: [warning] 105-106: sleap_io/io/slp.py#L105-L106
Added lines #L105 - L106 were not covered by tests


sleap_io/io/slp.py: [warning] 145-145: sleap_io/io/slp.py#L145
Added line #L145 was not covered by tests


sleap_io/io/slp.py: [warning] 227-227: sleap_io/io/slp.py#L227
Added line #L227 was not covered by tests


sleap_io/io/slp.py: [warning] 234-234: sleap_io/io/slp.py#L234
Added line #L234 was not covered by tests


sleap_io/io/slp.py: [warning] 246-246: sleap_io/io/slp.py#L246
Added line #L246 was not covered by tests


sleap_io/io/slp.py: [warning] 260-260: sleap_io/io/slp.py#L260
Added line #L260 was not covered by tests


sleap_io/io/slp.py: [warning] 265-266: sleap_io/io/slp.py#L265-L266
Added lines #L265 - L266 were not covered by tests


sleap_io/io/slp.py: [warning] 377-378: sleap_io/io/slp.py#L377-L378
Added lines #L377 - L378 were not covered by tests


sleap_io/io/slp.py: [warning] 380-380: sleap_io/io/slp.py#L380
Added line #L380 was not covered by tests


sleap_io/io/video.py: [warning] 611-611: sleap_io/io/video.py#L611
Added line #L611 was not covered by tests


sleap_io/model/labeled_frame.py: [warning] 66-69: sleap_io/model/labeled_frame.py#L66-L69
Added lines #L66 - L69 were not covered by tests


sleap_io/model/video.py: [warning] 228-228: sleap_io/model/video.py#L228
Added line #L228 was not covered by tests


sleap_io/model/video.py: [warning] 230-230: sleap_io/model/video.py#L230
Added line #L230 was not covered by tests

Additional comments not posted (6)
tests/io/test_main.py (1)

Line range hint 58-70: The added assertions for video counts in test_jabs enhance the test coverage and ensure the integrity of video data during load/save operations.

sleap_io/model/labeled_frame.py (1)

50-57: The implementation of has_user_instances property is correct and efficiently checks for the presence of user-labeled instances.

sleap_io/io/jabs.py (1)

170-170: Direct use of video object in LabeledFrame instantiation simplifies the code.

Verification successful

Direct use of video object in LabeledFrame instantiation simplifies the code.

  • sleap_io/io/jabs.py:

    • Line 170: frame_label = LabeledFrame(video, frame_idx, instances)
  • sleap_io/io/nwb.py:

    • Line 140: LabeledFrame(video=video, frame_idx=frame_idx, instances=insts)
  • sleap_io/io/labelstudio.py:

    • Line 290: return LabeledFrame(video, frame_idx, instances)
  • tests/model/test_labels.py:

    • Lines 81, 127: LabeledFrame(video=labels.video, frame_idx=1)
Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify `LabeledFrame` instances are correctly created with the new approach.

# Test: Search for the function usage. Expect: Only occurances with a `Video` object.
rg --type python $'LabeledFrame'

Length of output: 62



Script:

#!/bin/bash
# Description: Verify `LabeledFrame` instances are correctly created with the new approach.

# Test: Search for the function usage. Expect: Only occurrences with a `Video` object.
rg 'LabeledFrame'

Length of output: 4200



Script:

#!/bin/bash
# Description: Extract relevant lines to verify `LabeledFrame` instances are correctly created with the `video` object.

# Extract lines from sleap_io/io/jabs.py
rg --line-number 'LabeledFrame' sleap_io/io/jabs.py

# Extract lines from sleap_io/io/nwb.py
rg --line-number 'LabeledFrame' sleap_io/io/nwb.py

# Extract lines from sleap_io/io/labelstudio.py
rg --line-number 'LabeledFrame' sleap_io/io/labelstudio.py

# Extract lines from tests/model/test_labels.py
rg --line-number 'LabeledFrame' tests/model/test_labels.py

Length of output: 1043

sleap_io/io/video.py (3)

163-164: Modification to img_shape to return three separate integers enhances clarity and usability.

Verification successful

The extracted snippets show that the img_shape property is used in several places within sleap_io/io/video.py. Each instance appears to handle the three integers (height, width, channels) correctly:

  • The img_shape property itself correctly returns three integers.
  • The shape property uses self.img_shape to form a tuple with num_frames.
  • The img_shape property in another class also correctly returns three integers after potentially reversing the order based on input_format.

These usages indicate that the changes to img_shape have been properly integrated.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify all usages of `img_shape` handle three separate integers.

# Test: Search for the property usage. Expect: Proper handling of three integers.
rg --type python $'img_shape'

Length of output: 59



Script:

#!/bin/bash
# Description: Verify all usages of `img_shape` handle three separate integers.

# Test: Search for the property usage. Expect: Proper handling of three integers.
rg --type py $'img_shape'

Length of output: 616



Script:

#!/bin/bash
# Description: Extract and review code snippets for each usage of `img_shape` to ensure proper handling of three integers.

# Extract lines around each usage of `img_shape` for context.
rg --type py --context 5 $'img_shape'

Length of output: 3139


482-483: Addition of image_format attribute in HDF5Video supports new embedding functionality.

Also applies to: 494-494

Verification successful

The image_format attribute is correctly used in image decoding and embedding.

  • sleap_io/io/video.py: Used in iio.imread(BytesIO(img_string), extension=f".{self.image_format}")
  • sleap_io/io/slp.py: Used in cv2.imencode("." + image_format, frame)[1]
Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify `image_format` is correctly used in image decoding.

# Test: Search for the attribute usage. Expect: Correct usage in image decoding.
rg --type python $'image_format'

Length of output: 62



Script:

#!/bin/bash
# Description: Verify `image_format` is correctly used in image decoding.

# Test: Search for the attribute usage. Expect: Correct usage in image decoding.
rg 'image_format'

Length of output: 1263


Line range hint 593-611: Use of self.image_format in decode_embedded ensures correct decoding of embedded images.

Comment on lines 192 to 304
if image_format == "hdf5":
img_data = frame
else:
if "cv2" in sys.modules:
img_data = np.squeeze(
cv2.imencode("." + image_format, frame)[1]
).astype("int8")
else:
img_data = np.frombuffer(
iio.imwrite(
"<bytes>", frame.squeeze(axis=-1), extension="." + image_format
),
dtype="int8",
)

imgs_data.append(img_data)

# Write the image data to the labels file.
with h5py.File(labels_path, "a") as f:
if image_format == "hdf5":
f.create_dataset(
f"{group}/video", data=imgs_data, compression="gzip", chunks=True
)
else:
if fixed_length:
ds = f.create_dataset(
f"{group}/video",
shape=(len(imgs_data), max(len(img) for img in imgs_data)),
dtype="int8",
compression="gzip",
)
for i, img in enumerate(imgs_data):
ds[i, : len(img)] = img
else:
# TODO (TP): Expand capabilities of path resolution to support more
# complex path finding strategies.
pass

video_path = video_path.as_posix()

if "filenames" in backend:
# This is an ImageVideo.
# TODO: Path resolution.
video_path = backend["filenames"]

try:
backend = VideoBackend.from_filename(
video_path,
dataset=backend.get("dataset", None),
grayscale=backend.get("grayscale", None),
input_format=backend.get("input_format", None),
ds = f.create_dataset(
f"{group}/video",
shape=(len(imgs_data),),
dtype=h5py.special_dtype(vlen=np.dtype("int8")),
)
for i, img in enumerate(imgs_data):
ds[i] = img

# Store metadata.
ds.attrs["format"] = image_format
(
ds.attrs["frames"],
ds.attrs["height"],
ds.attrs["width"],
ds.attrs["channels"],
) = video.shape

# Store frame indices.
f.create_dataset(f"{group}/frame_numbers", data=frame_inds)

# Store source video.
if video.source_video is not None:
# If this is already an embedded dataset, retain the previous source video.
source_video = video.source_video
embedded_video = video
video.replace_filename(labels_path, open=False)
else:
source_video = video
embedded_video = Video(
filename=labels_path,
backend=VideoBackend.from_filename(
labels_path,
dataset=f"{group}/video",
grayscale=video.grayscale,
keep_open=False,
),
source_video=source_video,
)
except ValueError:
backend = None
video_objects.append(Video(filename=video_path, backend=backend))
return video_objects

grp = f.require_group(f"{group}/source_video")
grp.attrs["json"] = json.dumps(
video_to_dict(source_video), separators=(",", ":")
)

return embedded_video
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor the embed_video function to improve modularity and error handling.

The embed_video function is complex and handles multiple tasks. Consider refactoring it to separate concerns, such as image data loading, encoding, and writing to the file into different functions. Additionally, improve error handling for image format validation and HDF5 operations to make the function more robust.

Comment on lines 377 to 380
elif isinstance(to_embed, list):
to_embed = to_embed
else:
raise ValueError(f"Invalid value for to_embed: {to_embed}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handle invalid to_embed values more robustly.

-    else:
+    else:
+        raise ValueError(f"Invalid value for to_embed: {to_embed}")

The current handling of invalid to_embed values is to silently pass, which could lead to unexpected behavior. It would be better to raise an exception to notify the caller of the issue.


Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
elif isinstance(to_embed, list):
to_embed = to_embed
else:
raise ValueError(f"Invalid value for to_embed: {to_embed}")
elif isinstance(to_embed, list):
to_embed = to_embed
else:
raise ValueError(f"Invalid value for to_embed: {to_embed}")
raise ValueError(f"Invalid value for to_embed: {to_embed}")

@@ -86,6 +86,7 @@ def read_labels(
frames: List[LabeledFrame] = []
# Video name is the pose file minus the suffix
video_name = re.sub(r"(_pose_est_v[2-6])?\.h5", ".avi", labels_path)
video = Video.from_filename(video_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tip

Codebase Verification

Several instances of read_labels are still passing filenames instead of Video objects. These need to be updated to align with the new implementation.

  • tests/io/test_slp.py

    • labels = read_labels(slp_typical)
    • labels = read_labels(slp_simple_skel)
    • labels = read_labels(slp_minimal)
    • labels = read_labels(slp_predictions_with_provenance)
    • labels = read_labels(slp_real_data)
    • labels = read_labels(centered_pair)
    • labels = read_labels(fn)
    • labels = read_labels(tmp_path / "test.slp")
    • labels = read_labels(slp_imgvideo)
    • labels = read_labels(tmpdir / "test.slp")
    • labels = read_labels(slp_minimal_pkg)
    • labels = read_labels(str(tmpdir / "roundtrip.pkg.slp"))
    • labels = read_labels(labels_path)
  • tests/io/test_labelstudio.py

    • labels = slp_read_labels(slp_typical)
    • labels = slp_read_labels(slp_simple_skel)
    • labels = slp_read_labels(slp_minimal)
    • ls_labels = read_labels(file_path, skeleton)
  • tests/io/test_jabs.py

    • slp_labels = read_labels(jabs_real_data_v5)
    • labels = read_labels(jabs_real_data_v5)
  • sleap_io/io/main.py

    • return slp.read_labels(filename)
    • return labelstudio.read_labels(filename, skeleton=skeleton)
    • return jabs.read_labels(filename, skeleton=skeleton)

Please update these instances to pass Video objects instead of filenames.

Analysis chain

Use of existing Video object in read_labels enhances efficiency and code reuse.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify all function calls to `read_labels` pass a `Video` object.

# Test: Search for the function usage. Expect: Only occurances with a `Video` object.
rg --type python $'read_labels'

Length of output: 61



Script:

#!/bin/bash
# Description: Verify all function calls to `read_labels` pass a `Video` object.

# Test: Search for the function usage. Expect: Only occurrences with a `Video` object.
rg 'read_labels' --glob '*.py'

Length of output: 2566



Script:

#!/bin/bash
# Description: Extract lines around each `read_labels` call to verify the arguments being passed.

# Test: Extract context around `read_labels` calls.
rg 'read_labels' --glob '*.py' -C 3

Length of output: 11597

Comment on lines +63 to +70
@property
def has_predicted_instances(self) -> bool:
"""Return True if the frame has any predicted instances."""
for inst in self.instances:
if type(inst) == PredictedInstance:
return True
return False

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tip

Codebase Verification

Lack of Test Coverage and Usage for has_predicted_instances

The property has_predicted_instances is defined in sleap_io/model/labeled_frame.py but is not covered by any tests and is not used anywhere else in the codebase.

  • Consider adding test cases to ensure the functionality of this property.
  • Review the necessity of this property if it is not being used.
Analysis chain

The implementation of has_predicted_instances property is correct. However, consider adding test coverage for this property to ensure its functionality under various conditions.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify test coverage for `has_predicted_instances`.

# Test: Search for tests covering the property. Expect: At least one test case.
rg --type python $'has_predicted_instances' tests/

Length of output: 80



Script:

#!/bin/bash
# Description: Verify test coverage for `has_predicted_instances`.

# Test: Search for tests covering the property. Expect: At least one test case.
rg --type py 'has_predicted_instances' tests/

Length of output: 45



Script:

#!/bin/bash
# Description: Verify usage of `has_predicted_instances` in the codebase.

# Search for the usage of `has_predicted_instances` in the entire codebase.
rg 'has_predicted_instances'

Length of output: 107

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits Files that changed from the base of the PR and between b71646f and 4e5734b.
Files selected for processing (3)
  • .gitignore (1 hunks)
  • CONTRIBUTING.md (1 hunks)
  • pyproject.toml (5 hunks)
Files skipped from review due to trivial changes (1)
  • .gitignore
Additional Context Used
LanguageTool (6)
CONTRIBUTING.md (6)

Near line 48: The phrase ‘feel free to’ is used quite frequently. Consider using a less frequent alternative to set your writing apart from others and make it sound more professional.
Context: ...in` with a description of your changes. Feel free to create a "Draft" pull request to work o...


Near line 65: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...s need to be introduced (or if versions need to be fenced), specify these in [`setup.cf...


Near line 79: Possible typo: you repeated a word
Context: ...vironment.yml](environment.yml). ### Testing Testing is done via [pytest`](https://docs.pyt...


Near line 109: Use a comma before ‘so’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...to set this up in your local environment so your code is auto-formatted before push...


Near line 146: When ‘88-character’ is used as a modifier, it is usually spelled with a hyphen.
Context: ... - The first line should fit within the 88 character limit, be on the same line as the initi...


Near line 146: Unpaired symbol: ‘"’ seems to be missing
Context: ...mit, be on the same line as the initial """, and should use imperative tense (e.g....

Additional comments not posted (7)
pyproject.toml (6)

10-10: Added new author "Talmo Pereira" to the project metadata.

This addition is consistent with the PR's user information, ensuring that all contributors are acknowledged.


21-21: Updated Python version support to include 3.12.

This change ensures compatibility with the latest Python version, which is a good practice for maintaining the package's relevance and usability.


32-32: Added "av" to the list of dependencies.

This addition is likely related to the new functionalities around video handling as mentioned in the AI-generated summary. It's important to ensure that all necessary dependencies are declared.


43-43: Added "pytest-watch" to the development dependencies.

This tool will enhance the development experience by allowing live testing, which is beneficial for a project with a focus on testing and continuous integration.


53-53: Added "mkdocs-literate-nav" to the development dependencies.

This addition likely improves the navigation and usability of the project's documentation, aligning with best practices for open-source projects.


56-56: Updated project homepage URL.

The update to the homepage URL ensures that users and contributors have the correct link to more information about the project.

CONTRIBUTING.md (1)

96-104: Added a new section on live coverage.

This addition provides detailed steps for setting up live test coverage, which is a valuable resource for contributors aiming to maintain high code quality.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Outside diff range and nitpick comments (4)
tests/model/test_labels.py (1)

Line range hint 3-3: Remove unused imports to clean up the code.

- from numpy.testing import assert_equal
- from sleap_io import load_video

Also applies to: 13-13

tests/io/test_slp.py (2)

Line range hint 60-60: Replace type comparisons with isinstance() for better practice and consistency.

- assert type(provenance) == dict
+ assert isinstance(provenance, dict)
- assert type(saved_labels.video.backend) == type(labels.video.backend)
+ assert isinstance(saved_labels.video.backend, type(labels.video.backend))

Also applies to: 194-194

Tools
Ruff

36-36: sleap_io.io.utils.read_hdf5_dataset imported but unused (F401)


38-38: simplejson imported but unused (F401)


Line range hint 6-6: Remove unused imports to clean up the code.

- from sleap_io import Edge, LabeledFrame, Track, Point, PredictedPoint
- from sleap_io.io.slp import serialize_skeletons, read_instances
- from sleap_io.io.utils import read_hdf5_dataset
- import simplejson as json

Also applies to: 9-9, 10-10, 11-11, 12-12, 25-25, 29-29, 36-36, 38-38

Tools
Ruff

36-36: sleap_io.io.utils.read_hdf5_dataset imported but unused (F401)


38-38: simplejson imported but unused (F401)

sleap_io/model/labels.py (1)

Line range hint 17-17: Remove unused import and replace type comparisons with isinstance() for better practice and consistency.

- from sleap_io import Instance
- if type(key) == int:
+ if isinstance(key, int):
- if type(key) == slice:
+ if isinstance(key, slice):
- if type(inst) == PredictedInstance and inst.track is not None:
+ if isinstance(inst, PredictedInstance) and inst.track is not None:

Also applies to: 73-73, 77-77, 158-158

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 4e5734b and a76680a.

Files selected for processing (5)
  • sleap_io/io/main.py (1 hunks)
  • sleap_io/io/slp.py (4 hunks)
  • sleap_io/model/labels.py (2 hunks)
  • tests/io/test_slp.py (3 hunks)
  • tests/model/test_labels.py (1 hunks)
Additional context used
Ruff
tests/model/test_labels.py

3-3: numpy.testing.assert_equal imported but unused (F401)


13-13: sleap_io.load_video imported but unused (F401)

tests/io/test_slp.py

6-6: sleap_io.Edge imported but unused (F401)


9-9: sleap_io.LabeledFrame imported but unused (F401)


10-10: sleap_io.Track imported but unused (F401)


11-11: sleap_io.Point imported but unused (F401)


12-12: sleap_io.PredictedPoint imported but unused (F401)


25-25: sleap_io.io.slp.serialize_skeletons imported but unused (F401)


29-29: Redefinition of unused read_instances from line 22 (F811)


29-29: sleap_io.io.slp.read_instances imported but unused (F401)


36-36: sleap_io.io.utils.read_hdf5_dataset imported but unused (F401)


38-38: simplejson imported but unused (F401)


60-60: Do not compare types, use isinstance() (E721)


194-194: Do not compare types, use isinstance() (E721)

sleap_io/model/labels.py

17-17: sleap_io.Instance imported but unused (F401)


73-73: Do not compare types, use isinstance() (E721)


77-77: Do not compare types, use isinstance() (E721)


158-158: Do not compare types, use isinstance() (E721)

GitHub Check: codecov/patch
sleap_io/io/slp.py

[warning] 35-36: sleap_io/io/slp.py#L35-L36
Added lines #L35 - L36 were not covered by tests


[warning] 73-73: sleap_io/io/slp.py#L73
Added line #L73 was not covered by tests


[warning] 105-106: sleap_io/io/slp.py#L105-L106
Added lines #L105 - L106 were not covered by tests


[warning] 145-145: sleap_io/io/slp.py#L145
Added line #L145 was not covered by tests


[warning] 227-227: sleap_io/io/slp.py#L227
Added line #L227 was not covered by tests


[warning] 234-234: sleap_io/io/slp.py#L234
Added line #L234 was not covered by tests


[warning] 246-246: sleap_io/io/slp.py#L246
Added line #L246 was not covered by tests


[warning] 260-260: sleap_io/io/slp.py#L260
Added line #L260 was not covered by tests


[warning] 265-266: sleap_io/io/slp.py#L265-L266
Added lines #L265 - L266 were not covered by tests


[warning] 372-373: sleap_io/io/slp.py#L372-L373
Added lines #L372 - L373 were not covered by tests


[warning] 375-375: sleap_io/io/slp.py#L375
Added line #L375 was not covered by tests

Additional comments not posted (4)
sleap_io/io/main.py (1)

22-35: The addition of the embed parameter in save_slp function is well-implemented and aligns with the PR's objective to support image embedding.

Verification successful

The embed parameter is correctly passed through kwargs in the save_file function, ensuring that it reaches the save_slp function as intended. This confirms that the new embed parameter is handled properly in the codebase.

  • tests/io/test_main.py: The save_file function is called with various parameters, including embed.
  • sleap_io/model/labels.py: The save_file function is called with embed explicitly passed.
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify that the new `embed` parameter is handled correctly in all calls to `save_slp`.

# Test: Search for the function usage. Expect: Proper handling of the new parameter.
rg --type python $'save_slp'

Length of output: 58



Script:

#!/bin/bash
# Description: Verify that the new `embed` parameter is handled correctly in all calls to `save_slp`.

# Test: Search for the function usage. Expect: Proper handling of the new parameter.
rg --type py $'save_slp'

Length of output: 157



Script:

#!/bin/bash
# Description: Extract the relevant lines around the call to `save_slp` in `sleap_io/io/main.py` to verify the handling of the `embed` parameter.

# Extract lines around the call to `save_slp`
rg --type py -A 5 -B 5 $'save_slp(labels, filename, **kwargs)' sleap_io/io/main.py

Length of output: 83



Script:

#!/bin/bash
# Description: Extract the relevant lines around the call to `save_slp` in `sleap_io/io/main.py` to verify the handling of the `embed` parameter.

# Extract lines around the call to `save_slp`
rg --type py -C 10 $'save_slp' sleap_io/io/main.py

Length of output: 1493



Script:

#!/bin/bash
# Description: Verify if `embed` is being passed through `kwargs` in the calling context of `save_slp`.

# Search for the context where `save_slp` is called with `kwargs`
rg --type py -C 10 $'save_slp' sleap_io/io/main.py | rg -B 10 '**kwargs'

Length of output: 155



Script:

#!/bin/bash
# Description: Verify if `embed` is being passed through `kwargs` in the calling context of `save_slp`.

# Search for the context where `save_slp` is called with `kwargs`
rg --type py -C 10 $'save_slp' sleap_io/io/main.py | rg -B 10 'kwargs'

Length of output: 969



Script:

#!/bin/bash
# Description: Verify if `embed` is included in `kwargs` in the calling context of `save_slp`.

# Search for the definition and population of `kwargs` in the relevant file
rg --type py -C 10 'kwargs' sleap_io/io/main.py

Length of output: 3001



Script:

#!/bin/bash
# Description: Verify if `embed` is included in `kwargs` in the calling context of `save_file`.

# Search for the context where `save_file` is called
rg --type py -C 10 'save_file' 

Length of output: 6541

tests/model/test_labels.py (1)

263-274: The test_replace_videos function correctly tests the replacement of videos in labels, ensuring that video filenames are updated correctly.

tests/io/test_slp.py (1)

268-280: The test_pkg_roundtrip function correctly tests the roundtrip of saving and loading labels, ensuring that the video backend and other properties are preserved.

sleap_io/model/labels.py (1)

282-300: The save method in the Labels class has been correctly updated to support specifying frames to embed when saving labels, aligning with the PR's objective.

Comment on lines +75 to +77
# TODO (TP): Expand capabilities of path resolution to support more
# complex path finding strategies.
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expand path resolution capabilities.

Would you like me to help implement more complex path finding strategies or should I open a GitHub issue to track this enhancement?

Comment on lines +33 to +36
try:
import cv2
except ImportError:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handle ImportError for cv2 more gracefully.

-    pass
+    sys.exit("Failed to import cv2, which is required for image processing.")

The current handling of the ImportError for cv2 is to silently pass, which could lead to failures later in the code if cv2 functions are called. It would be better to handle this error more explicitly, either by logging an error message or by exiting the program, depending on the criticality of cv2 in the application.

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
try:
import cv2
except ImportError:
pass
try:
import cv2
except ImportError:
sys.exit("Failed to import cv2, which is required for image processing.")
Tools
GitHub Check: codecov/patch

[warning] 35-36: sleap_io/io/slp.py#L35-L36
Added lines #L35 - L36 were not covered by tests

Comment on lines +46 to +113
def make_video(
labels_path: str, video_json: dict, video_ind: int | None = None
) -> Video:
"""Create a `Video` object from a JSON dictionary.

Args:
labels_path: A string path to the SLEAP labels file.
video_json: A dictionary containing the video metadata.
video_ind: The index of the video in the labels file. This is used to try to
recover the source video for embedded videos. This is skipped if `None`.
"""
backend_metadata = video_json["backend"]
video_path = backend_metadata["filename"]

# Marker for embedded videos.
source_video = None
is_embedded = False
if video_path == ".":
video_path = labels_path
is_embedded = True

# Basic path resolution.
video_path = Path(video_path)
if not video_path.exists():
# Check for the same filename in the same directory as the labels file.
video_path_ = Path(labels_path).parent / video_path.name
if video_path_.exists():
video_path = video_path_
else:
# TODO (TP): Expand capabilities of path resolution to support more
# complex path finding strategies.
pass

# Convert video path to string.
video_path = video_path.as_posix()

if is_embedded:
# Try to recover the source video.
with h5py.File(labels_path, "r") as f:
if f"video{video_ind}" in f:
source_video_json = json.loads(
f[f"video{video_ind}/source_video"].attrs["json"]
)
source_video = make_video(
labels_path, source_video_json, video_ind=None
)

if "filenames" in backend_metadata:
# This is an ImageVideo.
# TODO: Path resolution.
video_path = backend_metadata["filenames"]

try:
backend = VideoBackend.from_filename(
video_path,
dataset=backend_metadata.get("dataset", None),
grayscale=backend_metadata.get("grayscale", None),
input_format=backend_metadata.get("input_format", None),
)
except ValueError:
backend = None

return Video(
filename=video_path,
backend=backend,
backend_metadata=backend_metadata,
source_video=source_video,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor make_video function to improve readability and error handling.

The make_video function is quite lengthy and handles multiple responsibilities, which makes it hard to read and maintain. Consider breaking it down into smaller, more focused functions. Additionally, the error handling when the video path does not exist is minimal. It might be beneficial to raise an exception or return a specific error message.

Tools
GitHub Check: codecov/patch

[warning] 73-73: sleap_io/io/slp.py#L73
Added line #L73 was not covered by tests


[warning] 105-106: sleap_io/io/slp.py#L105-L106
Added lines #L105 - L106 were not covered by tests

Comment on lines +192 to +305
img_data = frame
else:
if "cv2" in sys.modules:
img_data = np.squeeze(
cv2.imencode("." + image_format, frame)[1]
).astype("int8")
else:
# TODO (TP): Expand capabilities of path resolution to support more
# complex path finding strategies.
pass

video_path = video_path.as_posix()

if "filenames" in backend:
# This is an ImageVideo.
# TODO: Path resolution.
video_path = backend["filenames"]

try:
backend = VideoBackend.from_filename(
video_path,
dataset=backend.get("dataset", None),
grayscale=backend.get("grayscale", None),
input_format=backend.get("input_format", None),
img_data = np.frombuffer(
iio.imwrite(
"<bytes>", frame.squeeze(axis=-1), extension="." + image_format
),
dtype="int8",
)

imgs_data.append(img_data)

# Write the image data to the labels file.
with h5py.File(labels_path, "a") as f:
if image_format == "hdf5":
f.create_dataset(
f"{group}/video", data=imgs_data, compression="gzip", chunks=True
)
except ValueError:
backend = None
video_objects.append(Video(filename=video_path, backend=backend))
return video_objects
else:
if fixed_length:
ds = f.create_dataset(
f"{group}/video",
shape=(len(imgs_data), max(len(img) for img in imgs_data)),
dtype="int8",
compression="gzip",
)
for i, img in enumerate(imgs_data):
ds[i, : len(img)] = img
else:
ds = f.create_dataset(
f"{group}/video",
shape=(len(imgs_data),),
dtype=h5py.special_dtype(vlen=np.dtype("int8")),
)
for i, img in enumerate(imgs_data):
ds[i] = img

# Store metadata.
ds.attrs["format"] = image_format
(
ds.attrs["frames"],
ds.attrs["height"],
ds.attrs["width"],
ds.attrs["channels"],
) = video.shape

# Store frame indices.
f.create_dataset(f"{group}/frame_numbers", data=frame_inds)

# Store source video.
if video.source_video is not None:
# If this is already an embedded dataset, retain the previous source video.
source_video = video.source_video
embedded_video = video
video.replace_filename(labels_path, open=False)
else:
source_video = video
embedded_video = Video(
filename=labels_path,
backend=VideoBackend.from_filename(
labels_path,
dataset=f"{group}/video",
grayscale=video.grayscale,
keep_open=False,
),
source_video=source_video,
)

grp = f.require_group(f"{group}/source_video")
grp.attrs["json"] = json.dumps(
video_to_dict(source_video), separators=(",", ":")
)

return embedded_video

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor the embed_video function to improve modularity and error handling.

The embed_video function is complex and handles multiple tasks. Consider refactoring it to separate concerns, such as image data loading, encoding, and writing to the file into different functions. Additionally, improve error handling for image format validation and HDF5 operations to make the function more robust.

Tools
GitHub Check: codecov/patch

[warning] 227-227: sleap_io/io/slp.py#L227
Added line #L227 was not covered by tests


[warning] 234-234: sleap_io/io/slp.py#L234
Added line #L234 was not covered by tests


[warning] 246-246: sleap_io/io/slp.py#L246
Added line #L246 was not covered by tests


[warning] 260-260: sleap_io/io/slp.py#L260
Added line #L260 was not covered by tests


[warning] 265-266: sleap_io/io/slp.py#L265-L266
Added lines #L265 - L266 were not covered by tests

Comment on lines +69 to +77
if not video_path.exists():
# Check for the same filename in the same directory as the labels file.
video_path_ = Path(labels_path).parent / video_path.name
if video_path_.exists():
video_path = video_path_
else:
# TODO (TP): Expand capabilities of path resolution to support more
# complex path finding strategies.
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improve error handling for non-existent video paths.

-    pass
+    raise FileNotFoundError(f"Video path {video_path} does not exist.")

The current handling of non-existent video paths is to silently pass, which could lead to failures later in the code if video operations are attempted on a non-existent path. It would be better to raise an exception to notify the caller of the issue.

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
if not video_path.exists():
# Check for the same filename in the same directory as the labels file.
video_path_ = Path(labels_path).parent / video_path.name
if video_path_.exists():
video_path = video_path_
else:
# TODO (TP): Expand capabilities of path resolution to support more
# complex path finding strategies.
pass
if not video_path.exists():
# Check for the same filename in the same directory as the labels file.
video_path_ = Path(labels_path).parent / video_path.name
if video_path_.exists():
video_path = video_path_
else:
# TODO (TP): Expand capabilities of path resolution to support more
# complex path finding strategies.
raise FileNotFoundError(f"Video path {video_path} does not exist.")
Tools
GitHub Check: codecov/patch

[warning] 73-73: sleap_io/io/slp.py#L73
Added line #L73 was not covered by tests

Comment on lines +135 to +190
def video_to_dict(video: Video) -> dict:
"""Convert a `Video` object to a JSON-compatible dictionary.

Args:
video: A `Video` object to convert.

Returns:
A dictionary containing the video metadata.
"""
if video.backend is None:
return {"filename": video.filename, "backend": video.backend_metadata}

if type(video.backend) == MediaVideo:
return {
"filename": video.filename,
"backend": {
"type": "MediaVideo",
"shape": video.shape,
"filename": video.filename,
"grayscale": video.grayscale,
"bgr": True,
"dataset": "",
"input_format": "",
},
}

elif type(video.backend) == HDF5Video:
return {
"filename": video.filename,
"backend": {
"type": "HDF5Video",
"shape": video.shape,
"filename": (
"." if video.backend.has_embedded_images else video.filename
),
"dataset": video.backend.dataset,
"input_format": video.backend.input_format,
"convert_range": False,
"has_embedded_images": video.backend.has_embedded_images,
},
}

elif type(video.backend) == ImageVideo:
return {
"filename": video.filename,
"backend": {
"type": "ImageVideo",
"shape": video.shape,
"filename": video.backend.filename[0],
"filenames": video.backend.filename,
"dataset": video.backend_metadata.get("dataset", None),
"grayscale": video.grayscale,
"input_format": video.backend_metadata.get("input_format", None),
},
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimize the video_to_dict function by reducing redundancy.

-    if type(video.backend) == MediaVideo:
-        return {
-            "filename": video.filename,
-            "backend": {
-                "type": "MediaVideo",
-                "shape": video.shape,
-                "filename": video.filename,
-                "grayscale": video.grayscale,
-                "bgr": True,
-                "dataset": "",
-                "input_format": "",
-            },
-        }
+    backend_type = type(video.backend).__name__
+    return {
+        "filename": video.filename,
+        "backend": {
+            "type": backend_type,
+            "shape": video.shape,
+            "filename": video.filename,
+            "grayscale": video.grayscale,
+            "bgr": (backend_type == "MediaVideo"),
+            "dataset": video.backend_metadata.get("dataset", ""),
+            "input_format": video.backend_metadata.get("input_format", ""),
+        },
+    }

The video_to_dict function contains repetitive code for different types of backends. This can be optimized by using a more generic approach that reduces redundancy and improves maintainability.

Committable suggestion was skipped due to low confidence.

Tools
GitHub Check: codecov/patch

[warning] 145-145: sleap_io/io/slp.py#L145
Added line #L145 was not covered by tests

elif isinstance(embed, list):
embed = embed
else:
raise ValueError(f"Invalid value for embed: {embed}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handle invalid to_embed values more robustly.

-    else:
+    else:
+        raise ValueError(f"Invalid value for to_embed: {embed}")

The current handling of invalid to_embed values is to silently pass, which could lead to unexpected behavior. It would be better to raise an exception to notify the caller of the issue.

Committable suggestion was skipped due to low confidence.

Tools
GitHub Check: codecov/patch

[warning] 375-375: sleap_io/io/slp.py#L375
Added line #L375 was not covered by tests

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between a76680a and 5eee083.

Files selected for processing (4)
  • sleap_io/model/video.py (6 hunks)
  • tests/io/test_slp.py (3 hunks)
  • tests/model/test_labeled_frame.py (5 hunks)
  • tests/model/test_video.py (2 hunks)
Additional context used
Ruff
sleap_io/model/video.py

9-9: Redefinition of unused Optional from line 9 (F811)


112-112: Do not use bare except (E722)

tests/io/test_slp.py

6-6: sleap_io.Edge imported but unused (F401)


9-9: sleap_io.LabeledFrame imported but unused (F401)


10-10: sleap_io.Track imported but unused (F401)


11-11: sleap_io.Point imported but unused (F401)


12-12: sleap_io.PredictedPoint imported but unused (F401)


25-25: sleap_io.io.slp.serialize_skeletons imported but unused (F401)


29-29: Redefinition of unused read_instances from line 22 (F811)


29-29: sleap_io.io.slp.read_instances imported but unused (F401)


36-36: sleap_io.io.utils.read_hdf5_dataset imported but unused (F401)


38-38: simplejson imported but unused (F401)


60-60: Do not compare types, use isinstance() (E721)


194-194: Do not compare types, use isinstance() (E721)

Additional comments not posted (14)
tests/model/test_video.py (4)

23-23: Ensure the length assertion is consistent with expected video properties.


61-62: Confirm that the video is correctly opened and the backend is set as expected.


61-62: Check that the video is properly opened and the backend type is correctly identified.


61-62: Verify that the video backend is correctly set after replacing the filename.

tests/model/test_labeled_frame.py (3)

31-33: Confirm that the properties has_predicted_instances and has_user_instances are correctly evaluated.


Line range hint 49-60: Ensure that the properties has_predicted_instances and has_user_instances reflect the state after predictions are removed.


88-90: Check that the image from the LabeledFrame matches the first frame of the video.

sleap_io/model/video.py (4)

15-15: Ensure that instances of the Video class are not considered equal based on their attributes.


29-34: Confirm that the new attributes backend_metadata and source_video are correctly initialized and used.


Line range hint 62-78: Check that the source_video parameter is correctly handled in the from_filename method.


50-54: Verify that the video is opened if it exists during the post-initialization phase.

tests/io/test_slp.py (3)

108-122: Ensure that the compare_videos function correctly compares the properties of reference and test videos.


286-315: Verify that the embedding functionality works as expected with different parameters.


317-341: Check that embedding in two rounds maintains the correct video properties and embedded frame indices.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Save .pkg.slp files
1 participant