-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Safer video loading from SLP #119
Conversation
Warning Rate limit exceeded@talmo has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 11 minutes and 42 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. WalkthroughThe changes introduce additional parameters to several functions in the SLEAP library, specifically for managing video backend operations. The Changes
Possibly related PRs
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
0b07a40
to
f314fe6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (3)
sleap_io/io/main.py (2)
10-22
: LGTM! Consider a minor docstring improvement.The changes to the
load_slp
function look good and align well with the PR objectives. The newopen_videos
parameter provides the desired flexibility for safer video loading.Consider adding a brief example in the docstring to illustrate how to use the
open_videos
parameter:def load_slp(filename: str, open_videos: bool = True) -> Labels: """Load a SLEAP dataset. Args: filename: Path to a SLEAP labels file (`.slp`). open_videos: If `True` (the default), attempt to open the video backend for I/O. If `False`, the backend will not be opened (useful for reading metadata when the video files are not available). Returns: The dataset as a `Labels` object. Example: # Load dataset without opening video backend labels = load_slp("path/to/dataset.slp", open_videos=False) """ return slp.read_labels(filename, open_videos=open_videos)This addition would provide users with a clear example of how to use the new parameter.
Line range hint
168-203
: Updateload_file
function to supportopen_videos
parameterThe
load_file
function should be updated to support the newopen_videos
parameter when loading SLP files. This ensures consistency and provides the same functionality when using this generic loading function.Please update the
load_file
function as follows:def load_file( filename: str | Path, format: Optional[str] = None, **kwargs ) -> Union[Labels, Video]: """Load a file and return the appropriate object. Args: filename: Path to a file. format: Optional format to load as. If not provided, will be inferred from the file extension. Available formats are: "slp", "nwb", "labelstudio", "jabs" and "video". + open_videos: If loading an SLP file, controls whether to open the video backend. + Defaults to True. See `load_slp` for more details. Returns: A `Labels` or `Video` object. """ if isinstance(filename, Path): filename = filename.as_posix() if format is None: if filename.endswith(".slp"): format = "slp" elif filename.endswith(".nwb"): format = "nwb" elif filename.endswith(".json"): format = "json" elif filename.endswith(".h5"): format = "jabs" else: for vid_ext in Video.EXTS: if filename.endswith(vid_ext): format = "video" break if format is None: raise ValueError(f"Could not infer format from filename: '{filename}'.") if filename.endswith(".slp"): - return load_slp(filename, **kwargs) + open_videos = kwargs.pop('open_videos', True) + return load_slp(filename, open_videos=open_videos, **kwargs) elif filename.endswith(".nwb"): return load_nwb(filename, **kwargs) elif filename.endswith(".json"): return load_labelstudio(filename, **kwargs) elif filename.endswith(".h5"): return load_jabs(filename, **kwargs) elif format == "video": return load_video(filename, **kwargs)This change ensures that the
open_videos
parameter is properly handled when loading SLP files through theload_file
function, maintaining consistency with the newload_slp
functionality.sleap_io/io/slp.py (1)
1022-1029
: Consider aligning parameter names for consistencyIn the
read_labels
function, the parameter is namedopen_videos
, whereas inread_videos
andmake_video
, it's namedopen_backend
. For consistency and clarity, consider using the same parameter name across these functions.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- sleap_io/io/main.py (1 hunks)
- sleap_io/io/slp.py (6 hunks)
🧰 Additional context used
🪛 Ruff
sleap_io/io/slp.py
120-120: Do not use bare
except
(E722)
sleap_io/io/slp.py
Outdated
def read_videos(labels_path: str, open_backend: bool = False) -> list[Video]: | ||
"""Read `Video` dataset in a SLEAP labels file. | ||
|
||
Args: | ||
labels_path: A string path to the SLEAP labels file. | ||
open_backend: If `True` (the default), attempt to open the video backend for | ||
I/O. If `False`, the backend will not be opened (useful for reading metadata | ||
when the video files are not available). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct the docstring for open_backend
default value
In the read_videos
function, the parameter open_backend
defaults to False
, but the docstring states that it defaults to True
. Please update the docstring to reflect the correct default value.
Apply this diff to fix the docstring:
Args:
labels_path: A string path to the SLEAP labels file.
- open_backend: If `True` (the default), attempt to open the video backend for
- I/O. If `False`, the backend will not be opened (useful for reading metadata
- when the video files are not available).
+ open_backend: If `True`, attempt to open the video backend for
+ I/O. If `False` (the default), the backend will not be opened (useful for reading metadata
+ when the video files are not available).
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def read_videos(labels_path: str, open_backend: bool = False) -> list[Video]: | |
"""Read `Video` dataset in a SLEAP labels file. | |
Args: | |
labels_path: A string path to the SLEAP labels file. | |
open_backend: If `True` (the default), attempt to open the video backend for | |
I/O. If `False`, the backend will not be opened (useful for reading metadata | |
when the video files are not available). | |
def read_videos(labels_path: str, open_backend: bool = False) -> list[Video]: | |
"""Read `Video` dataset in a SLEAP labels file. | |
Args: | |
labels_path: A string path to the SLEAP labels file. | |
open_backend: If `True`, attempt to open the video backend for | |
I/O. If `False` (the default), the backend will not be opened (useful for reading metadata | |
when the video files are not available). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
sleap_io/io/slp.py (1)
1042-1055
: LGTM with a minor suggestion: Consider renaming parameter for consistencyThe changes to the
read_labels
function are well-implemented and consistent with the updates in theread_videos
function. The newopen_videos
parameter is properly documented and correctly passed toread_videos
, enhancing the flexibility of video loading as per the PR objectives.However, for better consistency across the module, consider renaming the parameter from
open_videos
toopen_backend
to match the naming used in other functions likeread_videos
andmake_video
.Here's a suggested change:
-def read_labels(labels_path: str, open_videos: bool = True) -> Labels: +def read_labels(labels_path: str, open_backend: bool = True) -> Labels: """Read a SLEAP labels file. Args: labels_path: A string path to the SLEAP labels file. - open_videos: If `True` (the default), attempt to open the video backend for + open_backend: If `True` (the default), attempt to open the video backend for I/O. If `False`, the backend will not be opened (useful for reading metadata when the video files are not available). Returns: The processed `Labels` object. """ tracks = read_tracks(labels_path) - videos = read_videos(labels_path, open_backend=open_videos) + videos = read_videos(labels_path, open_backend=open_backend)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- sleap_io/io/slp.py (9 hunks)
- sleap_io/model/video.py (5 hunks)
- tests/io/test_slp.py (1 hunks)
🧰 Additional context used
🪛 Ruff
sleap_io/io/slp.py
187-187: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
tests/io/test_slp.py
361-361: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
🔇 Additional comments (12)
sleap_io/model/video.py (7)
37-41
: LGTM: Clear documentation for the newopen_backend
parameter.The added documentation for the
open_backend
parameter is clear and concise. It effectively explains both the default behavior and when to set it toFalse
, aligning well with the PR objectives of safer video loading.
55-55
: LGTM:open_backend
parameter correctly added.The
open_backend
parameter is correctly added to the class with a default value ofTrue
. This implementation matches the documentation and PR objectives while maintaining backward compatibility.
61-62
: LGTM:__attrs_post_init__
method correctly updated.The
__attrs_post_init__
method has been appropriately updated to checkself.open_backend
before attempting to open the backend. This change aligns with the newopen_backend
functionality and prevents automatic opening of the backend whenopen_backend
isFalse
.
190-196
: LGTM:__getitem__
method correctly updated with clear error handling.The
__getitem__
method has been appropriately updated to checkself.open_backend
before attempting to open the backend. The addedValueError
with clear instructions for the user is a good practice whenopen_backend
isFalse
and the backend is not open.
223-232
: LGTM:open
method signature and documentation updated correctly.The
open
method has been appropriately updated with a newfilename
parameter. The added documentation clearly explains the purpose of this new parameter, enhancing the flexibility of the video loading functionality as per the PR objectives.
249-251
: LGTM: New logic inopen
method correctly handles thefilename
parameter.The added logic in the
open
method correctly handles the newfilename
parameter by callingself.replace_filename
when a new filename is provided. The use ofopen=False
in thereplace_filename
call is appropriate, as it prevents immediate opening and allows the rest of theopen
method to handle the opening process.
Line range hint
1-300
: Overall assessment: Excellent implementation of safer and more flexible video loading.The changes introduced in this PR successfully implement the
open_backend
parameter and enhance the video loading functionality of theVideo
class. The modifications are consistent, well-documented, and align perfectly with the PR objectives. These changes provide users with more control over when and how video backends are opened, which is particularly useful for scenarios involving metadata reading from potentially unavailable video files.Key improvements:
- Addition of the
open_backend
parameter with clear documentation.- Updated
__attrs_post_init__
and__getitem__
methods to respect theopen_backend
setting.- Enhanced
open
method with a newfilename
parameter for increased flexibility.These changes collectively contribute to safer video loading practices and offer a more robust user experience when working with video files in the SLEAP library.
sleap_io/io/slp.py (5)
46-60
: LGTM: Well-implemented filename sanitization functionThe
sanitize_filename
function is a good addition that ensures consistent POSIX-compatible format for filenames. It handles both single filenames and lists of filenames, which adds flexibility to its usage.
Line range hint
62-146
: Excellent improvements to video handling flexibility and safetyThe changes to the
make_video
function significantly enhance its flexibility and safety:
- The new
open_backend
parameter allows users to control whether the video backend is opened, aligning with the PR objectives.- The function now handles cases where video files might not be available, improving robustness.
- The use of
sanitize_filename
ensures consistent path handling across different platforms.- Error handling has been improved by catching specific exceptions.
These changes make the function more versatile and safer to use in various scenarios.
Line range hint
150-169
: LGTM: Consistent implementation ofopen_backend
parameterThe changes to the
read_videos
function are well-implemented and consistent with the updates in themake_video
function. The newopen_backend
parameter is properly documented and correctly passed tomake_video
, enhancing the flexibility of video loading as per the PR objectives.
Line range hint
173-229
: LGTM: Improved filename consistency in video metadataThe changes to the
video_to_dict
function enhance the consistency of filename handling in video metadata. The use ofsanitize_filename
ensures that all filenames are converted to a canonical POSIX-compatible format, which improves cross-platform compatibility and robustness of the serialized video metadata.🧰 Tools
🪛 Ruff
187-187: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
Line range hint
1-1130
: Excellent improvements to video loading safety and flexibilityThe changes in this file significantly enhance the safety and flexibility of video loading in the SLEAP library:
- The new
sanitize_filename
function ensures consistent, cross-platform filename handling.- The introduction of the
open_backend
parameter in various functions allows users to control whether the video backend is opened, which is particularly useful when only metadata is needed.- Error handling has been improved, making the code more robust.
- The changes are well-documented and consistently implemented across different functions.
These improvements align perfectly with the PR objectives and provide users with more control over video loading behavior. The code quality is high, and the new functionality is added without breaking existing behavior.
Great job on these enhancements!
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #119 +/- ##
==========================================
+ Coverage 96.05% 96.13% +0.08%
==========================================
Files 15 15
Lines 2000 2020 +20
==========================================
+ Hits 1921 1942 +21
+ Misses 79 78 -1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 11
🧹 Outside diff range and nitpick comments (10)
docs/index.md (1)
140-146
: LGTM! The updated comments improve clarity.The changes to the comments enhance the documentation's clarity:
- The updated comment for
replace_filenames()
more accurately describes the function's purpose.- The new comment about saving labels with updated paths provides useful context.
These improvements align well with good documentation practices.
For consistency, consider adding a comment before the
labels.replace_filenames()
call to explain its purpose, similar to the comments before the load and save operations. For example:# Replace video file paths with updated locations labels.replace_filenames(prefix_map={ "D:/data/sleap_projects": "/home/user/sleap_projects", "C:/Users/sleaper/Desktop/test": "/home/user/sleap_projects", })sleap_io/model/video.py (4)
38-42
: LGTM: Newopen_backend
parameter enhances flexibility.The addition of the
open_backend
parameter allows users to control whether the backend should be automatically opened, which aligns with the PR objectives. This change provides more flexibility in handling video backends, especially when the video file might not exist.Consider adding a brief example in the docstring to illustrate how to use
open_backend=False
:""" Example: # Create a Video object without opening the backend video = Video(filename="path/to/video.mp4", open_backend=False) # Manually open the backend when needed video.open() """
191-197
: LGTM: Improved error handling in__getitem__
method.The new error handling in the
__getitem__
method respects theopen_backend
flag and provides clear instructions to the user when the backend is not open. This change enhances the user experience and aligns with the PR objectives.Consider slightly rewording the error message for clarity:
raise ValueError( "Video backend is not open. Either call video.open() manually " "or set video.open_backend to True for automatic opening on frame read." )
201-218
: LGTM: Enhanced file accessibility check inexists
method.The
exists
method now usesis_file_accessible
, which improves safety by ensuring that the file can be read, not just that it exists. This change aligns with the PR objectives of enhancing safety in video loading.Consider optimizing the loop for checking multiple files using
all()
:if isinstance(self.filename, list): if check_all: return all(is_file_accessible(f) for f in self.filename) else: return is_file_accessible(self.filename[0]) return is_file_accessible(self.filename)This change would make the code more concise and potentially more efficient for large lists of files.
🧰 Tools
🪛 Ruff
212-215: Use
return all(is_file_accessible(f) for f in self.filename)
instead offor
loopReplace with
return all(is_file_accessible(f) for f in self.filename)
(SIM110)
227-227
: LGTM: Enhanced flexibility inopen
method with newfilename
parameter.The addition of the
filename
parameter to theopen
method allows for changing the filename when opening the video, providing more flexibility in handling video files. This change aligns with the PR objectives of improving video loading functionality.Consider updating the method's docstring to include information about the new
filename
parameter:def open( self, filename: Optional[str] = None, dataset: Optional[str] = None, grayscale: Optional[str] = None, keep_open: bool = True, ): """Open the video backend for reading. Args: filename: Optional new filename to set before opening. If provided, this will update the video's filename before opening the backend. dataset: Name of dataset in HDF5 file. grayscale: Whether to force grayscale. If None, autodetect on first frame load. keep_open: Whether to keep the video reader open between calls to read frames. If False, will close the reader after each call. If True (the default), it will keep the reader open and cache it for subsequent calls which may enhance the performance of reading multiple frames. ... """Also applies to: 253-255
tests/io/test_slp.py (1)
368-399
: Comprehensive test for video path resolutionThe
test_video_path_resolution
function effectively covers various scenarios for video path resolution, including:
- Resolving when the video is in the same directory as the labels file.
- Handling inaccessible video files.
The test is well-structured and uses appropriate file operations to simulate real-world scenarios.
To improve readability, consider adding comments before each test scenario to clearly separate and explain the different cases being tested.
Here's a suggested improvement for better readability:
def test_video_path_resolution(slp_real_data, tmp_path): # Initial setup labels = read_labels(slp_real_data) assert ( Path(labels.video.filename).as_posix() == "tests/data/videos/centered_pair_low_quality.mp4" ) shutil.copyfile(labels.video.filename, tmp_path / "centered_pair_low_quality.mp4") labels.video.replace_filename( "fake/path/to/centered_pair_low_quality.mp4", open=False ) labels.save(tmp_path / "labels.slp") # Scenario 1: Resolve when the same video filename is found in the labels directory labels = read_labels(tmp_path / "labels.slp") assert ( Path(labels.video.filename).as_posix() == (tmp_path / "centered_pair_low_quality.mp4").as_posix() ) assert labels.video.exists() # Scenario 2: Fail to resolve when the video file is inaccessible labels.video.replace_filename("new_fake/path/to/inaccessible.mp4", open=False) labels.save(tmp_path / "labels2.slp") shutil.copyfile( tmp_path / "centered_pair_low_quality.mp4", tmp_path / "inaccessible.mp4" ) Path(tmp_path / "inaccessible.mp4").chmod(0o000) labels = read_labels(tmp_path / "labels2.slp") assert not labels.video.exists() assert Path(labels.video.filename).as_posix() == "new_fake/path/to/inaccessible.mp4"These comments help to clearly delineate the different scenarios being tested, making the test function more readable and easier to understand.
sleap_io/io/slp.py (1)
Line range hint
59-143
: LGTM: Improved flexibility and error handling inmake_video
The changes to the
make_video
function are well-implemented. The newopen_backend
parameter provides more flexibility, and the improved error handling for file accessibility is a great addition. The integration of thesanitize_filename
function ensures consistent filename formatting.One minor suggestion:
Consider adding more specific exception handling for the video backend creation. Instead of catching a general
Exception
, you could catch specific exceptions that might occur during backend creation, such asIOError
orValueError
. This would provide more informative error messages and avoid masking unexpected exceptions.- except Exception: + except (IOError, ValueError) as e: + print(f"Error creating video backend: {e}") backend = Nonetests/model/test_video.py (1)
163-163
: Remove or utilize the unused variableimg
The variable
img
is assigned but never used. If the sole purpose is to trigger backend loading by accessing a frame, consider omitting the assignment or assigning to_
to indicate the intentional disregard of the value.Apply one of the following diffs to address the issue:
Option 1:
- img = video[0] + video[0]Option 2:
- img = video[0] + _ = video[0]🧰 Tools
🪛 Ruff
163-163: Local variable
img
is assigned to but never usedRemove assignment to unused variable
img
(F841)
sleap_io/io/nwb.py (2)
32-48
: Improve docstring for clarity and completenessConsider enhancing the docstring by providing more detailed descriptions for the parameters and return value. Specifically:
- In the
Args
section, clarify thatlabels
should be aLabels
object containing predicted instances.- In the
Returns
section, provide more precise information about the structure of the returnedpd.DataFrame
, including details about the hierarchical columns and index.
89-95
: Add comments to explain complex DataFrame transformationsThe chained pandas operations for transforming
labels_df
intolabels_tidy_df
involve several non-trivial steps. Adding inline comments to describe each operation (e.g.,set_index
,unstack
,swaplevel
,sort_index
) will improve code readability and help future maintainers understand the data manipulation process.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (7)
- docs/index.md (1 hunks)
- sleap_io/io/nwb.py (1 hunks)
- sleap_io/io/slp.py (10 hunks)
- sleap_io/io/utils.py (2 hunks)
- sleap_io/model/video.py (6 hunks)
- tests/io/test_slp.py (2 hunks)
- tests/model/test_video.py (3 hunks)
🧰 Additional context used
🪛 Ruff
sleap_io/io/slp.py
184-184: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
sleap_io/model/video.py
212-215: Use
return all(is_file_accessible(f) for f in self.filename)
instead offor
loopReplace with
return all(is_file_accessible(f) for f in self.filename)
(SIM110)
tests/io/test_slp.py
361-361: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
tests/model/test_video.py
96-96: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
154-154: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
163-163: Local variable
img
is assigned to but never usedRemove assignment to unused variable
img
(F841)
165-165: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
🔇 Additional comments (12)
docs/index.md (1)
137-138
: LGTM! This change aligns with the PR objectives.The addition of the
open_videos=False
parameter tosio.load_file()
provides explicit control over whether video files should be opened during label loading, as mentioned in the PR objectives. This enhancement improves flexibility and safety when working with label files, especially in scenarios where video files may not be accessible.sleap_io/model/video.py (3)
12-12
: LGTM: New import enhances file accessibility checks.The addition of
is_file_accessible
fromsleap_io.io.utils
aligns with the PR objectives of improving safety in video loading.
62-62
: LGTM: Updated post-init logic respectsopen_backend
flag.The modification to the
__attrs_post_init__
method ensures that the backend is only opened automatically ifopen_backend
is True and the video exists. This change aligns with the newopen_backend
parameter's functionality.
Line range hint
1-324
: Overall assessment: Excellent improvements to video loading safety and flexibility.The changes in this file significantly enhance the SLEAP library's video loading capabilities:
- The new
open_backend
parameter provides users with explicit control over backend initialization.- Enhanced error handling in the
__getitem__
method improves the user experience.- The
exists
method now usesis_file_accessible
, ensuring safer file checks.- The
open
method's newfilename
parameter increases flexibility in handling video files.These changes align well with the PR objectives of improving safety and flexibility in video loading. The minor suggestions provided in the review comments could further enhance the code's clarity and efficiency.
🧰 Tools
🪛 Ruff
212-215: Use
return all(is_file_accessible(f) for f in self.filename)
instead offor
loopReplace with
return all(is_file_accessible(f) for f in self.filename)
(SIM110)
sleap_io/io/slp.py (5)
43-57
: LGTM: Well-implemented filename sanitization functionThe
sanitize_filename
function is a great addition. It handles both single filenames and lists, ensures POSIX-compatible format, and maintains cross-platform compatibility. The implementation is clean and well-documented.
Line range hint
147-166
: LGTM: Consistent update toread_videos
functionThe changes to the
read_videos
function are well-implemented and consistent with the modifications made tomake_video
. The newopen_backend
parameter is correctly passed tomake_video
, and the default value ofTrue
maintains backward compatibility. The updated docstring accurately reflects the new functionality.
Line range hint
180-221
: LGTM: Consistent filename sanitization invideo_to_dict
The changes to the
video_to_dict
function are well-implemented. The use of thesanitize_filename
function ensures consistent filename formatting across different video types. This improvement enhances the robustness of the function and maintains consistency with the rest of the codebase.🧰 Tools
🪛 Ruff
184-184: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
Line range hint
1039-1080
: LGTM: Consistent update toread_labels
functionThe changes to the
read_labels
function are well-implemented and consistent with the modifications made toread_videos
. The newopen_videos
parameter is correctly passed toread_videos
, and the default value ofTrue
maintains backward compatibility. The updated docstring accurately reflects the new functionality.
Line range hint
1-1139
: Overall: Excellent improvements to SLEAP I/O functionalityThe changes in this file significantly enhance the flexibility and robustness of SLEAP I/O operations. Key improvements include:
- Addition of the
sanitize_filename
function for consistent filename handling.- New
open_backend
andopen_videos
parameters in various functions, allowing more control over video backend operations.- Improved error handling and file accessibility checks.
These changes maintain backward compatibility while adding valuable new features. The code style is consistent, and the documentation has been updated appropriately.
Great job on these improvements!
tests/model/test_video.py (1)
162-164
: Verify the effect of settingvideo.open_backend
post-initializationSetting
video.open_backend = True
after theVideo
object has been instantiated withopen_backend=False
may not reopen the backend automatically. Ensure that theVideo
class supports dynamic changes to theopen_backend
attribute, or explicitly callvideo.open()
to open the backend before accessing frames.🧰 Tools
🪛 Ruff
163-163: Local variable
img
is assigned to but never usedRemove assignment to unused variable
img
(F841)
sleap_io/io/utils.py (1)
6-7
: Imports for type annotations added correctlyThe addition of
Any
,Union
,Optional
, andPath
imports is appropriate for the type annotations used in the code. This enhances code clarity and aids in static type checking.sleap_io/io/nwb.py (1)
75-77
: Good error handling for empty predictionsThe check for an empty
data_list
and raising aValueError
if no predicted instances are found ensures that the function fails gracefully in the absence of data. This is a good practice for robust error handling.
def test_lazy_video_read(slp_real_data): | ||
labels = read_labels(slp_real_data) | ||
assert type(labels.video.backend) == MediaVideo | ||
assert labels.video.exists() | ||
|
||
labels = read_labels(slp_real_data, open_videos=False) | ||
assert labels.video.backend is None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Enhance test coverage and improve type checking
While the test function effectively verifies the behavior of the open_videos
parameter, consider the following improvements:
- Use
isinstance()
for type checking instead of direct type comparison. This adheres to Python best practices and is more flexible. - Add assertions to verify that labels are still correctly loaded when
open_videos=False
. This ensures that the lazy loading doesn't affect the integrity of the label data. - Consider using
assertIsNone()
for clarity in the second assertion.
Here's a suggested improvement:
def test_lazy_video_read(slp_real_data):
# Test default behavior
labels = read_labels(slp_real_data)
assert isinstance(labels.video.backend, MediaVideo)
assert labels.video.exists()
# Test lazy loading behavior
lazy_labels = read_labels(slp_real_data, open_videos=False)
assert lazy_labels.video.backend is None
# Verify that labels are still correctly loaded
assert len(lazy_labels) == len(labels)
assert lazy_labels.skeleton == labels.skeleton
# Add more assertions to verify the integrity of lazy-loaded labels
This refactored version improves type checking, clarifies the assertions, and adds a placeholder for additional verification of lazy-loaded labels.
🧰 Tools
🪛 Ruff
361-361: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
if video.backend is None: | ||
return {"filename": video.filename, "backend": video.backend_metadata} | ||
return {"filename": video_filename, "backend": video.backend_metadata} | ||
|
||
if type(video.backend) == MediaVideo: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Use isinstance()
for type checking
Instead of using type(video.backend) == MediaVideo
, it's more pythonic and flexible to use isinstance()
for type checking. This allows for subclass comparisons and is the recommended approach in Python.
Apply this change:
- if type(video.backend) == MediaVideo:
+ if isinstance(video.backend, MediaVideo):
Make similar changes for other type checks in this function (e.g., for HDF5Video
and ImageVideo
).
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if type(video.backend) == MediaVideo: | |
if isinstance(video.backend, MediaVideo): |
🧰 Tools
🪛 Ruff
184-184: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
video.open(centered_pair_frame_paths) | ||
assert video.shape == (3, 384, 384, 1) | ||
assert type(video.backend) == ImageVideo | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verify the usage of video.open()
with arguments
The video.open()
method is being called with centered_pair_frame_paths
as an argument. Ensure that the open()
method accepts centered_pair_frame_paths
as a parameter. If the intention is to replace the video source with new frame paths, consider using video.replace_filename(centered_pair_frame_paths)
followed by video.open()
.
🧰 Tools
🪛 Ruff
96-96: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
def test_open_backend_preference(centered_pair_low_quality_path): | ||
video = Video(centered_pair_low_quality_path) | ||
assert video.is_open | ||
assert type(video.backend) == MediaVideo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use isinstance()
for type checking
Consistent with best practices, use isinstance()
for type checking rather than direct type comparison with type()
.
Apply this diff to fix the issue:
- assert type(video.backend) == MediaVideo
+ assert isinstance(video.backend, MediaVideo)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
assert type(video.backend) == MediaVideo | |
assert isinstance(video.backend, MediaVideo) |
🧰 Tools
🪛 Ruff
154-154: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
video.open_backend = True | ||
img = video[0] | ||
assert video.is_open | ||
assert type(video.backend) == MediaVideo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use isinstance()
for type checking
As with previous instances, prefer using isinstance()
for type checking to handle inheritance correctly and enhance code clarity.
Apply this diff to fix the issue:
- assert type(video.backend) == MediaVideo
+ assert isinstance(video.backend, MediaVideo)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
assert type(video.backend) == MediaVideo | |
assert isinstance(video.backend, MediaVideo) |
🧰 Tools
🪛 Ruff
165-165: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
@@ -91,6 +91,10 @@ def test_video_open_close(centered_pair_low_quality_path): | |||
video.open(grayscale=True) | |||
assert video.shape == (1100, 384, 384, 1) | |||
|
|||
video.open(centered_pair_frame_paths) | |||
assert video.shape == (3, 384, 384, 1) | |||
assert type(video.backend) == ImageVideo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use isinstance()
for type checking
For type checking, it's recommended to use isinstance()
instead of comparing types directly. This provides better support for inheritance and improves code readability.
Apply this diff to fix the issue:
- assert type(video.backend) == ImageVideo
+ assert isinstance(video.backend, ImageVideo)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
assert type(video.backend) == ImageVideo | |
assert isinstance(video.backend, ImageVideo) |
🧰 Tools
🪛 Ruff
96-96: Use
is
andis not
for type comparisons, orisinstance()
for isinstance checks(E721)
def is_file_accessible(filename: str | Path) -> bool: | ||
"""Check if a file is accessible. | ||
|
||
Args: | ||
labels: A general label object. | ||
filename: Path to a file. | ||
|
||
Returns: | ||
pd.DataFrame: A pandas data frame with the structured data with | ||
hierarchical columns. The column hierarchy is: | ||
"video_path", | ||
"skeleton_name", | ||
"track_name", | ||
"node_name", | ||
And it is indexed by the frames. | ||
|
||
Raises: | ||
ValueError: If no frames in the label objects contain predicted instances. | ||
`True` if the file is accessible, `False` otherwise. | ||
|
||
Notes: | ||
This checks if the file readable by the current user by reading one byte from | ||
the file. | ||
""" | ||
# Form pairs of labeled_frames and predicted instances | ||
labeled_frames = labels.labeled_frames | ||
all_frame_instance_tuples: Generator[ | ||
tuple[LabeledFrame, PredictedInstance], None, None | ||
] = ( | ||
(label_frame, instance) # type: ignore | ||
for label_frame in labeled_frames | ||
for instance in label_frame.predicted_instances | ||
) | ||
|
||
# Extract the data | ||
data_list = list() | ||
for labeled_frame, instance in all_frame_instance_tuples: | ||
# Traverse the nodes of the instances's skeleton | ||
skeleton = instance.skeleton | ||
for node in skeleton.nodes: | ||
row_dict = dict( | ||
frame_idx=labeled_frame.frame_idx, | ||
x=instance.points[node].x, | ||
y=instance.points[node].y, | ||
score=instance.points[node].score, # type: ignore[attr-defined] | ||
node_name=node.name, | ||
skeleton_name=skeleton.name, | ||
track_name=instance.track.name if instance.track else "untracked", | ||
video_path=labeled_frame.video.filename, | ||
) | ||
data_list.append(row_dict) | ||
|
||
if not data_list: | ||
raise ValueError("No predicted instances found in labels object") | ||
|
||
labels_df = pd.DataFrame(data_list) | ||
|
||
# Reformat the data with columns for dict-like hierarchical data access. | ||
index = [ | ||
"skeleton_name", | ||
"track_name", | ||
"node_name", | ||
"video_path", | ||
"frame_idx", | ||
] | ||
|
||
labels_tidy_df = ( | ||
labels_df.set_index(index) | ||
.unstack(level=[0, 1, 2, 3]) | ||
.swaplevel(0, -1, axis=1) # video_path on top while x, y score on bottom | ||
.sort_index(axis=1) # Better format for columns | ||
.sort_index(axis=0) # Sorts by frames | ||
) | ||
|
||
return labels_tidy_df | ||
filename = Path(filename) | ||
try: | ||
with open(filename, "rb") as f: | ||
f.read(1) | ||
return True | ||
except (FileNotFoundError, PermissionError, OSError, ValueError): | ||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Simplify exception handling by catching OSError
Since FileNotFoundError
, PermissionError
, and ValueError
are subclasses or related to OSError
, you can simplify the exception handling by catching OSError
alone.
Apply this diff to simplify the exception handling:
def is_file_accessible(filename: str | Path) -> bool:
...
try:
with open(filename, "rb") as f:
f.read(1)
return True
- except (FileNotFoundError, PermissionError, OSError, ValueError):
+ except OSError:
return False
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def is_file_accessible(filename: str | Path) -> bool: | |
"""Check if a file is accessible. | |
Args: | |
labels: A general label object. | |
filename: Path to a file. | |
Returns: | |
pd.DataFrame: A pandas data frame with the structured data with | |
hierarchical columns. The column hierarchy is: | |
"video_path", | |
"skeleton_name", | |
"track_name", | |
"node_name", | |
And it is indexed by the frames. | |
Raises: | |
ValueError: If no frames in the label objects contain predicted instances. | |
`True` if the file is accessible, `False` otherwise. | |
Notes: | |
This checks if the file readable by the current user by reading one byte from | |
the file. | |
""" | |
# Form pairs of labeled_frames and predicted instances | |
labeled_frames = labels.labeled_frames | |
all_frame_instance_tuples: Generator[ | |
tuple[LabeledFrame, PredictedInstance], None, None | |
] = ( | |
(label_frame, instance) # type: ignore | |
for label_frame in labeled_frames | |
for instance in label_frame.predicted_instances | |
) | |
# Extract the data | |
data_list = list() | |
for labeled_frame, instance in all_frame_instance_tuples: | |
# Traverse the nodes of the instances's skeleton | |
skeleton = instance.skeleton | |
for node in skeleton.nodes: | |
row_dict = dict( | |
frame_idx=labeled_frame.frame_idx, | |
x=instance.points[node].x, | |
y=instance.points[node].y, | |
score=instance.points[node].score, # type: ignore[attr-defined] | |
node_name=node.name, | |
skeleton_name=skeleton.name, | |
track_name=instance.track.name if instance.track else "untracked", | |
video_path=labeled_frame.video.filename, | |
) | |
data_list.append(row_dict) | |
if not data_list: | |
raise ValueError("No predicted instances found in labels object") | |
labels_df = pd.DataFrame(data_list) | |
# Reformat the data with columns for dict-like hierarchical data access. | |
index = [ | |
"skeleton_name", | |
"track_name", | |
"node_name", | |
"video_path", | |
"frame_idx", | |
] | |
labels_tidy_df = ( | |
labels_df.set_index(index) | |
.unstack(level=[0, 1, 2, 3]) | |
.swaplevel(0, -1, axis=1) # video_path on top while x, y score on bottom | |
.sort_index(axis=1) # Better format for columns | |
.sort_index(axis=0) # Sorts by frames | |
) | |
return labels_tidy_df | |
filename = Path(filename) | |
try: | |
with open(filename, "rb") as f: | |
f.read(1) | |
return True | |
except (FileNotFoundError, PermissionError, OSError, ValueError): | |
return False | |
def is_file_accessible(filename: str | Path) -> bool: | |
"""Check if a file is accessible. | |
Args: | |
filename: Path to a file. | |
Returns: | |
`True` if the file is accessible, `False` otherwise. | |
Notes: | |
This checks if the file readable by the current user by reading one byte from | |
the file. | |
""" | |
filename = Path(filename) | |
try: | |
with open(filename, "rb") as f: | |
f.read(1) | |
return True | |
except OSError: | |
return False |
🛠️ Refactor suggestion
Suggestion: Use os.access
to check file readability
Consider using os.access(filename, os.R_OK)
to check if the file is readable by the current user without opening it. This method is more efficient and avoids potential side effects from opening files.
Apply this diff to refactor the function:
+import os
def is_file_accessible(filename: str | Path) -> bool:
"""Check if a file is accessible.
Args:
filename: Path to a file.
Returns:
`True` if the file is accessible, `False` otherwise.
Notes:
- This checks if the file readable by the current user by reading one byte from
+ This checks if the file is readable by the current user using os.access.
the file.
"""
filename = Path(filename)
- try:
- with open(filename, "rb") as f:
- f.read(1)
- return True
- except (FileNotFoundError, PermissionError, OSError, ValueError):
- return False
+ return os.access(filename, os.R_OK)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def is_file_accessible(filename: str | Path) -> bool: | |
"""Check if a file is accessible. | |
Args: | |
labels: A general label object. | |
filename: Path to a file. | |
Returns: | |
pd.DataFrame: A pandas data frame with the structured data with | |
hierarchical columns. The column hierarchy is: | |
"video_path", | |
"skeleton_name", | |
"track_name", | |
"node_name", | |
And it is indexed by the frames. | |
Raises: | |
ValueError: If no frames in the label objects contain predicted instances. | |
`True` if the file is accessible, `False` otherwise. | |
Notes: | |
This checks if the file readable by the current user by reading one byte from | |
the file. | |
""" | |
# Form pairs of labeled_frames and predicted instances | |
labeled_frames = labels.labeled_frames | |
all_frame_instance_tuples: Generator[ | |
tuple[LabeledFrame, PredictedInstance], None, None | |
] = ( | |
(label_frame, instance) # type: ignore | |
for label_frame in labeled_frames | |
for instance in label_frame.predicted_instances | |
) | |
# Extract the data | |
data_list = list() | |
for labeled_frame, instance in all_frame_instance_tuples: | |
# Traverse the nodes of the instances's skeleton | |
skeleton = instance.skeleton | |
for node in skeleton.nodes: | |
row_dict = dict( | |
frame_idx=labeled_frame.frame_idx, | |
x=instance.points[node].x, | |
y=instance.points[node].y, | |
score=instance.points[node].score, # type: ignore[attr-defined] | |
node_name=node.name, | |
skeleton_name=skeleton.name, | |
track_name=instance.track.name if instance.track else "untracked", | |
video_path=labeled_frame.video.filename, | |
) | |
data_list.append(row_dict) | |
if not data_list: | |
raise ValueError("No predicted instances found in labels object") | |
labels_df = pd.DataFrame(data_list) | |
# Reformat the data with columns for dict-like hierarchical data access. | |
index = [ | |
"skeleton_name", | |
"track_name", | |
"node_name", | |
"video_path", | |
"frame_idx", | |
] | |
labels_tidy_df = ( | |
labels_df.set_index(index) | |
.unstack(level=[0, 1, 2, 3]) | |
.swaplevel(0, -1, axis=1) # video_path on top while x, y score on bottom | |
.sort_index(axis=1) # Better format for columns | |
.sort_index(axis=0) # Sorts by frames | |
) | |
return labels_tidy_df | |
filename = Path(filename) | |
try: | |
with open(filename, "rb") as f: | |
f.read(1) | |
return True | |
except (FileNotFoundError, PermissionError, OSError, ValueError): | |
return False | |
import os | |
def is_file_accessible(filename: str | Path) -> bool: | |
"""Check if a file is accessible. | |
Args: | |
filename: Path to a file. | |
Returns: | |
`True` if the file is accessible, `False` otherwise. | |
Notes: | |
This checks if the file is readable by the current user using os.access. | |
""" | |
filename = Path(filename) | |
return os.access(filename, os.R_OK) |
(label_frame, instance) # type: ignore | ||
for label_frame in labeled_frames | ||
for instance in label_frame.predicted_instances | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Resolve type checking issues instead of using # type: ignore
The use of # type: ignore
suggests that the type checker is encountering issues with the generator expression. To improve code quality and maintainability, consider updating the type annotations or refactoring the code to address the underlying type errors instead of suppressing them.
frame_idx=labeled_frame.frame_idx, | ||
x=instance.points[node].x, | ||
y=instance.points[node].y, | ||
score=instance.points[node].score, # type: ignore[attr-defined] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid suppressing type checker warnings with # type: ignore[attr-defined]
Using # type: ignore[attr-defined]
indicates that the type checker does not recognize the score
attribute on instance.points[node]
. To enhance type safety and code clarity, consider updating the type annotations for instance.points[node]
to include the score
attribute, or ensure that the Point
class has the score
attribute properly defined in its type hints.
def convert_predictions_to_dataframe(labels: Labels) -> pd.DataFrame: | ||
"""Convert predictions data to a Pandas dataframe. | ||
|
||
Args: | ||
labels: A general label object. | ||
|
||
Returns: | ||
pd.DataFrame: A pandas data frame with the structured data with | ||
hierarchical columns. The column hierarchy is: | ||
"video_path", | ||
"skeleton_name", | ||
"track_name", | ||
"node_name", | ||
And it is indexed by the frames. | ||
|
||
Raises: | ||
ValueError: If no frames in the label objects contain predicted instances. | ||
""" | ||
# Form pairs of labeled_frames and predicted instances | ||
labeled_frames = labels.labeled_frames | ||
all_frame_instance_tuples = ( | ||
(label_frame, instance) # type: ignore | ||
for label_frame in labeled_frames | ||
for instance in label_frame.predicted_instances | ||
) | ||
|
||
# Extract the data | ||
data_list = list() | ||
for labeled_frame, instance in all_frame_instance_tuples: | ||
# Traverse the nodes of the instances's skeleton | ||
skeleton = instance.skeleton | ||
for node in skeleton.nodes: | ||
row_dict = dict( | ||
frame_idx=labeled_frame.frame_idx, | ||
x=instance.points[node].x, | ||
y=instance.points[node].y, | ||
score=instance.points[node].score, # type: ignore[attr-defined] | ||
node_name=node.name, | ||
skeleton_name=skeleton.name, | ||
track_name=instance.track.name if instance.track else "untracked", | ||
video_path=labeled_frame.video.filename, | ||
) | ||
data_list.append(row_dict) | ||
|
||
if not data_list: | ||
raise ValueError("No predicted instances found in labels object") | ||
|
||
labels_df = pd.DataFrame(data_list) | ||
|
||
# Reformat the data with columns for dict-like hierarchical data access. | ||
index = [ | ||
"skeleton_name", | ||
"track_name", | ||
"node_name", | ||
"video_path", | ||
"frame_idx", | ||
] | ||
|
||
labels_tidy_df = ( | ||
labels_df.set_index(index) | ||
.unstack(level=[0, 1, 2, 3]) | ||
.swaplevel(0, -1, axis=1) # video_path on top while x, y score on bottom | ||
.sort_index(axis=1) # Better format for columns | ||
.sort_index(axis=0) # Sorts by frames | ||
) | ||
|
||
return labels_tidy_df |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Include unit tests for convert_predictions_to_dataframe
To ensure that the new convert_predictions_to_dataframe
function behaves as expected, consider adding unit tests that cover various scenarios, including cases with:
- Multiple videos, skeletons, tracks, and nodes.
- Missing or empty
predicted_instances
. - Instances without associated tracks (i.e.,
instance.track is None
).
This will help detect any potential issues early and ensure robustness.
This PR adds more flexibility and a safer path for loading SLP files in cases where the video files might not be accessible.
PermissionError
when loading a .slp file pointing to a video in /root #116).sio.load_slp(..., open_videos=False)
Video
objects withVideo(..., open_backend=False)
.sio.io.utils.is_file_accessible
to check for readability by actually reading a byte. This catches permission and other esoteric filesystem errors.