[CSE-10][external] Fixes for LocalDataset object instantiation #704

Closed
JBWilkie wants to merge 8 commits from the cse-10 branch

JBWilkie
Collaborator

Problem

  • 1: When dataset items in folders are pulled in a flat structure, LocalDataset instantiation and the get_annotations() function don't work, because image paths are constructed and checked on the assumption that the items were pulled with folders
  • 2: Changes introduced in IO-1445 broke Darwin JSON 1.0 compatibility with LocalDataset objects & get_annotations() in some scenarios

Much more problem detail is available in a comment in CSE-10

Solution

  • 1: When instantiating a LocalDataset object or using get_annotations(), check if the release was pulled with folders and pass this down to the image path construction
  • 2: When instantiating a LocalDataset object or using get_annotations(), check if the release is Darwin JSON 1.0 or 2.0, then pass this down to the relevant function(s)

Much more solution detail is available in a comment in CSE-10
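
As a rough illustration of solution 1 (a sketch, not the code in this PR), the folder check and the path construction could look like the following. The helper names `release_pulled_with_folders` and `build_image_path` are hypothetical, and the "any subdirectory under the images directory" heuristic is an assumption rather than the check darwin-py actually performs.

```python
from pathlib import Path


def release_pulled_with_folders(images_dir: Path) -> bool:
    """Heuristic (an assumption): a flat pull leaves no subdirectories in images_dir."""
    return any(child.is_dir() for child in images_dir.iterdir())


def build_image_path(
    images_dir: Path, remote_path: str, filename: str, with_folders: bool
) -> Path:
    """Build the expected local image path for one dataset item."""
    if with_folders:
        # remote_path looks like "/folder/subfolder"; strip the leading slash
        # so that joining does not discard images_dir.
        return images_dir / remote_path.lstrip("/") / filename
    # Flat pull: every image sits directly under images_dir.
    return images_dir / filename
```

The flag is computed once per release and passed down, so the same item resolves to `images/folder/b.png` for a foldered pull and to `images/b.png` for a flat one.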

Changelog

  • Fixed false assumption that dataset releases with folders are always pulled with folders. This assumption led to issues creating LocalDataset objects and with the use of the get_annotations() function
  • Fixed issue where Darwin JSON 1.0 was not compatible with LocalDataset objects or the get_annotations() function

@linear

linear bot commented Oct 29, 2023

CSE-10 Multiple stems error when using get_annotations()

Client has a workaround so this is low priority, but needs investigation and PR if necessary

Originally reported in this Intercom conversation

@JBWilkie
Collaborator Author

Note: This has failing tests at the moment. I need to:

  • 1: Resolve the failing tests. One test appears to load a V2 dataset of 20 items, one of which is actually a V1 item
  • 2: Resolve an issue with json_stream for video files with large frame_url fields. The stream appears to break once frame_url exceeds a certain size

I will raise this with someone soon to unblock the work
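
For tracking down the mixed-version item in the first point, a small script along these lines could group annotation files by Darwin JSON version. This is a sketch under one stated assumption: that 2.0 files carry a top-level "version" field and 1.0 files do not. The function names `darwin_json_version` and `group_by_version` are hypothetical and not part of darwin-py.

```python
import json
from pathlib import Path
from typing import Dict, List


def darwin_json_version(data: dict) -> str:
    """Assumption: 2.0 files have a top-level "version" key; files without it count as 1.0."""
    return str(data.get("version", "1.0"))


def group_by_version(annotations_dir: Path) -> Dict[str, List[Path]]:
    """Map each detected version string to the annotation files that use it."""
    groups: Dict[str, List[Path]] = {}
    for path in sorted(annotations_dir.rglob("*.json")):
        with path.open() as f:
            data = json.load(f)
        groups.setdefault(darwin_json_version(data), []).append(path)
    return groups
```

Running `group_by_version` on the test fixture's annotations directory should make any stray V1 file stand out as the only entry under "1.0".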

darwin/client.py Outdated
@@ -908,7 +908,7 @@ def move_to_stage(
         dataset_slug: str,
         team_slug: str,
         filters: Dict[str, UnknownType],
-        stage_id: int,
+        stage_id: str,
Collaborator Author

Unrelated to the original ticket, but stage_id is a str, not an int

@JBWilkie JBWilkie closed this Dec 12, 2023
@JBWilkie JBWilkie deleted the cse-10 branch December 12, 2023 13:06