
Assertion Error #26

Open
Masrur02 opened this issue Aug 20, 2024 · 6 comments

Comments

@Masrur02

I have modified the Python file grounded_sam2_local_demo.py to predict from a video file. I found that grounding_dino/grounddino/utils/inference.py has this function:

def load_image(image_path: str) -> Tuple[np.array, torch.Tensor]:
    transform = T.Compose(
        [
            T.RandomResize([800], max_size=1333),
            T.ToTensor(),
            T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]
    )
    image_source = Image.open(image_path).convert("RGB")
    image = np.asarray(image_source)
    image_transformed, _ = transform(image_source, None)
    return image, image_transformed
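For context, a minimal sketch of the shortest-side resize rule that `T.RandomResize([800], max_size=1333)` follows. This mirrors the common DETR/torchvision-style behavior (scale the shorter side to `size`, then shrink the scale if the longer side would exceed `max_size`); `get_resized_hw` is a hypothetical helper, not a function from the repo, and the exact rounding in the real transform may differ slightly:

```python
# Sketch, assuming DETR-style resize semantics: the shorter side is scaled
# to `size` unless that would push the longer side past `max_size`, in
# which case the longer side is capped at `max_size` instead.
def get_resized_hw(h: int, w: int, size: int = 800, max_size: int = 1333) -> tuple:
    short, long = min(h, w), max(h, w)
    scale = size / short
    if long * scale > max_size:
        scale = max_size / long  # cap the longer side instead
    return round(h * scale), round(w * scale)

# A 1080x1920 frame: scaling the short side to 800 would make the long
# side ~1422 > 1333, so the long side is capped at 1333 instead.
print(get_resized_hw(1080, 1920))  # -> (750, 1333)
```

This is why the transform "maintains the aspect ratio": both dimensions are multiplied by the same scale factor, whichever of the two constraints ends up binding.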

Here, the shorter side of the image is resized to 800 and the longer side is capped at 1333. However, when I change the 800 to 400 and max_size to 600 (which still maintains the aspect ratio), I get an error like this:

[screenshot of the AssertionError traceback]
I reduced the image size to get a higher FPS. How can I solve this issue? Also, is there any other way to increase the FPS?

TIA

@rentainhe
Collaborator

rentainhe commented Aug 21, 2024

Dear @Masrur02

I think this issue is very similar to #10.

Could you check the grounding results to see whether any grounding output is actually produced?

@ZhangT-tech

It works when you add a '.' after your object, so instead of TEXT_PROMPT="bird", it should be "bird."

@SJP2022

SJP2022 commented Sep 25, 2024

> It works when you add a '.' after your object, so instead of TEXT_PROMPT="bird", it should be "bird."

@ZhangT-tech @rentainhe
The comment # VERY important: text queries need to be lowercased + end with a dot is in the code.
The dot successfully solves this problem, but I wonder why; do you have any insights? Thanks a lot!

@ZhangT-tech

ZhangT-tech commented Sep 26, 2024

The actual reason for the assertion error is that the model didn't detect anything in the video/image. The original repo code doesn't differentiate this case, so you can manually add an if statement to skip such a video:

if input_boxes.size == 0:
    print("No objects detected, skipping this video.")
    print(f"The video is from {SOURCE_VIDEO_FRAME_DIR}")
    return True
print(input_boxes)

As for the '.', it is just a formatting convention implicit in their code, like an end-of-sequence token, used to separate the objects you want to segment. For example, in their grounded_sam2_gd1.5_demo.py, they use '.' to separate two objects.
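The separator behavior described above can be sketched as follows; `normalize_prompt` is a hypothetical helper (not part of the repo) that applies the two rules from the code comment, lowercasing each query and terminating it with a dot:

```python
# Sketch, assuming GroundingDINO-style prompts: '.' separates phrases,
# and each phrase must be lowercased and end with a dot.
def normalize_prompt(*objects: str) -> str:
    # lowercase each phrase, strip any existing trailing dot, re-append one
    return " ".join(obj.strip().lower().rstrip(".") + "." for obj in objects)

print(normalize_prompt("Bird", "Car"))  # -> "bird. car."
```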

@SJP2022

SJP2022 commented Sep 26, 2024

I get it, thank you for your explanation!

@garychan22

I have encountered this issue when directly running the sample code grounded_sam2_local_demo.py. How can I resolve it?
