Add DAB-DETR Object detection/segmentation model #30803
Conversation
Hi @conditionedstimulus, thanks for opening a PR! Just skimming over the modeling files, it looks like all of the modules are copied from, or can be copied from, Conditional DETR. Are there any architectural changes this model brings? If not, then all we need to do is convert the checkpoints and upload those to the hub such that they can be loaded in ConditionalDETR directly.
Hi Amy, I attached a photo comparing the cross-attention of the decoder in DETR, Conditional DETR, and DAB-DETR, as this is the main architectural difference. I copied the code from Conditional DETR because this model is an extension/evolved version of Conditional DETR. I believe it would be cool and useful to include this model in the HF object detection collection.
@conditionedstimulus Thanks for sharing! OK, seems useful to have this available as an option as part of the DETR family in the library. Feel free to ping me when the PR is ready for review. cc @qubvel for reference
Thanks for fixing the tests! While I ask the team to move the checkpoints to the org, can you please update one last thing (I hope 😄):
self.assertEqual(len(results["scores"]), 5)
self.assertTrue(torch.allclose(results["scores"], expected_scores, atol=1e-4))
self.assertSequenceEqual(results["labels"].tolist(), expected_labels)
self.assertTrue(torch.allclose(results["boxes"][0, :], expected_boxes, atol=1e-4))
I hope this is the last thing! Can you please update to use `torch.testing.assert_close` instead of `self.assertTrue(torch.allclose(...))`?
Here and in other places in tests, for example:
https://github.com/huggingface/transformers/pull/35903/files
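The migration the review asks for can be sketched as follows; the tensor values here are illustrative placeholders, not the actual DAB-DETR expected outputs:

```python
import torch

# Illustrative stand-ins for the post-processed detection outputs;
# these are NOT the real DAB-DETR expected values.
scores = torch.tensor([0.8732, 0.8530, 0.8426])
expected_scores = torch.tensor([0.8732, 0.8530, 0.8426])

# Old style: a bare boolean check, which fails with an uninformative message.
assert torch.allclose(scores, expected_scores, atol=1e-4)

# Preferred style: raises an AssertionError with a detailed per-element diff
# on mismatch. Note that assert_close requires rtol and atol to be set together.
torch.testing.assert_close(scores, expected_scores, rtol=0, atol=1e-4)
```

The benefit is purely diagnostic: on failure, `assert_close` reports the greatest absolute and relative differences and the offending indices, instead of a bare `False is not true`.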
Hi, no problem, I made the change and ran the tests against my model source. Is it enough to change only that part of the tests, or should it be done across the whole file? Also, approximately how long will it take to move the model cards? :)
thanks
The other tests in the tests/models/dab_detr folder as well, thanks!
The transfer should not take more than a few hours; we just need a review from Arthur once again to get his approval.
Noticed we don't have approval from @ArthurZucker, waiting for his review
A few super small comments! Thanks for your patience! 🤗
h = [hidden_dim] * (num_layers - 1)
self.layers = nn.ModuleList(nn.Linear(n, k) for n, k in zip([input_dim] + h, h + [output_dim]))
No, what I mean is we should only create the `(n, k) for n, k in zip([input_dim] + h, h + [output_dim])` pairs in the config. Then you know exactly which in and out dimensions should be used for the linear layers.
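For reference, the zip trick above pairs each layer's input width with its output width; a minimal sketch with illustrative dimensions (not the actual DAB-DETR config values):

```python
# Illustrative dimensions, not the actual DAB-DETR config values.
input_dim, hidden_dim, output_dim, num_layers = 256, 256, 4, 3

# One hidden width per intermediate layer.
h = [hidden_dim] * (num_layers - 1)  # [256, 256]

# Pair each layer's input width with its output width:
# [256, 256, 256] zipped against [256, 256, 4].
pairs = list(zip([input_dim] + h, h + [output_dim]))
print(pairs)  # [(256, 256), (256, 256), (256, 4)]
```

Precomputing these `(in, out)` pairs in the config, as suggested, makes the linear-layer shapes explicit instead of being implied by the zip expression at module-construction time.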
Hi @ArthurZucker and @qubvel, I’ve made most of the required modifications. Where I didn’t, I left comments on your feedback. Thanks! |
@conditionedstimulus Thanks for the updates! Please update converted weights for other checkpoints on the Hub as well and I will ask for transfer |
Let's go! 🚀
hidden_states = self.layernorm(hidden_states)
intermediate.pop()
intermediate.append(hidden_states)
```
intermediate_state = self.layernorm(hidden_states)
intermediate.append(intermediate_state)
```
vs
```
intermediate.append(self.layernorm(hidden_states))
```
will avoid this ugly pop/append
I removed the list manipulation entirely. I didn't revisit the original code, but as I recall this was part of a conditional section. Since we removed many configurations, the list manipulation effectively popped the last element and appended the same value back, so I kept only the layer normalization of the hidden states.
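The before/after can be sketched as follows; `layernorm` and the tensor shapes are illustrative stand-ins, not the actual DAB-DETR decoder code:

```python
import torch
from torch import nn

# Illustrative stand-ins for the decoder's layer norm and hidden states.
layernorm = nn.LayerNorm(8)
hidden_states = torch.randn(2, 8)

# Before: append the raw states, then immediately pop them and
# re-append the normalized copy.
intermediate_old = [hidden_states]
intermediate_old.pop()
intermediate_old.append(layernorm(hidden_states))

# After: skip the pop/append dance and store the normalized states directly.
intermediate_new = [layernorm(hidden_states)]

# Both paths leave the same normalized tensor at the end of the list.
assert torch.equal(intermediate_old[-1], intermediate_new[-1])
```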
Hi @ArthurZucker and @qubvel, I've finalized the last modification. If I understand correctly, this should be the final version, and we'll roll it out soon.
Thanks for your review, guidance, and support! :) Looking forward to the merge! 🤗
run-slow: dab_detr
This comment contains run-slow, running the specified jobs: ['models/dab_detr']
@conditionedstimulus Congratulations on merging the model! 🎉 It was a long journey, and we really appreciate that you were able to finish it 💪. Thank you for your contribution, and sorry for the delays on our side. Great job! 🚀 And feel free to share your achievement on social networks, we'd be happy to amplify it!
Thank you guys!
What does this PR do?
Add DAB-DETR Object detection model. Paper: https://arxiv.org/abs/2201.12329
Original code repo: https://github.com/IDEA-Research/DAB-DETR
[WIP] This model is part of the evolution of the DETR family, alongside DN-DETR (not part of this PR), paving the way for newer and better object detection models such as DINO and Stable-DINO.
Who can review?
@amyeroberts