Thanks for the impressive work.

I have one question about the pretraining process of DETR (which you've mentioned here: https://github.com/amazon-science/tubelet-transformer#training).

From here (#4 (comment)), I gathered that you took the DETR weights trained on the COCO dataset and re-trained them on AVA to detect human instances.

Could you describe this process in more detail? (e.g., how did you modify the DETR structure to detect only humans, what exactly was the input, the position embedding, etc.)

Was the intention of this pretraining to let the queries focus more on classification, once the DETR part of the TubeR architecture has learned to localize actors well enough?

Have you tried training the whole architecture without the pretrained DETR weights? I've tried several times but could not find a configuration that makes the actual learning happen.

Thanks in advance.
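For context, my current understanding of the head-swap part is sketched below. The name `class_embed` follows the public DETR repo, and the 2-way output (person vs. no-object) is my assumption about how human-only detection might be set up, not the authors' confirmed procedure; the decoder output here is a random stand-in tensor.

```python
# Hypothetical sketch: adapting a COCO-pretrained DETR to person-only detection
# by re-initializing its classification head. This is a guess at the procedure,
# not code from the TubeR repo.
import torch
import torch.nn as nn

hidden_dim = 256    # DETR's transformer width
num_queries = 100   # DETR's default number of object queries

# Stand-in for the decoder output: (batch, num_queries, hidden_dim).
decoder_out = torch.randn(2, num_queries, hidden_dim)

# In DETR, `model.class_embed` maps decoder hidden states to class logits
# (COCO classes + "no object"). For human-only detection one could replace
# it with a freshly initialized 2-way head: person vs. no-object.
class_embed = nn.Linear(hidden_dim, 2)

logits = class_embed(decoder_out)
print(logits.shape)  # torch.Size([2, 100, 2])
```

The box head and the rest of the pretrained weights would presumably stay as-is and be fine-tuned on AVA boxes. Is this roughly what you did?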