Missing part and confusing place #47
After working on this project since June 2024, I feel like the InterCLIP checkpoints have been manipulated: I cannot reproduce the evaluation results reported in the paper. There is also something you should check in your dataset. It contains many duplicates, about which you claim: "Although the textual annotations are similar (since the semantic category of these motions is 'dance'), each motion captured is unique. This does not limit but rather enhances diversity. For example, the diffusion model inherently has the capability to model such diversity effectively. Hence, similar annotations in this context are not a problem but an opportunity to refine the model's ability to generate nuanced variations of similar actions." For example:

Dancing: 50 sequences
taichi: 17 sequences
sparring: 28 sequences
rock-paper-scissors: 4 sequences

I trained for the same number of epochs you mention, but the results are completely different. Dear authors, can you clarify these duplications again? How can you have 10 actions under one nearly identical sentence? If that is really the case, it means the model should be able to generate all 10 dance actions from that prompt, right? Then how could we verify accuracy the way you did?
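To quantify the duplication issue raised above, one can count repeated captions directly. Here is a minimal sketch; the annotation list and caption strings are hypothetical stand-ins, so adapt the loading step to the dataset's actual file layout:

```python
# Sketch: count duplicate textual annotations in a motion dataset.
# The example annotations below are invented for illustration only.
from collections import Counter

def count_duplicate_annotations(annotations):
    """Return {text: count} for annotation strings appearing more than once."""
    counts = Counter(text.strip().lower() for text in annotations)
    return {text: n for text, n in counts.items() if n > 1}

# Toy example with repeated "dance" captions:
annots = [
    "two people are dancing together",
    "two people are dancing together",
    "two people practice taichi",
    "two people are dancing together",
]
print(count_duplicate_annotations(annots))
# {'two people are dancing together': 3}
```

Running this over the real annotation files would show exactly how many sequences share each caption, which is the number at issue in this thread.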
The rendered video on my side is completely white. Do you know how to fix this?
Hi, thanks for your interest in our work!
Hi, in addition, the rendered videos are all white; do you have any idea about this?
Hi, thanks a lot, it works now. One more question: could you share the dataset visualization code? Or could you share a link to another project that can directly visualize your dataset? The dataset is really impressive and useful.
you can use this visualization code :) |
Hi,
In the training script you build the model with InterGen (train.py, def build_models: ...),
but in the evaluation script you build it with InterCLIP (evaluator.py, def build_models: ...), which contains a "motion_encoder". This causes mismatched parameters. Can you help explain this?
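The parameter mismatch can be diagnosed before loading. A PyTorch state_dict is an ordered mapping from parameter names to tensors, so comparing the key sets needs only the stdlib. The key names below are hypothetical placeholders for whatever `torch.load(...)` actually returns from the two checkpoints:

```python
# Sketch: diagnose mismatched parameters between two checkpoints.
# The dicts stand in for torch.load(...) results; key names are invented.

def diff_state_dict_keys(train_sd, eval_sd):
    """Return (missing_in_train, unexpected_in_train) as key sets."""
    train_keys, eval_keys = set(train_sd), set(eval_sd)
    return eval_keys - train_keys, train_keys - eval_keys

# Hypothetical keys illustrating the InterGen vs. InterCLIP gap:
intergen_sd = {"decoder.layer0.weight": None}
interclip_sd = {"decoder.layer0.weight": None,
                "motion_encoder.layer0.weight": None}

missing, unexpected = diff_state_dict_keys(intergen_sd, interclip_sd)
print(missing)  # keys the eval model expects but the training checkpoint lacks
```

If the extra keys really are confined to `motion_encoder`, loading with `model.load_state_dict(state_dict, strict=False)` will at least surface them explicitly in the returned `missing_keys`/`unexpected_keys` lists instead of failing silently.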