Training data for reference_video #45
Hello, your work is excellent! I want to train the 'reference_video' model, but you only provided the method for placing the data. May I ask what dataset you used? And where can I download it?

Comments
Sorry for the late response. I had prepared one episode from each of 18 animation titles (about 540 minutes in total, 30 min × 18) from official channels on YouTube.
Thank you for your response. Do you mean that you downloaded 18 animation episodes, each 30 minutes long, from YouTube, and then converted them into frames to extract sketches? May I ask which official channels these were from? Additionally, you mentioned storing the data as distance field images; could you explain how these are obtained?
Yes. They were originally from those official channels, but the channels seem to have stopped providing them. I think any 18 animation episodes are fine, as long as they come from different animations, to ensure the robustness of the model.
Please refer to #14.
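For reference (the details are in #14), here is a minimal sketch of how a distance-field image could be computed from a binary line-art frame. It assumes dark lines on a light background; the function name, file paths, and the clipping distance are illustrative assumptions, not code from this repository.

```python
import cv2
import numpy as np
from scipy.ndimage import distance_transform_edt

def sketch_to_distance_field(sketch_path, out_path, max_dist=32.0):
    # Read the sketch as grayscale and binarize it (dark lines on a light background).
    gray = cv2.imread(sketch_path, cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    line_mask = binary < 128                      # True where the sketch lines are

    # For every background pixel, the Euclidean distance to the nearest line pixel;
    # line pixels themselves get a distance of zero.
    dist = distance_transform_edt(~line_mask)

    # Clip and normalize so the field can be stored as an ordinary 8-bit image.
    dist = np.clip(dist, 0.0, max_dist) / max_dist
    cv2.imwrite(out_path, (dist * 255).astype(np.uint8))

sketch_to_distance_field("frame_000001_sketch.png", "frame_000001_dist.png")
```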
I would like to ask whether you used 18 videos, each 30 minutes long. Also, what fps did you set when converting them into frames? If it's 30 frames per second, then a 30-minute video would turn into 54,000 frames. Do you then put all these 54,000 frames into anime_dir? Wouldn't that be too many? It seems from the paper that they only use very short videos.
Yes to all of those questions. I used 30 fps.
Good question. I wanted to use as much data as possible to improve generalizability when I conducted the experiments, and I did not train the model with varying amounts of data, so I do not have a solid answer to that question. But as you said, that many frames may well be more than is needed to train the model.
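For anyone reproducing this, a minimal sketch of the extraction step described above: decoding each episode at 30 fps and writing the frames into anime_dir. The directory layout and naming pattern are assumptions, not the repository's actual layout.

```python
import os
import subprocess

def extract_frames(video_path, out_dir, fps=30):
    """Decode one episode into sequentially numbered PNG frames."""
    os.makedirs(out_dir, exist_ok=True)
    subprocess.run(
        [
            "ffmpeg", "-i", video_path,
            "-vf", f"fps={fps}",              # resample to the target frame rate
            os.path.join(out_dir, "%06d.png"),
        ],
        check=True,
    )

# Hypothetical layout: one subdirectory of anime_dir per episode.
for i, episode in enumerate(sorted(os.listdir("episodes"))):
    extract_frames(os.path.join("episodes", episode), f"anime_dir/ep{i:02d}")
```

At 30 fps, a 30-minute episode yields 30 × 60 × 30 = 54,000 frames, which matches the count discussed above.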
OK! Thank you for your response, I will give it a try.
Hello, I have a few questions regarding training:
I am really sorry for the late response. First, I need to mention that I am not following the original paper rigorously. I just borrowed the ideas from the Method (Section 3) and the shot selection (the first part of Section 4.1). So, I have not been careful about the dataset selection.
Thank you very much for your response. I have trained the model myself, and the CTN part seems to be fine (picture 1). However, the visualized images produced during the training of TCN are showing up in gray. I have written some testing code and it produces a similar result (picture 2). Do you have any idea what might be the problem?
As I mentioned before, it seems that there is an issue with the code for TCN. The visualized images saved during training, when processed through TCN, come out entirely gray, whereas the CTN outputs look normal. I would like to ask whether I have made a mistake somewhere or whether this behavior is expected.
I am really sorry for the late response. I do not have a solid answer to your question. I confirmed that training the TCN was unstable, and changing the hyperparameters (batch size and learning rate) led to stable behavior. Could you try increasing the batch size or decreasing the learning rate? If that does not work, I do not have any other ideas.
Increasing my batch size causes a CUDA out-of-memory error :( I will try reducing the learning rate to see if it helps. Thank you for your suggestion. However, I want to confirm: when you say 'unstable', are you referring to the situation with the gray images?
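This is not something suggested in the thread, but one standard workaround when a larger batch does not fit in GPU memory is gradient accumulation, which gives a larger effective batch size without the extra memory. All names below (`train_with_accumulation`, `compute_loss`, etc.) are placeholders, not code from this repository.

```python
def train_with_accumulation(model, loader, optimizer, compute_loss, accum_steps=4):
    """Simulate a batch size of (per-step batch * accum_steps) in the same memory."""
    model.train()
    optimizer.zero_grad()
    for step, batch in enumerate(loader):
        loss = compute_loss(model, batch)     # placeholder for the actual TCN loss
        (loss / accum_steps).backward()       # scale so accumulated gradients average
        if (step + 1) % accum_steps == 0:     # update once per accumulated group
            optimizer.step()
            optimizer.zero_grad()
```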
Yes. The situation with the grey images is likely a result of mode collapse: the trained generator has found an easy, degenerate solution. You might need to add a regularization loss term to avoid the mode collapse. If decreasing the learning rate does not work, adding such a term might help.
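The thread does not say which regularization term to use, so the following is only an illustrative PyTorch sketch of wiring one common option, an L1 term between the generated frame and the target, into the generator update. All names (`generator`, `discriminator`, `ref`, `target`, `lambda_reg`) are placeholders rather than this repository's code.

```python
import torch
import torch.nn.functional as F

def generator_step(generator, discriminator, ref, target, optimizer_g, lambda_reg=10.0):
    optimizer_g.zero_grad()
    fake = generator(ref)

    # Non-saturating adversarial loss on the generated frame.
    logits = discriminator(fake)
    adv_loss = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))

    # One common regularizer: an L1 term that ties the output to the target
    # frame, penalizing the degenerate all-gray solution.
    reg_loss = F.l1_loss(fake, target)

    loss = adv_loss + lambda_reg * reg_loss
    loss.backward()
    optimizer_g.step()
    return loss.item()
```

The weight `lambda_reg` would need tuning per experiment, and this can be combined with the lower learning rate suggested above.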