
Training data for reference_video #45

Open
jamie212 opened this issue Oct 25, 2023 · 13 comments

@jamie212

Hello, your work is excellent! I want to train the 'reference_video' model, but the repo only describes how the data should be placed. May I ask what dataset you used, and where I can download it?

@SerialLain3170
Owner

Sorry for the late response. I prepared one episode each from 18 animation titles (about 540 minutes = 30 min × 18) taken from official channels on YouTube.

@jamie212
Author

jamie212 commented Nov 7, 2023

> Sorry for the late response. I prepared one episode each from 18 animation titles (about 540 minutes = 30 min × 18) taken from official channels on YouTube.

Thank you for your response. Do you mean that you downloaded 18 animation episodes, each 30 minutes long, from YouTube, and then converted them into frames to extract sketches? May I ask which official channels these were from? Additionally, you mentioned storing the data as distance field images; could you explain how these are obtained?

@SerialLain3170
Owner

> Thank you for your response. Do you mean that you downloaded 18 animation episodes, each 30 minutes long, from YouTube, and then converted them into frames to extract sketches?

Yes. They were originally from an official channel, but that channel seems to have stopped providing them. I think any 18 episodes are fine as long as they come from different animations, to ensure the robustness of the model.

> Additionally, you mentioned storing the data as distance field images; could you explain how these are obtained?

Please refer to #14
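
A distance field is typically obtained by applying a distance transform to the extracted line art. Below is a minimal sketch using SciPy that illustrates the general technique only; it is not necessarily the exact preprocessing code used in this repo (see #14 for the details there):

```python
import numpy as np
from PIL import Image
from scipy.ndimage import distance_transform_edt

def sketch_to_distance_field(sketch_path: str, threshold: int = 200) -> np.ndarray:
    """Turn a black-on-white line sketch into a normalized distance field.

    Every pixel stores its Euclidean distance to the nearest line pixel,
    normalized to [0, 1]. (Assumed preprocessing; see #14 for details.)
    """
    gray = np.array(Image.open(sketch_path).convert("L"))
    lines = gray < threshold                   # True where the sketch lines are
    dist = distance_transform_edt(~lines)      # distance to the nearest line pixel
    dist = dist / (dist.max() + 1e-8)          # normalize to [0, 1]
    return dist.astype(np.float32)
```

Each pixel then encodes how far it is from the nearest sketch line, which gives the network a smoother signal than the raw binary sketch.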

@jamie212
Author

jamie212 commented Nov 13, 2023

>> Thank you for your response. Do you mean that you downloaded 18 animation episodes, each 30 minutes long, from YouTube, and then converted them into frames to extract sketches?
>
> Yes. They were originally from an official channel, but that channel seems to have stopped providing them. I think any 18 episodes are fine as long as they come from different animations, to ensure the robustness of the model.
>
>> Additionally, you mentioned storing the data as distance field images; could you explain how these are obtained?
>
> Please refer to #14

I would like to confirm: did you use 18 videos, each 30 minutes long? Also, what fps did you set when converting them into frames? At 30 frames per second, a 30-minute video turns into 54,000 frames. Do you then put all 54,000 frames into anime_dir? Wouldn't that be too many? From the paper, it seems they only use very short videos.

@SerialLain3170
Owner

> I would like to confirm: did you use 18 videos, each 30 minutes long? Also, what fps did you set when converting them into frames? At 30 frames per second, a 30-minute video turns into 54,000 frames. Do you then put all 54,000 frames into anime_dir?

Yes to all of those questions. I used 30 fps.
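
For reference, extracting frames at a fixed fps can be done with ffmpeg; a minimal sketch follows, in which the paths and the frame-naming pattern are placeholders rather than anything this repo mandates:

```python
import subprocess
from pathlib import Path

def extract_frames(video_path: str, out_dir: str, fps: int = 30) -> None:
    """Dump a video to numbered PNG frames at a fixed fps using ffmpeg."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["ffmpeg", "-i", video_path, "-vf", f"fps={fps}",
         str(Path(out_dir) / "frame_%06d.png")],
        check=True,
    )
```

At 30 fps, a 30-minute episode does indeed come out to roughly 54,000 frames, as you estimated.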

> Wouldn't that be too many? From the paper, it seems they only use very short videos.

Good question. I wanted to use as much data as possible to improve generalizability when I ran the experiments, and I did not train the model with varying dataset sizes, so I do not have a solid answer to that question. But as you said, it may well be too many frames to train the model on.

@jamie212
Author

> Good question. I wanted to use as much data as possible to improve generalizability when I ran the experiments, and I did not train the model with varying dataset sizes, so I do not have a solid answer to that question. But as you said, it may well be too many frames to train the model on.

OK! Thank you for your response, I will give it a try.

@jamie212
Author

jamie212 commented Dec 6, 2023

>> I would like to confirm: did you use 18 videos, each 30 minutes long? Also, what fps did you set when converting them into frames? At 30 frames per second, a 30-minute video turns into 54,000 frames. Do you then put all 54,000 frames into anime_dir?
>
> Yes to all of those questions. I used 30 fps.
>
>> Wouldn't that be too many? From the paper, it seems they only use very short videos.
>
> Good question. I wanted to use as much data as possible to improve generalizability when I ran the experiments, and I did not train the model with varying dataset sizes, so I do not have a solid answer to that question. But as you said, it may well be too many frames to train the model on.

Hello, I have a few questions regarding training:

  1. Did you put folders from 18 different anime into the DATA_PATH? I'm asking because it seems from the paper that only data from one anime was used for training, and data from other anime were used for testing. I just want to confirm.
  2. Could you please explain what 'validsize' and 'anime_dir' in the param.yaml file are used for, and what should they be set to?
  3. In your code, is there a testing process included, or do I need to write my own code for inference?

@SerialLain3170
Owner

I am really sorry for the late response. First, I need to mention that I am not following the original paper rigorously; I only borrowed the ideas from the Method (Section 3) and the shot selection (the first part of Section 4.1). So I have not been careful about dataset selection.

  1. Yes. The anime_dir parameter in param.yaml sets which animations (directory names under DATA_PATH) are used for training. If you have 10 animations for training and 3 datasets for testing, you should list the 10 training animations in anime_dir (see the sketch below); for the 3 test datasets, you need to write your own inference code.
  2. validsize is the batch size used during validation. Sorry for the confusion; the name is very close to valid_size.
  3. As mentioned in 1, you need to write your own code for inference.
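
To make the relationship between DATA_PATH and anime_dir concrete, here is a hedged sketch; the directory names are hypothetical, and the exact keys should be checked against the repo's actual param.yaml:

```python
# Assumed layout (hypothetical names):
#
#   DATA_PATH/
#     anime_a/ ... anime_j/     # 10 titles listed in anime_dir -> used for training
#     test_x/ test_y/ test_z/   # held out -> only touched by your own inference code
#
# and the corresponding excerpt of param.yaml:
#
#   anime_dir: [anime_a, anime_b, anime_c]   # list every training title here
#   validsize: 4                             # batch size for the validation loop

import yaml  # PyYAML

with open("param.yaml") as f:
    param = yaml.safe_load(f)

train_titles = param["anime_dir"]  # directory names under DATA_PATH used for training
valid_batch = param["validsize"]   # batch size used during validation
print(train_titles, valid_batch)
```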

@jamie212
Author

> I am really sorry for the late response. First, I need to mention that I am not following the original paper rigorously; I only borrowed the ideas from the Method (Section 3) and the shot selection (the first part of Section 4.1). So I have not been careful about dataset selection.
>
> 1. Yes. The anime_dir parameter in param.yaml sets which animations (directory names under DATA_PATH) are used for training. If you have 10 animations for training and 3 datasets for testing, you should list the 10 training animations in anime_dir; for the 3 test datasets, you need to write your own inference code.
> 2. validsize is the batch size used during validation. Sorry for the confusion; the name is very close to valid_size.
> 3. As mentioned in 1, you need to write your own code for inference.

Thank you very much for your response. I have trained the model myself, and the CTN part seems fine (picture 1). However, the visualization images produced during TCN training come out all gray, and the testing code I wrote produces a similar result (picture 2). Do you have any idea what the problem might be?
[Screenshot 2023-12-19 4:05 PM]
[Screenshot 2023-12-19 4:03 PM]

@jamie212
Author

As I mentioned before, there seems to be an issue with the TCN. The visualized images saved during training, when processed through the TCN, come out all gray, whereas the CTN behaves normally. I would like to know whether I made a mistake in my setup or whether this behavior is expected.

@SerialLain3170
Owner

I am really sorry for the late response. I do not have a solid answer to your question. I did find that training the TCN was unstable, and that changing the hyperparameters (batch size and learning rate) led to stable behavior. Could you try increasing the batch size or decreasing the learning rate? If that does not work, I do not have any other ideas.

@jamie212
Author

jamie212 commented Jan 3, 2024

> I am really sorry for the late response. I do not have a solid answer to your question. I did find that training the TCN was unstable, and that changing the hyperparameters (batch size and learning rate) led to stable behavior. Could you try increasing the batch size or decreasing the learning rate? If that does not work, I do not have any other ideas.

Increasing my batch size causes CUDA out of memory :( I will try reducing the learning rate to see if it helps. Thank you for your suggestion. However, I want to confirm: when you say 'unstable', are you referring to the gray-image situation?
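
As a side note, if a larger batch does not fit in GPU memory, gradient accumulation can approximate one. The minimal PyTorch sketch below uses dummy stand-ins for the repo's actual TCN model, dataloader, and loss:

```python
import torch
from torch import nn

# Dummy stand-ins; replace with the actual TCN model, dataloader, and loss.
model = nn.Linear(16, 16)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.L1Loss()
loader = [(torch.randn(2, 16), torch.randn(2, 16)) for _ in range(8)]

accum_steps = 4  # effective batch = per-step batch (2) * accum_steps (4)

optimizer.zero_grad()
for step, (x, y) in enumerate(loader):
    loss = criterion(model(x), y) / accum_steps  # scale so the accumulated gradient averages
    loss.backward()
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```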

@SerialLain3170
Owner

> However, I want to confirm: when you say 'unstable', are you referring to the gray-image situation?

Yes. The gray images are likely a result of mode collapse: the trained generator has found an easy solution. You might need to add a regularization loss term to avoid the mode collapse; if decreasing the learning rate does not work, adding one might help.
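
One common form of such a regularization term in image-to-image GANs is an L1 reconstruction loss between the generated and ground-truth frames. Below is a minimal PyTorch-style sketch; the weight and tensor names are assumptions, not this repo's actual settings:

```python
import torch
from torch import nn
import torch.nn.functional as F

l1 = nn.L1Loss()
lambda_rec = 10.0  # assumed weight; tune together with learning rate / batch size

def generator_loss(fake_logits: torch.Tensor,
                   fake_frame: torch.Tensor,
                   real_frame: torch.Tensor) -> torch.Tensor:
    """Non-saturating adversarial loss plus an L1 reconstruction term.

    The reconstruction term pins the generator output to the target frame,
    which works against the all-gray mode-collapse output described above.
    """
    adv = F.softplus(-fake_logits).mean()   # -log sigmoid(D(G(x)))
    rec = l1(fake_frame, real_frame)
    return adv + lambda_rec * rec
```

Pinning the output to the target frame this way makes the all-gray "easy solution" much more costly for the generator.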
