Project for the course Vision and Perception 2022 edition, Sapienza University of Rome.
The goal of my project is to create realistic celebrity images using GANs. I used three GANs: DCGAN, Least Squares GAN and Relativistic GAN. To understand their behavior under different conditions, I trained each of them for 15 epochs with the batch size fixed at 128.
In addition, I used three different optimizers (SGD, Adam and AdaBound) with three different learning rates (0.01, 0.004, 0.0003).
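The experimental setup above can be written down as a small configuration grid. This is only a sketch: it assumes that every combination of GAN, optimizer and learning rate was tried, and the short labels and the `build_grid` helper are hypothetical names, not taken from the notebook.

```python
from itertools import product

# Values taken from the text above; the short GAN labels are hypothetical.
GANS = ["DCGAN", "LSGAN", "RGAN"]
OPTIMIZERS = ["SGD", "Adam", "AdaBound"]
LEARNING_RATES = [0.01, 0.004, 0.0003]
EPOCHS = 15
BATCH_SIZE = 128


def build_grid():
    """Return one config dict per (GAN, optimizer, learning rate) combination."""
    return [
        {"gan": gan, "optimizer": opt, "lr": lr,
         "epochs": EPOCHS, "batch_size": BATCH_SIZE}
        for gan, opt, lr in product(GANS, OPTIMIZERS, LEARNING_RATES)
    ]


grid = build_grid()
print(len(grid))  # 27
```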
Loading the dataset directly failed with the following error:
BadZipFile: File is not a zip file
I found another way to load the images:
- Copy the repository from the official site to my Drive;
- Reading the file from Drive every time would be very slow, so I create a directory in the current Colab session and unzip the file there;
- Load the images into the DataLoader using ImageFolder.
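The unzip step can be sketched with the standard library alone. The function name `extract_dataset` is hypothetical, and the check with `zipfile.is_zipfile` is an addition of this sketch that fails early with a clear message instead of raising `BadZipFile` halfway through.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, target_dir):
    """Unzip the archive into target_dir and return the number of extracted files."""
    zip_path = Path(zip_path)
    target_dir = Path(target_dir)

    # Fail early with a readable message if the file is not a real zip archive.
    if not zipfile.is_zipfile(zip_path):
        raise ValueError(f"{zip_path} is not a valid zip archive")

    # Create the session-local directory and extract everything into it.
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)

    return sum(1 for p in target_dir.rglob("*") if p.is_file())
```

In a Colab session this would be called with the path of the archive on the mounted Drive and a local directory, both of which depend on your own setup. The extracted directory can then be passed to `torchvision.datasets.ImageFolder`, which expects the images to sit in at least one subdirectory of the root it is given.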
The dataset contains 202,600 images.
The whole process takes around 1 minute.
Using a latent vector of size 128, the generator and the discriminator have 3,806,080 and 2,766,529 parameters respectively. The generator is built from ConvTranspose2d, BatchNorm2d and ReLU layers, with Tanh as the last layer. The discriminator is built from Conv2d, BatchNorm2d and LeakyReLU layers. BCELoss is used.
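As a sanity check, the two parameter counts can be reproduced by hand. The layout below is an assumption, not taken from the notebook: a DCGAN-style network with 64 base feature maps, 4x4 kernels, five layers per network, bias-free generator convolutions, biased discriminator convolutions and no BatchNorm after the first discriminator convolution. It is one layout that matches the reported numbers exactly.

```python
def conv_params(c_in, c_out, k=4, bias=False):
    """Weights (plus optional biases) of a Conv2d or ConvTranspose2d layer."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def bn_params(channels):
    """BatchNorm2d learns one scale and one shift per channel."""
    return 2 * channels


def generator_params(latent=128, base=64, img_channels=3):
    # Assumed channel progression: latent -> 512 -> 256 -> 128 -> 64 -> 3
    widths = [latent, base * 8, base * 4, base * 2, base]
    total = 0
    for c_in, c_out in zip(widths, widths[1:]):
        total += conv_params(c_in, c_out) + bn_params(c_out)
    # The last layer feeds Tanh directly, so it has no BatchNorm.
    return total + conv_params(base, img_channels)


def discriminator_params(base=64, img_channels=3):
    # Assumed channel progression: 3 -> 64 -> 128 -> 256 -> 512 -> 1
    widths = [img_channels, base, base * 2, base * 4, base * 8]
    total = 0
    for i, (c_in, c_out) in enumerate(zip(widths, widths[1:])):
        total += conv_params(c_in, c_out, bias=True)
        if i > 0:  # assumed: no BatchNorm after the first convolution
            total += bn_params(c_out)
    return total + conv_params(base * 8, 1, bias=True)


print(generator_params())      # 3806080
print(discriminator_params())  # 2766529
```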
The progression of the GAN with the Adam optimizer and a learning rate of 0.0003 is shown here.
Using the same generator architecture, I removed the last layer of the discriminator (the Sigmoid), because it may lead to the vanishing gradients problem during learning. Here I used MSELoss. The computation time is around 1 hour and 20 minutes.
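The least squares objective can be illustrated without any framework, so that the arithmetic stays visible. This is a sketch under assumptions: real images are labeled 1, fake images 0, and the two discriminator terms are simply summed; the function names are hypothetical. In PyTorch, `nn.MSELoss` with its default mean reduction computes the same mean squared error on the raw discriminator outputs.

```python
def mse(scores, target):
    """Mean squared error between raw discriminator scores and a constant label."""
    return sum((s - target) ** 2 for s in scores) / len(scores)


def lsgan_discriminator_loss(real_scores, fake_scores):
    # Push scores of real images towards 1 and scores of fake images towards 0.
    return mse(real_scores, 1.0) + mse(fake_scores, 0.0)


def lsgan_generator_loss(fake_scores):
    # The generator wants its fake images to be scored as real (label 1).
    return mse(fake_scores, 1.0)


print(lsgan_discriminator_loss([1.0, 1.0], [0.0, 0.0]))  # 0.0
print(lsgan_generator_loss([0.0]))                       # 1.0
print(lsgan_generator_loss([3.0]))                       # 4.0
```

The last line shows the property that distinguishes this loss from a sigmoid-based one: a score far from the target is penalized even when it lies on the "real" side, so the gradient does not saturate.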
The progression of the GAN with the Adam optimizer and a learning rate of 0.0003 is shown here.
Using the same generator architecture, I removed the last layer of the discriminator (the Sigmoid) and used BCEWithLogitsLoss instead, because this version is more numerically stable. In addition, a relativistic discriminator is used, which computes the probability that a given real image is more "realistic" than a randomly sampled fake image.
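The relativistic loss can be sketched in plain Python to show what is being computed. This assumes the standard (non-averaged) relativistic formulation, in which each real logit is paired with one fake logit; the function names are hypothetical. With a target of 1, the binary cross-entropy on a logit z is -log(sigmoid(z)) = softplus(-z), which is written here in a form that does not overflow for large inputs.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def bce_with_logits_target_one(logits):
    """Mean of -log(sigmoid(z)) over the batch, i.e. BCE on logits with target 1."""
    return sum(softplus(-z) for z in logits) / len(logits)


def rgan_discriminator_loss(real_logits, fake_logits):
    # The discriminator wants each real image to score higher than its paired fake.
    diffs = [r - f for r, f in zip(real_logits, fake_logits)]
    return bce_with_logits_target_one(diffs)


def rgan_generator_loss(real_logits, fake_logits):
    # The generator wants each fake image to score higher than its paired real one.
    diffs = [f - r for r, f in zip(real_logits, fake_logits)]
    return bce_with_logits_target_one(diffs)


# When real and fake images get the same score, both losses equal log(2).
print(abs(rgan_discriminator_loss([0.5], [0.5]) - math.log(2)) < 1e-12)  # True
```

Because only the difference between the two logits enters the loss, extreme scores such as 1000 are handled without overflow, which is the stability benefit of working on logits rather than on Sigmoid outputs.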
The progression of the GAN with the SGD optimizer and a learning rate of 0.004 is shown here.
Using the best result for each model, I continued training for 30 epochs, starting from the pre-trained weights obtained in the previous part. The comparison between a real batch of images and a generated one, using Adam as the optimizer, is shown here.
To run the .ipynb file, the most important step is to copy the official folder of the project to your Google Drive. To do this, go here, click on Align&Cropped Images and save the folder to your own Google Drive. After that, you just have to run the notebook and wait for the results.
For any doubts or clarifications, send me an email.
Further note: the above material is for a university project. More details can be found in the .ipynb file and the PowerPoint presentation.