add something in diffusion model #751
base: main
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB
@@ -0,0 +1,1745 @@ | |||
{ |
The dependencies need to be installed before the file can be used; see the other documentation.
@@ -0,0 +1,1745 @@ | |||
{ |
@@ -0,0 +1,2380 @@ | |||
{ |
@@ -1 +1,1010 @@ | |||
{"cells":[{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":28425,"status":"ok","timestamp":1705748195781,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"eQbVEk-ef2Sb","outputId":"a3eb8ee4-b21c-45b2-b172-cb4d192a2d1c","tags":["hide-cell"]},"outputs":[],"source":["# Install the necessary dependencies\n","\n","import os\n","import sys\n","!{sys.executable} -m pip install --quiet pandas scikit-learn numpy matplotlib jupyterlab_myst ipython tensorflow_addons opencv-python requests"]},{"cell_type":"markdown","metadata":{"id":"Gsposoggf2Sf","tags":["remove-cell"]},"source":["---\n","license:\n"," code: MIT\n"," content: CC-BY-4.0\n","github: https://github.com/ocademy-ai/machine-learning\n","venue: By Ocademy\n","open_access: true\n","bibliography:\n"," - https://raw.githubusercontent.com/ocademy-ai/machine-learning/main/open-machine-learning-jupyter-book/references.bib\n","---"]},{"cell_type":"markdown","metadata":{"id":"XbY9A-fGf2Sg"},"source":["# Diffusion Model\n","\n","## Background\n","\n","Before we learn the diffusion model, we need some background knowledge in statistics. You may already know these topics well; let's review them together.\n","\n","### Expectation\n","\n","#### Definition\n","\n","In probability theory, expectation (also called expected value, mean, average) is a generalization of the weighted average.\n","\n","$E[X] = x_1 p_1 + x_2 p_2 + \\dots + x_n p_n = \\sum_{i=1}^n x_i p_i$\n","\n","where $x_i$ and $p_i$ are the $i$-th possible outcome and its probability, respectively.\n","\n","#### Properties\n","\n","- $E[aX]=aE[X]$ where $a$ is a constant value.\n","- $E[X+b]=E[X]+b$ where $b$ is a constant value.\n","- $E[X+Y]=E[X]+E[Y]$.\n","\n","### Variance\n","\n","Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from their average value.\n","\n","#### Definition\n","\n","The variance of a random variable $X$ is the expected value of the squared deviation from the expectation of $X$: $Var[X] = E[(X - E[X])^2]$.\n","\n","#### Properties\n","\n","- $Var[X]=E[X^2]-E[X]^2$.\n","- $Var[aX]=a^2 Var[X]$ where $a$ is a constant value.\n","- $Var[X+b]=Var[X]$ where $b$ is a constant value.\n","- $Var[X+Y]=Var[X]+Var[Y]$ when $X$ and $Y$ are independent.\n","\n","### Re-parameterization trick\n","\n","When we sample from a probability distribution, we cannot backpropagate the gradient through the sampling step because it is stochastic. The re-parameterization trick makes the step trainable by moving the randomness into a separate noise variable.\n","\n","Let us assume that $z$ is sampled from a Gaussian distribution whose mean is $\\mu$ and whose variance is $\\sigma^2$. Then, the mean and variance of $z$ would be $\\mu$ and $\\sigma^2$.
Therefore, $z$ can be written as follows.\n","\n","$z \\sim N(\\mu, \\sigma^2 I), \\quad z = \\mu + \\sigma \\odot \\epsilon$, where $\\epsilon \\sim N(0, I)$\n","\n","Here $\\odot$ denotes the element-wise product.\n","\n","The mean and variance of $z$ correspond to $\\mu$ and $\\sigma^2$, respectively.\n","\n","#### Mean (Expectation)\n","\n","$E[z] = E[\\mu+\\sigma \\odot \\epsilon] = E[\\mu] + \\sigma \\odot E[\\epsilon] = \\mu$\n","\n","The expectation of $\\epsilon$ is 0 by definition.\n","\n","#### Variance\n","\n","$Var[z] = Var[\\mu + \\sigma \\odot \\epsilon] = Var[\\sigma \\odot \\epsilon] = \\sigma^2 Var[\\epsilon] = \\sigma^2$\n","\n","The variance of $\\epsilon$ is 1 by definition.\n","\n","### KL Divergence\n","\n","In mathematical statistics, the Kullback–Leibler divergence (relative entropy) is a type of statistical distance.\n","\n","#### Definition\n","\n","1. Discrete probability distribution\n","\n","$D_{KL}(P||Q) = \\sum_{x \\in X} P(x) \\log \\frac{P(x)}{Q(x)}$, where $P$ and $Q$ are discrete probability distributions.\n","\n","2.
Continuous probability distribution\n","\n","$D_{KL}(P||Q) = \\int_{-\\infty}^{\\infty} p(x) \\log \\frac{p(x)}{q(x)} dx$, where $p$ and $q$ denote the probability densities of $P$ and $Q$.\n","\n","#### Jensen's inequality\n","\n","In mathematics, Jensen's inequality relates the value of a convex (or concave) function of an integral to the integral of the function.\n","\n","- Convex function: $f(E[X]) \\leq E[f(X)]$;\n","- Concave function: $f(E[X]) \\geq E[f(X)]$.\n","\n","#### Properties of KL Divergence\n","\n","- KL Divergence is always non-negative;\n","- The cross-entropy $H(P, Q)$ is never smaller than the entropy $H(P)$;\n","- For two univariate normal distributions $P = N(\\mu_p, \\sigma_p^2)$ and $Q = N(\\mu_q, \\sigma_q^2)$, it simplifies to $D_{KL}(P||Q) = \\log \\frac{\\sigma_q}{\\sigma_p} + \\frac{\\sigma^2_p + (\\mu_p - \\mu_q)^2}{2\\sigma^2_q} - \\frac{1}{2}$.\n","\n","### Evidence lower bound (ELBO)\n","\n","In variational Bayesian methods, the evidence lower bound (often abbreviated ELBO) is a useful lower bound on the log-likelihood of some observed data.\n","\n","#### Definition\n","\n","$ELBO := E_{z \\sim q_{\\phi}}[\\log \\frac{p_{\\theta}(x,z)}{q_{\\phi}(z)}]$\n","\n","where $p_{\\theta}(x, z)$ is the joint distribution of $x$ and $z$. $\\theta$ and $\\phi$ are parameters.\n","\n","ELBO is used to obtain the lower bound of the evidence (or log evidence). The evidence here is the log-likelihood of $x$ evaluated at a fixed $\\theta$.\n","\n","$evidence := \\log p_{\\theta}(x)$\n","\n","#### Properties\n","\n","- The evidence is never smaller than the ELBO;\n","- The KL Divergence $D_{KL}(q_{\\phi}(z) || p_{\\theta}(z|x))$ equals $evidence - ELBO$.\n","\n","### Forward and Reverse process\n","\n","Diffusion models turn data into Gaussian noise (a latent vector) and then restore the data from it. The former is called the forward process, and the latter is called the reverse process.\n","\n","In the forward process, we add Gaussian noise to the data step by step (usually hundreds of steps).
The transform of an individual step is defined as follows.\n","\n","$q(x_t|x_{t-1}) = N(x_t; \\sqrt{1-\\beta_t}x_{t-1}, \\beta_t I)$\n","\n","In the reverse process, we restore an image from Gaussian noise (a latent vector). If $\\beta_t$ is small enough, the reverse $q(x_{t-1}|x_t)$ will also be Gaussian. It is noteworthy that the reverse conditional is tractable when it is additionally conditioned on $x_0$.\n"]},{"cell_type":"markdown","metadata":{"id":"o8O6Ekanf2Sh"},"source":["<img src=\"https://static-1300131294.cos.ap-shanghai.myqcloud.com/images/deep-learning/diffusion-model/01_structure.png\" width=\"90%\" class=\"bg-white mb-1\">"]},{"cell_type":"markdown","metadata":{"id":"YdRposdFf2Si"},"source":["### Noise schedule\n","\n","In diffusion models, the noise schedule defines how noise is added to an image step by step and how a sample is updated from the model outputs. Two schedules are introduced here: the linear schedule and the cosine schedule. They were introduced by [Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239) and [Improved Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2102.09672), respectively.\n","\n","#### Linear schedule\n","\n","In the linear schedule, $\\beta_t$ increases linearly with $t$.\n","\n","\n","<img src=\"https://static-1300131294.cos.ap-shanghai.myqcloud.com/images/deep-learning/diffusion-model/02_linear.png\" width=\"90%\" class=\"bg-white mb-1\">\n","\n","Illustration of linear schedule\n","\n","\n","#### Cosine schedule\n","\n","Alex Nichol and Prafulla Dhariwal proposed the cosine schedule to prevent an image from turning into noise too quickly. They construct a different noise schedule in terms of $\\overline{\\alpha_t} = \\prod_{i=1}^{t} (1 - \\beta_i)$.\n","\n","$\\overline{\\alpha_t} = \\frac{f(t)}{f(0)}$, $f(t) = \\cos(\\frac{t/T + s}{1+s} \\cdot \\frac{\\pi}{2})^2$, where $T$ is the total number of steps and $s$ is a small offset.
By definition, the $\\beta_t$ equals $1 - \\frac{\\overline{\\alpha_t}}{\\overline{\\alpha_{t-1}}}$.\n","\n","\n","<img src=\"https://static-1300131294.cos.ap-shanghai.myqcloud.com/images/deep-learning/diffusion-model/03_cosine.png\" width=\"90%\" class=\"bg-white mb-1\">\n","\n","Illustration of cosine schedule"]},{"cell_type":"markdown","metadata":{"id":"cMJAEw_Cf2Si"},"source":["## Code\n","\n","### Import Libraries"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":4369,"status":"ok","timestamp":1705748200144,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"-cz-yI6Kf2Sj","outputId":"42527c42-5156-45a6-aac3-a9a96e312de5"},"outputs":[],"source":["import tensorflow as tf\n","import numpy as np\n","import cv2\n","import time\n","import requests\n","import zipfile\n","import io\n","import matplotlib.pyplot as plt\n","import matplotlib.animation as animation\n","from tensorflow.keras.models import Model\n","from tensorflow.keras.layers import Layer\n","from tensorflow.keras.layers import (Reshape, Conv2DTranspose, Add, Conv2D, MaxPool2D, Dense,\n"," Flatten, Input, BatchNormalization, Input, MultiHeadAttention)\n","from tensorflow.keras.optimizers import Adam\n","import tensorflow_addons as tfa"]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":7,"status":"ok","timestamp":1705748200144,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"oICkvh0lf2Sj"},"outputs":[],"source":["BATCH_SIZE = 32\n","TIME_STEPS = 1000\n","IM_SHAPE = (32,32,3)\n","N_HEADS = 8\n","ATTN_DIM = 256\n","N_GROUPS = 8\n","N_RESNETS = 2\n","LEARNING_RATE = 2e-4\n","EPOCHS = 10\n","FACTOR = 2"]},{"cell_type":"markdown","metadata":{"id":"Ku9fyewjf2Sk"},"source":["### Data Loading\n","The dataset can be downloaded from 
[here](https://www.kaggle.com/datasets/jessicali9530/celeba-dataset)."]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":27435,"status":"ok","timestamp":1705748227573,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"umD3e18Wf2Sk"},"outputs":[],"source":["url = 'https://static-1300131294.cos.ap-shanghai.myqcloud.com/data/deep-learning/diffusion-model/archive.zip'\n","\n","r = requests.get(url)\n","with zipfile.ZipFile(io.BytesIO(r.content), 'r') as zip_ref:\n"," zip_ref.extractall('./')\n"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":3787,"status":"ok","timestamp":1705748231352,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"Gcql7dBhf2Sk","outputId":"02e1a71a-1bdd-4902-c0ca-1ebd0504034b"},"outputs":[],"source":["ds_train = tf.keras.preprocessing.image_dataset_from_directory(\n"," \"t/celebA\", label_mode=None, image_size=(IM_SHAPE[0], IM_SHAPE[1]), batch_size=BATCH_SIZE)"]},{"cell_type":"markdown","metadata":{"id":"ksGR5RNuf2Sl"},"source":["### Data Preprocessing"]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":7,"status":"ok","timestamp":1705748231352,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"s34mJ7TAf2Sl"},"outputs":[],"source":["def preprocess(image):\n"," return tf.cast(image, tf.float32) / 127.5 - 1.0"]},{"cell_type":"markdown","metadata":{"id":"IYY2Me94f2Sl"},"source":["### Data Augmentation"]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":7,"status":"ok","timestamp":1705748231353,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"6p2RrotNf2Sl"},"outputs":[],"source":["def augmentation(image):\n"," return 
tf.image.random_flip_left_right(image)"]},{"cell_type":"markdown","metadata":{"id":"t2Fr-EWdf2Sl"},"source":["### Data"]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":6,"status":"ok","timestamp":1705748231353,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"ucEBmDYIf2Sm"},"outputs":[],"source":["train_dataset = (\n"," ds_train\n"," .map(preprocess)\n"," .map(augmentation)\n"," .unbatch()\n"," .shuffle(buffer_size = 1024, reshuffle_each_iteration = True)\n"," .batch(BATCH_SIZE,drop_remainder=True)\n"," .prefetch(tf.data.AUTOTUNE))"]},{"cell_type":"markdown","metadata":{"id":"h4L-JszOf2Sm"},"source":["### Linear schedule-beta"]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":6,"status":"ok","timestamp":1705748231353,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"ihHFAYjaf2Sm"},"outputs":[],"source":["def linear_beta_schedule(timesteps):\n"," beta_start = 0.0001\n"," beta_end = 0.02\n"," return tf.linspace(beta_start, beta_end, timesteps)"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":1389,"status":"ok","timestamp":1705748232736,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"tfHPiKjNf2Sm","outputId":"939f22c6-e544-4596-c7c6-3c85f8b5a52d"},"outputs":[],"source":["betas = linear_beta_schedule(TIME_STEPS)\n","print(betas)"]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":5,"status":"ok","timestamp":1705748232736,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"pUlMpYl-f2Sm"},"outputs":[],"source":["alphas = 1. - betas\n","alphas_cumprod = tf.math.cumprod(alphas, axis=0)\n","sqrt_alphas_cumprod = tf.math.sqrt(alphas_cumprod)\n","sqrt_one_minus_alphas_cumprod = tf.math.sqrt(1. 
- alphas_cumprod)"]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":5,"status":"ok","timestamp":1705748232736,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"TyngTgNmf2Sm"},"outputs":[],"source":["def extract(a, t, x_shape):\n","    ### Gather a[t] for every sample and reshape to (b, 1, ..., 1) so that it broadcasts over x\n","    b, *_ = t.shape\n","    out = tf.gather(a,t)\n","    output = tf.reshape(out, (b,*((1,) * (len(x_shape) - 1))))\n","    return output\n","\n","def q_sample(x_start, t, noise):\n","    ### Closed-form forward process: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise\n","    sqrt_alphas_cumprod_t = extract(sqrt_alphas_cumprod, t, x_start.shape)\n","    sqrt_one_minus_alphas_cumprod_t = extract(sqrt_one_minus_alphas_cumprod, t, x_start.shape)\n","\n","    out_sample = sqrt_alphas_cumprod_t * x_start + sqrt_one_minus_alphas_cumprod_t * noise  ### x_start = x_0, noise = z\n","    return out_sample"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":6908,"status":"ok","timestamp":1705748239640,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"c3AWw7Zgf2Sm","outputId":"9d6e9996-be5e-472b-a2ec-0b04de960e7e"},"outputs":[],"source":["class PositionalEmbeddings(tf.keras.layers.Layer):\n","\n","    def __init__(self, dim):\n","        super().__init__()\n","        self.embedding_dim = dim\n","\n","    def get_timestep_embedding(self, timesteps, embedding_dim: int):\n","        \"\"\"\n","        From Fairseq.\n","        Build sinusoidal embeddings.\n","        This matches the implementation in tensor2tensor, but differs slightly\n","        from the description in Section 3.5 of \"Attention Is All You Need\".\n","        \"\"\"\n","        half_dim = self.embedding_dim // 2\n","        emb = tf.math.log(10000.) / (half_dim - 1)\n","        emb = tf.exp(tf.range(half_dim, dtype=tf.float32) * -emb)\n","        emb = tf.cast(timesteps, dtype = tf.float32)[:, None] * emb[None, :]\n","        emb = tf.concat([tf.sin(emb), tf.cos(emb)], axis=1)\n","        if embedding_dim % 2 == 1:\n","            emb = tf.pad(emb, [[0, 0], [0, 1]])\n","        return emb\n","\n","    def call(self, time):\n","        return self.get_timestep_embedding(time, self.embedding_dim)\n","\n","def res_block(x,filters,n_groups,temb):\n","    previous = x\n","    x = Conv2D(filters, 3, padding=\"same\",)(x)  ### Convolution layer with padding same, so that the resolution remains the same\n","\n","    ### temb represents the time embedding.\n","    ### It is passed into the silu activation function and a Dense layer (which can change the embedding dimension).\n","    ### We also reshape the time embedding to match the output of the 2D convolutions.\n","    x += Dense(filters)(tf.nn.silu(temb))[:,None,None,:]\n","\n","    ### Group Normalization is used.\n","    x = tf.nn.silu(tfa.layers.GroupNormalization(n_groups, axis = -1)(x))\n","    x = Conv2D(filters, 3, padding=\"same\",)(x)\n","\n","    # Project residual\n","    residual = Conv2D(filters, 1,padding=\"same\",)(previous)\n","    x = tf.keras.layers.add([x, residual]) # Add back residual\n","    return x\n","\n","def get_model(im_shape=(64,64,3),n_resnets=2,n_groups=8,attn_dim=32,n_heads=4,):\n","    input_1 = Input(shape=im_shape)  ### image input\n","    input_2 = Input(shape=())  ### time input\n","    t_dim = im_shape[0]*16\n","\n","    # Entry block\n","    x = Conv2D(32, 3, padding=\"same\")(input_1)\n","    temb = PositionalEmbeddings(t_dim)(input_2)  ### Create embeddings from the time input_2\n","    temb = Dense(t_dim)(tf.nn.gelu(Dense(t_dim)(temb)))  ### Dense layer, gelu activation, then another Dense layer\n","\n","    hs = [x]  ### stores the output of each resolution level in the downward path, to be concatenated to the inputs of the upward path\n","\n","    ### Downward Path\n","    for filters in [32, 64, 128, 256]:  ### the filter depths (32,64,128,256) correspond to resolutions (32,16,8,4)\n","        for _ in range(n_resnets):  ### we go through each resnet block per resolution level\n","            x = res_block(x,filters,n_groups,temb)  ### resblock\n","        ### If the resolution is 16 (coinciding with a depth of 64), we make the resnet output features attend to each other.\n","        ### Note how the attention axes = (1,2). This corresponds to the height and width dimensions.\n","        ### Feel free to check the documentation: https://www.tensorflow.org/api_docs/python/tf/keras/layers/MultiHeadAttention.\n","        ### query = key = value = x.\n","        ### We again use Group Normalization.\n","        if filters == 64:\n","            x = tfa.layers.GroupNormalization(groups=n_groups, axis = -1)(\n","                MultiHeadAttention(num_heads=n_heads, key_dim=attn_dim, attention_axes=(1,2), )(query = x, value = x))\n","        hs.append(x)  ### append the output features of this resolution level to hs\n","        x = tf.keras.layers.MaxPooling2D(3, strides=2, padding=\"same\")(x)  ### Downsampling in order to move to the next resolution level\n","\n","\n","    ### Bottleneck\n","    x = res_block(x,256,n_groups,temb)\n","    x = tfa.layers.GroupNormalization(groups=n_groups, axis = -1)(\n","        MultiHeadAttention(num_heads=n_heads, key_dim=attn_dim, attention_axes=(1,2), )(query = x, value = x))\n","    x = res_block(x,256,n_groups,temb)\n","\n","\n","    ### Upward path\n","    for filters in [256, 128, 64,32]:\n","        ### we resize x to match the shape of the feature outputs (hs) from the downward path\n","        x = tf.image.resize_with_pad(x,hs[-1].shape[1],hs[-1].shape[2])\n","        x = tf.concat([x,hs.pop()], axis=-1)\n","\n","        for _ in range(n_resnets):\n","            x = res_block(x,filters,n_groups,temb)\n","\n","        if filters == 64:\n","            x = tfa.layers.GroupNormalization(groups=n_groups, axis = -1)(\n","                MultiHeadAttention(num_heads=n_heads, key_dim=attn_dim, attention_axes=(1,2), )(query = x, value = x))\n","\n","        if filters != 32:\n","            x = Conv2DTranspose(filters, 3, strides = (2,2),)(x)  ### Upsampling\n","\n","    x = res_block(x,32,n_groups,temb)\n","    outputs = Conv2D(3, 3, padding=\"same\", )(x)\n","\n","    # Define the model\n","    model = Model([input_1,input_2], outputs,name='unet')\n","    return model\n","\n","model= get_model(im_shape=IM_SHAPE,n_resnets=N_RESNETS,n_groups=N_GROUPS,attn_dim=ATTN_DIM,n_heads=N_HEADS,)\n","model.summary()"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":1711161,"status":"ok","timestamp":1705750563313,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"oejrIVvof2Sn","outputId":"184d9610-39d4-4571-d112-71af8afb5962"},"outputs":[],"source":["class LRSchedule(tf.keras.optimizers.schedules.LearningRateSchedule):\n","\n","    def __init__(self, init_lr):\n","        self.init_lr = init_lr\n","\n","    def __call__(self, step):\n","        ### inverse decay: the learning rate is halved after 100000 steps\n","        return self.init_lr*(100000/(step+100000))\n","\n","OPTIMIZER = Adam(learning_rate=LRSchedule(1e-4))\n","\n","def custom_loss(denoise_model, x_start, t, noise=None):\n","    ### Our custom loss function takes in the predicted noise and compares it (using the Huber loss) with the actual noise\n","    ### Huber loss with the default delta of 1.0. Check out the documentation: https://www.tensorflow.org/api_docs/python/tf/keras/losses/Huber\n","    h = tf.keras.losses.Huber()\n","    if noise is None:\n","        noise = tf.random.normal(x_start.shape,mean=0,stddev=1)  ### noise=epsilon=z\n","    x_noisy = q_sample(x_start,t,noise)  ### x_t using the q_sample method\n","    predicted_noise = denoise_model([x_noisy, t])  ### model takes in x_t and t and outputs the noise\n","    return h(noise,predicted_noise)\n","\n","### custom training block\n","### You can use tf.function to make graphs out of your programs. It is a transformation tool that creates Python-independent dataflow graphs\n","### out of your Python code.
This will help you create performant and portable models.\n","@tf.function\n","def training_block(x_batch):\n","    with tf.GradientTape() as recorder:\n","        ### for every element in the batch, we generate t randomly\n","        t = tf.random.uniform((BATCH_SIZE,),minval=0,maxval=TIME_STEPS,dtype=tf.int32)\n","        loss = custom_loss(model,x_batch,t)\n","\n","    partial_derivatives = recorder.gradient(loss, model.trainable_weights)\n","    OPTIMIZER.apply_gradients(zip(partial_derivatives, model.trainable_weights))  ### gradient descent\n","    return loss\n","\n","def neuralearn(EPOCHS):\n","    for epoch in range(EPOCHS):\n","        init_time = time.time()\n","        losses = []\n","        for step, x_batch in enumerate(train_dataset):\n","            loss = training_block(x_batch)\n","            losses.append(loss)\n","\n","        print(str(epoch+1)+\"/\"+str(EPOCHS)+\": Training Loss----->\", sum(losses)/len(losses))\n","        print('Time Elapsed: ---> '+str(time.time()-init_time)+' s')\n","\n","    print(\"Training Complete!!!!\")\n","\n","neuralearn(EPOCHS)"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":1000},"executionInfo":{"elapsed":288500,"status":"ok","timestamp":1705750851807,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"BS6NjWn3f2Sn","outputId":"07e309c0-1e9e-4291-f1c3-0b45d51f0e4d"},"outputs":[],"source":["sqrt_recip_alphas = tf.math.sqrt(1.0 / alphas)\n","alphas_cumprod_prev = tf.concat([tf.ones((1,)),alphas_cumprod[:-1]],axis = 0)  ### alpha_t-1_bar = alphas_cumprod_prev\n","\n","posterior_variance = betas * (1. - alphas_cumprod_prev) / (1.
- alphas_cumprod)\n","\n","def p_sample(model, x, t, t_index):\n","\n","    betas_t = extract(betas, t, x.shape)  ### betas_t = 1-alphas_t\n","    sqrt_one_minus_alphas_cumprod_t = extract(sqrt_one_minus_alphas_cumprod, t, x.shape)  ### square root of 1-alpha_t_bar\n","    sqrt_recip_alphas_t = extract(sqrt_recip_alphas, t, x.shape)  ### 1/square root of alpha_t\n","\n","    ### mean of x_{t-1}, as in the sampling step (Algorithm 2) of the DDPM paper\n","    model_mean = sqrt_recip_alphas_t * (x - betas_t * model([x, t]) / sqrt_one_minus_alphas_cumprod_t)\n","\n","    if t_index == 0:\n","        return model_mean  ### no noise is added at the final step\n","    else:\n","        posterior_variance_t = extract(posterior_variance, t, x.shape)  ### sigma_t squared\n","        noise = tf.random.normal(x.shape)\n","        return model_mean + tf.math.sqrt(posterior_variance_t) * noise\n","\n","imgs = []\n","img = tf.random.normal((64,IM_SHAPE[0],IM_SHAPE[1],IM_SHAPE[2]))\n","for i in reversed(range(0, TIME_STEPS)):  ### we go backwards from t = TIME_STEPS - 1 to t = 0\n","    print(i)\n","    img = p_sample(model,img,tf.fill((1,),i,), i)\n","    imgs.append(img)\n","\n","plt.figure(figsize = (16,16))\n","\n","for i in range(32):\n","    ax = plt.subplot(4,8, i+1)\n","    ### the model works in [-1, 1] (see preprocess), so map back to [0, 1] for display\n","    plt.imshow(np.clip((np.array(imgs[-1])[i] + 1.0) / 2.0, 0.0, 1.0))\n","    plt.axis(\"off\")"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":1000},"executionInfo":{"elapsed":178410,"status":"ok","timestamp":1705751030229,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"bZ7YU-LNf2Sn","outputId":"3637cd6d-26e0-4d3d-8536-d420ace5b550"},"outputs":[],"source":["random_index = 0\n","\n","fig = plt.figure()\n","ims = []\n","for i in range(TIME_STEPS):\n","    frame = np.clip((np.array(imgs[i])[random_index] + 1.0) / 2.0, 0.0, 1.0)  ### map [-1, 1] to [0, 1]\n","    im = plt.imshow(frame, animated=True)\n","    ims.append([im])\n","    ims.append([im])\n","\n","animate = animation.ArtistAnimation(fig, ims, interval=5, blit=True, repeat_delay=1000)\n","animate.save('diffusion.gif')\n","plt.show()"]},{"cell_type":"markdown","metadata":{"id":"UJabZ9MEf2Sn"},"source":["## Your turn!
🚀\n","\n","Assignment - [Denoising diffusion model](../assignments/deep-learning/difussion-model/denoising-difussion-model.ipynb)\n","\n","## Acknowledgments\n","\n","Thanks to [Kwangnam Yu](https://github.com/phykn) for creating the open-source project [diffusion_models_tutorial](https://github.com/phykn/diffusion_models_tutorial) and [Kaggle](https://www.kaggle.com/) for creating the open-source course [Denoising Diffusion Models with TensorFlow](https://www.kaggle.com/code/folefac/denoising-diffusion-models-with-tensorflow#3.-The-Celeb-A-Dataset-%F0%9F%92%BE). They inspired most of the content in this chapter."]}],"metadata":{"accelerator":"GPU","colab":{"gpuType":"T4","provenance":[]},"kernelspec":{"display_name":"Python 3","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.9.18"}},"nbformat":4,"nbformat_minor":0} |
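The notebook's Cosine schedule section gives the formula for $\overline{\alpha_t}$, but its code cells implement only `linear_beta_schedule`. A minimal sketch of the cosine variant follows, in plain Python so it can be checked without TensorFlow. The names `cosine_alpha_bar` and `cosine_beta_schedule` are hypothetical (they are not in the notebook), and the defaults `s=0.008` and `max_beta=0.999` are assumptions taken from the Improved DDPM paper rather than from this PR.

```python
import math


def cosine_alpha_bar(t, T, s=0.008):
    """alpha_bar_t = f(t) / f(0), with f(t) = cos(((t/T + s) / (1 + s)) * pi/2) ** 2."""
    def f(u):
        return math.cos(((u / T + s) / (1 + s)) * math.pi / 2) ** 2
    return f(t) / f(0)


def cosine_beta_schedule(T, s=0.008, max_beta=0.999):
    """Return [beta_1, ..., beta_T] with beta_t = 1 - alpha_bar_t / alpha_bar_{t-1}."""
    betas = []
    for t in range(1, T + 1):
        beta = 1.0 - cosine_alpha_bar(t, T, s) / cosine_alpha_bar(t - 1, T, s)
        # Clip so the last steps do not reach beta = 1 (assumed cap, see lead-in).
        betas.append(min(beta, max_beta))
    return betas


# Same number of steps as TIME_STEPS in the notebook.
betas = cosine_beta_schedule(1000)
```

To try it in the notebook, the list could presumably be wrapped in `tf.constant(betas, dtype=tf.float32)` and used in place of `linear_beta_schedule(TIME_STEPS)`; that swap has not been run against this PR.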
Too many files that are not part of the change are being pushed. Pull the latest branch, then upload again with only the files you changed.