add something in diffusion model #751

anyiyou11 · 2024-04-21T07:15:13Z

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How has this been tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce them. Please also list any relevant details for your test configuration.

Build the book locally
Review the artifacts from the local build
Build the slides locally, if applicable
Run unit tests, if applicable

Checklist:

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have applied tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
Any dependent changes have been merged and published in downstream modules

review-notebook-app · 2024-04-21T07:15:18Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

Lola-jo · 2024-04-27T05:01:56Z

open-machine-learning-jupyter-book/assignments/prerequisites/python-programming-advanced.ipynb

@@ -0,0 +1,1745 @@
+{


The dependencies need to be installed before the file can be used; see the other documentation for how this is done.


Lola-jo · 2024-04-27T05:01:56Z

open-machine-learning-jupyter-book/assignments/prerequisites/python-programming-advanced.ipynb

@@ -0,0 +1,1745 @@
+{


Refer to the other documents for how the acknowledgements are written; do not thank GPT.


Lola-jo · 2024-04-27T05:01:56Z

open-machine-learning-jupyter-book/assignments/prerequisites/python-programming-basics.ipynb

@@ -0,0 +1,2380 @@
+{


Same


Lola-jo · 2024-04-27T05:01:56Z

open-machine-learning-jupyter-book/deep-learning/difussion-model.ipynb

@@ -1 +1,1010 @@
-{"cells":[{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":28425,"status":"ok","timestamp":1705748195781,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"eQbVEk-ef2Sb","outputId":"a3eb8ee4-b21c-45b2-b172-cb4d192a2d1c","tags":["hide-cell"]},"outputs":[],"source":["# Install the necessary dependencies\n","\n","import os\n","import sys\n","!{sys.executable} -m pip install --quiet pandas scikit-learn numpy matplotlib jupyterlab_myst ipython tensorflow_addons opencv-python requests"]},{"cell_type":"markdown","metadata":{"id":"Gsposoggf2Sf","tags":["remove-cell"]},"source":["---\n","license:\n","    code: MIT\n","    content: CC-BY-4.0\n","github: https://github.com/ocademy-ai/machine-learning\n","venue: By Ocademy\n","open_access: true\n","bibliography:\n","  - https://raw.githubusercontent.com/ocademy-ai/machine-learning/main/open-machine-learning-jupyter-book/references.bib\n","---"]},{"cell_type":"markdown","metadata":{"id":"XbY9A-fGf2Sg"},"source":["# Diffusion Model\n","\n","## Background\n","\n","Before we learn the diffusion model, we have to know some background knowledge in statistics. Perhaps you already have a good mastery of them, let's review them together.\n","\n","### Expectation\n","\n","#### Definition\n","\n","In probability theory, expectation (also called expected value, mean, average) is a generalization of the weighted average.\n","\n","$E[X] = x_1 p_1 + x_2 p2 + ... 
+x_n p_n = \\sum_{i=1}^n x_i p_1$\n","\n","where $x_i$ and $p_i$ are i-th a possible outcome and its probability, respectively.\n","\n","#### Properties\n","\n","- $E[aX]=aE[X]$ where a is a constant value.\n","- $E[X+b]=E[X]+b$ where b is a constant value.\n","- $E[X+Y]=E[X]+E[Y]$.\n","\n","### Variance\n","\n","Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from their average value.\n","\n","#### Definition\n","\n","The variance of a random variable $X$ is the expected value of the squared deviation from the expectation of $X$.\n","\n","#### Properties\n","\n","- $Var[X]=E[X^2]−E[X]^2$.\n","- $Var[aX]=a^2 Var[X]$ where a is a constant value.\n","- $Var[X+b]=Var[X]$ where b is a constant value.\n","- $Var[X+Y]=Var[X]+Var[Y]$.\n","\n","### Re-parameterization trick\n","\n","When we sample data from a probability distribution, backpropagation the gradient is not possible because it is a stochastic process. To make it trainable, the re-parameterization trick is useful.\n","\n","Let us assume that $z$ is sampled data from a gaussian distribution which the mean is $\\mu$ and the variance is $\\sigma^2$. Then, the mean and variance of $z$ would be $\\mu$ and $\\sigma^2$. 
Therefore, $z$ can be written as follows.\n","\n","$z = N(\\mu, \\sigma^2 I) = \\mu + \\sigma \\odot \\epsilon$, where $\\epsilon \\thicksim N(0, I)$\n","\n","where $\\odot$ refers to element-wise product.\n","\n","The mean and variance of $z$ correspond to $\\mu$ and $\\sigma^2$, respectively.\n","\n","#### Mean (Expectation)\n","\n","$E[z] = E[\\mu+\\sigma \\odot \\epsilon] = E[\\mu] + \\sigma E[\\epsilon] = \\mu$\n","\n","The expectation of $\\epsilon$ is 0 by definition.\n","\n","#### Variance\n","\n","$Var[z] = Var[\\mu + \\sigma \\odot \\epsilon] = Var[\\sigma \\odot \\epsilon] = \\sigma^2 Var[\\epsilon] = \\sigma^2$\n","\n","The variance of \\epsilon is 1 by definition.\n","\n","### KL Divergence\n","\n","In mathematical statistics, the Kullback–Leibler divergence (relative entropy), is a type of statistical distance.\n","\n","#### Definition\n","\n","1. Discrete probability distribution\n","\n","$D_{KL}(P||Q) = \\sum_{x \\in X} P(x) log \\frac{P(x)}{Q(x)}$,where $P$ and $Q$ are discrete probability distributions.\n","\n","2. 
Continuous probability distribution\n","\n","$D_{KL} (P||Q) = \\int _{− \\infty}^{\\infty} p(x) log \\frac{p(x)}{q(x)}dx$, where $p$ and $q$ denote the probability densities of $P$ and $Q$.\n","\n","#### Jensen's inequality\n","\n","In mathematics, Jensen's inequality relates the value of a convex (or concave) function of an integral to the integral of the function.\n","\n","- Convex function: $f(E[X]) \\leqq E[f(X)]$;\n","- Concave function: $f(E[X]) \\geqq E[f(X)]$.\n","\n","#### Properties of KL Divergence\n","\n","- KL Divergence is always non-negative;\n","- The cross-entropy is always larger than the entropy;\n","- Two univariate normal distributions $P$ and $Q$ are simplified to $D_{KL}(P||Q) = log \\frac{\\sigma_q}{\\sigma_p} + \\frac{\\sigma^2_p + (\\mu_p − \\mu_q)^2}{2\\sigma^2_q} − \\frac{1}{2}$.\n","\n","### Evidence lower bound (ELBO)\n","\n","In variational Bayesian methods, the evidence lower bound (often abbreviated ELBO) is a useful lower bound on the log-likelihood of some observed data.\n","\n","#### Definition\n","\n","$ELBO := E_{z∼q_{\\phi}}[log \\frac{pθ(x,z)}{q_{\\phi}(z)}]$\n","\n","where $p_{\\theta}(x, z)$ is joint distribution of $x$ and $z$. $\\theta$ and $\\phi$ are parameters.\n","\n","ELBO is used to obtain the lower bound of the evidence (or log evidence). The evidence is the likelihood function evaluated at a fixed $\\theta$.\n","\n","$evidence := log p_{\\theta}(x)$\n","\n","#### Properties\n","\n","- The evidence is always larger than ELBO;\n","- KL Divergence between $p_{\\theta}(z|x)$ and $q_{\\phi}(z)$ equals $evidence−ELBO$.\n","\n","### Forward and Reverse process\n","\n","The diffusion models make data into a gaussian noise (latent vector) and restore it again. The former is called the forward process, and the latter is called the reverse process.\n","\n","In the forward process, we add a gaussian noise to the data step by step (usually hundreds of steps). 
The transform of an individual step is defined as follows.\n","\n","$x_t = q(x_t|x_{t−1}) = N(x_t, \\sqrt{1−\\beta_t}x_{t−1}, \\beta_t I)$\n","\n","In the reverse process, we restore image from a gaussian noise (a latent vector). If $\\beta_t$ is small enough, the reverse $q(x_{t-1}|x_t)$ will also be gaussian. It is noteworthy that the reverse process is tractable when conditioned on $x_0$.\n"]},{"cell_type":"markdown","metadata":{"id":"o8O6Ekanf2Sh"},"source":["<img src=\"https://static-1300131294.cos.ap-shanghai.myqcloud.com/images/deep-learning/diffusion-model/01_structure.png\" width=\"90%\" class=\"bg-white mb-1\">"]},{"cell_type":"markdown","metadata":{"id":"YdRposdFf2Si"},"source":["### Noise schedule\n","\n","In diffusion models, the noise schedule define the methodology for iteratively adding noise to an image or for updating a sample based on model outputs. I'll introduce two type of schedules which are linear schedule and cosine schedule. The linear and cosine schedule were introduced by [Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239) and [Improved Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239), respectively.\n","\n","#### Linear schedule\n","\n","In the linear schedule, $\\beta_t$ changes linearly.\n","\n","\n","<img src=\"https://static-1300131294.cos.ap-shanghai.myqcloud.com/images/deep-learning/diffusion-model/02_linear.png\" width=\"90%\" class=\"bg-white mb-1\">\n","\n","Illustration of linear schedule\n","\n","\n","#### Cosine schedule\n","\n","Alex Nichol and Prafulla Dhariwal proposed the cosine schedule to prevent an image from turning into noise too quickly. They construct a different noise schedule in terms of $\\overline{\\alpha_t}$.\n","\n","$\\overline{\\alpha_t} = \\frac{f(t)}{f(0)}$, $f(t) = cos(\\frac{\\frac{t}{T} + s}{1+s} \\frac{\\pi}{2})^2$. 
By definition, the $\\beta_t$ equals $1 - \\frac{\\overline{\\alpha_t}}{\\overline{\\alpha_{t-1}}}$.\n","\n","\n","<img src=\"https://static-1300131294.cos.ap-shanghai.myqcloud.com/images/deep-learning/diffusion-model/03_cosine.png\" width=\"90%\" class=\"bg-white mb-1\">\n","\n","Illustration of cosine schedule"]},{"cell_type":"markdown","metadata":{"id":"cMJAEw_Cf2Si"},"source":["## Code\n","\n","### Import Libraries"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":4369,"status":"ok","timestamp":1705748200144,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"-cz-yI6Kf2Sj","outputId":"42527c42-5156-45a6-aac3-a9a96e312de5"},"outputs":[],"source":["import tensorflow as tf\n","import numpy as np\n","import cv2\n","import time\n","import requests\n","import zipfile\n","import io\n","import matplotlib.pyplot as plt\n","import matplotlib.animation as animation\n","from tensorflow.keras.models import Model\n","from tensorflow.keras.layers import Layer\n","from tensorflow.keras.layers import (Reshape, Conv2DTranspose, Add, Conv2D, MaxPool2D, Dense,\n","                                     Flatten, Input, BatchNormalization, Input, MultiHeadAttention)\n","from tensorflow.keras.optimizers import Adam\n","import tensorflow_addons as tfa"]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":7,"status":"ok","timestamp":1705748200144,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"oICkvh0lf2Sj"},"outputs":[],"source":["BATCH_SIZE = 32\n","TIME_STEPS = 1000\n","IM_SHAPE = (32,32,3)\n","N_HEADS = 8\n","ATTN_DIM = 256\n","N_GROUPS = 8\n","N_RESNETS = 2\n","LEARNING_RATE = 2e-4\n","EPOCHS = 10\n","FACTOR = 2"]},{"cell_type":"markdown","metadata":{"id":"Ku9fyewjf2Sk"},"source":["### Data Loading\n","The dataset can be downloaded from 
[here](https://www.kaggle.com/datasets/jessicali9530/celeba-dataset)."]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":27435,"status":"ok","timestamp":1705748227573,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"umD3e18Wf2Sk"},"outputs":[],"source":["url = 'https://static-1300131294.cos.ap-shanghai.myqcloud.com/data/deep-learning/diffusion-model/archive.zip'\n","\n","r = requests.get(url)\n","with zipfile.ZipFile(io.BytesIO(r.content), 'r') as zip_ref:\n","    zip_ref.extractall('./')\n"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":3787,"status":"ok","timestamp":1705748231352,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"Gcql7dBhf2Sk","outputId":"02e1a71a-1bdd-4902-c0ca-1ebd0504034b"},"outputs":[],"source":["ds_train = tf.keras.preprocessing.image_dataset_from_directory(\n","    \"t/celebA\", label_mode=None, image_size=(IM_SHAPE[0], IM_SHAPE[1]), batch_size=BATCH_SIZE)"]},{"cell_type":"markdown","metadata":{"id":"ksGR5RNuf2Sl"},"source":["### Data Preprocessing"]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":7,"status":"ok","timestamp":1705748231352,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"s34mJ7TAf2Sl"},"outputs":[],"source":["def preprocess(image):\n","    return tf.cast(image, tf.float32) / 127.5 - 1.0"]},{"cell_type":"markdown","metadata":{"id":"IYY2Me94f2Sl"},"source":["### Data Augmentation"]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":7,"status":"ok","timestamp":1705748231353,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"6p2RrotNf2Sl"},"outputs":[],"source":["def augmentation(image):\n","    return 
tf.image.random_flip_left_right(image)"]},{"cell_type":"markdown","metadata":{"id":"t2Fr-EWdf2Sl"},"source":["### Data"]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":6,"status":"ok","timestamp":1705748231353,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"ucEBmDYIf2Sm"},"outputs":[],"source":["train_dataset = (\n","    ds_train\n","    .map(preprocess)\n","    .map(augmentation)\n","    .unbatch()\n","    .shuffle(buffer_size = 1024, reshuffle_each_iteration = True)\n","    .batch(BATCH_SIZE,drop_remainder=True)\n","    .prefetch(tf.data.AUTOTUNE))"]},{"cell_type":"markdown","metadata":{"id":"h4L-JszOf2Sm"},"source":["### Linear schedule-beta"]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":6,"status":"ok","timestamp":1705748231353,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"ihHFAYjaf2Sm"},"outputs":[],"source":["def linear_beta_schedule(timesteps):\n","    beta_start = 0.0001\n","    beta_end = 0.02\n","    return tf.linspace(beta_start, beta_end, timesteps)"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":1389,"status":"ok","timestamp":1705748232736,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"tfHPiKjNf2Sm","outputId":"939f22c6-e544-4596-c7c6-3c85f8b5a52d"},"outputs":[],"source":["betas = linear_beta_schedule(TIME_STEPS)\n","print(betas)"]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":5,"status":"ok","timestamp":1705748232736,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"pUlMpYl-f2Sm"},"outputs":[],"source":["alphas = 1. - betas\n","alphas_cumprod = tf.math.cumprod(alphas, axis=0)\n","sqrt_alphas_cumprod = tf.math.sqrt(alphas_cumprod)\n","sqrt_one_minus_alphas_cumprod = tf.math.sqrt(1. 
- alphas_cumprod)"]},{"cell_type":"code","execution_count":null,"metadata":{"executionInfo":{"elapsed":5,"status":"ok","timestamp":1705748232736,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"TyngTgNmf2Sm"},"outputs":[],"source":["def extract(a, t, x_shape):\n","    b, *_ = t.shape\n","    out = tf.gather(a,t)\n","    output = tf.reshape(out, (b,*((1,) * (len(x_shape) - 1))))\n","    return output\n","\n","def q_sample(x_start, t, noise):\n","\n","    sqrt_alphas_cumprod_t = extract(sqrt_alphas_cumprod, t, x_start.shape)\n","    sqrt_one_minus_alphas_cumprod_t = extract(sqrt_one_minus_alphas_cumprod, t, x_start.shape)\n","\n","    out_sample = sqrt_alphas_cumprod_t * x_start + sqrt_one_minus_alphas_cumprod_t * noise### x_start=x_0, noise = z\n","    return out_sample"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":6908,"status":"ok","timestamp":1705748239640,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"c3AWw7Zgf2Sm","outputId":"9d6e9996-be5e-472b-a2ec-0b04de960e7e"},"outputs":[],"source":["class PositionalEmbeddings(tf.keras.layers.Layer):\n","\n","    def __init__(self, dim):\n","        super().__init__()\n","        self.embedding_dim = dim\n","\n","    def get_timestep_embedding(self, timesteps, embedding_dim: int):\n","        \"\"\"\n","        From Fairseq.\n","        Build sinusoidal embeddings.\n","        This matches the implementation in tensor2tensor, but differs slightly\n","        from the description in Section 3.5 of \"Attention Is All You Need\".\n","        \"\"\"\n","        half_dim = self.embedding_dim // 2\n","        emb = tf.math.log(10000.) 
/ (half_dim - 1)\n","        emb = tf.exp(tf.range(half_dim, dtype=tf.float32) * -emb)\n","        emb = tf.cast(timesteps, dtype = tf.float32)[:, None] * emb[None, :]\n","        emb = tf.concat([tf.sin(emb), tf.cos(emb)], axis=1)\n","        if embedding_dim % 2 == 1:\n","            emb = tf.pad(emb, [[0, 0], [0, 1]])\n","        return emb\n","\n","    def call(self, time):\n","        return self.get_timestep_embedding(time, self.embedding_dim)\n","def res_block(x,filters,n_groups,temb):\n","    previous = x\n","    x = Conv2D(filters, 3, padding=\"same\",)(x) ### Convolution layer with padding same, so that the resolution remains the same\n","\n","    ### temb represents the time embedding.\n","    ### It is passed into the silu activation function and a Dense Layer(Which can change the the embedding dimension )\n","    ### We also reshape the time embedding to match the output of 2d convnets.\n","    x += Dense(filters)(tf.nn.silu(temb))[:,None,None,:]\n","\n","    ### Group Normalization is used.\n","    x = tf.nn.silu(tfa.layers.GroupNormalization(n_groups, axis = -1)(x))\n","    x = Conv2D(filters, 3, padding=\"same\",)(x)\n","\n","    # Project residual\n","    residual = Conv2D(filters, 1,padding=\"same\",)(previous)\n","    x = tf.keras.layers.add([x, residual])  # Add back residual\n","    return x\n","\n","def get_model(im_shape=(64,64,3),n_resnets=2,n_groups=8,attn_dim=32,n_heads=4,):\n","    input_1 = Input(shape=im_shape)### image input\n","    input_2 = Input(shape=())### time input\n","    t_dim = im_shape[0]*16\n","\n","    # Entry block\n","    x = Conv2D(32, 3, padding=\"same\")(input_1)\n","    temb = PositionalEmbeddings(t_dim)(input_2)### Create embeddings from the time input_2\n","    temb = Dense(t_dim)(tf.nn.gelu(Dense(t_dim)(temb)))### pass the embedding into the gelu activation function\n","\n","    hs = [x]### variable used for storing each resolution level output, in the downward path, to be concatenated to the inputs of the upward 
path.\n","\n","    ### Downward Path\n","    for filters in [32, 64, 128, 256]:### for every resolution level (32,64,128,256), represent the depth they map to resolutions of (32,16,8,4)\n","        for _ in range(n_resnets):### we go through each resnet block per resolution level\n","            x = res_block(x,filters,n_groups,temb)### resblock\n","            ### if the resolution=16 (coinciding with a depth=64), we make the resnet output features attend to each other.\n","            ### Note how the attention axes = (1,2). This corresponds to the height and width dimensions.\n","            ### Feel free to Check the documentation out :) https://www.tensorflow.org/api_docs/python/tf/keras/layers/MultiHeadAttention.\n","            ### query = key = value = x.\n","            ### We again use Group Normalization.\n","            if filters == 64:\n","                x = tfa.layers.GroupNormalization(groups=n_groups, axis = -1)(\n","                    MultiHeadAttention(num_heads=n_heads, key_dim=attn_dim, attention_axes=(1,2), )(query = x, value = x))\n","        hs.append(x)### append the output features to hs\n","        x = tf.keras.layers.MaxPooling2D(3, strides=2, padding=\"same\")(x)### Downsampling in order to move to the next resolution level\n","\n","\n","    ### Bottleneck\n","    x = res_block(x,256,n_groups,temb)\n","    x = tfa.layers.GroupNormalization(groups=n_groups, axis = -1)(\n","      MultiHeadAttention(num_heads=n_heads, key_dim=attn_dim, attention_axes=(1,2), )(query = x, value = x))\n","    x = res_block(x,256,n_groups,temb)\n","\n","\n","    ### Upward path\n","    for filters in [256, 128, 64,32]:\n","        ### we resize x, to match with the shape of feature outputs (hs) in the downward path\n","        x = tf.image.resize_with_pad(x,hs[-1].shape[1],hs[-1].shape[2])\n","        x = tf.concat([x,hs.pop()], axis=-1)\n","\n","        for _ in range(n_resnets):\n","            x = res_block(x,filters,n_groups,temb)\n","\n","            if 
filters == 64:\n","                x = tfa.layers.GroupNormalization(groups=n_groups, axis = -1)(\n","                  MultiHeadAttention(num_heads=n_heads, key_dim=attn_dim, attention_axes=(1,2), )(query = x, value = x))\n","\n","        if filters != 32:\n","            x = Conv2DTranspose(filters, 3, strides = (2,2),)(x)### Upsampling\n","\n","    x = res_block(x,32,n_groups,temb)\n","    outputs = Conv2D(3, 3, padding=\"same\", )(x)\n","\n","    # Define the model\n","    model = Model([input_1,input_2], outputs,name='unet')\n","    return model\n","\n","model= get_model(im_shape=IM_SHAPE,n_resnets=N_RESNETS,n_groups=N_GROUPS,attn_dim=ATTN_DIM,n_heads=N_HEADS,)\n","model.summary()"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":1711161,"status":"ok","timestamp":1705750563313,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"oejrIVvof2Sn","outputId":"184d9610-39d4-4571-d112-71af8afb5962"},"outputs":[],"source":["class LRSchedule(tf.keras.optimizers.schedules.LearningRateSchedule):\n","\n","    def __init__(self, init_lr):\n","        self.init_lr = init_lr\n","\n","    def __call__(self, step):\n","        return self.init_lr*(100000/(step+100000))\n","\n","OPTIMIZER = Adam(learning_rate=LRSchedule(1e-4))\n","\n","def custom_loss(denoise_model, x_start, t, noise=None):\n","    ### Our custom loss function takes in the predicted noise and compares (using the Huber Loss) it with the actual noise\n","    ### Huber Loss with a default value for delta as 1.0 Check out the documentation: https://www.tensorflow.org/api_docs/python/tf/keras/losses/Huber\n","    h = tf.keras.losses.Huber()\n","    noise = tf.random.normal(x_start.shape,mean=0,stddev=1)### noise=epsilon=z\n","    x_noisy = q_sample(x_start,t,noise)### x_t using the q_sample method\n","    predicted_noise = denoise_model([x_noisy, t])### model takes in the x_t and t and outputs 
noise\n","    return h(noise,predicted_noise)\n","\n","### custom training block\n","### You can use tf.function to make graphs out of your programs. It is a transformation tool that creates Python-independent dataflow graphs\n","### out of your Python code. This will help you create performant and portable models.\n","@tf.function\n","def training_block(x_batch):\n","    with tf.GradientTape() as recorder:\n","        ### for every element in the batch, we generate t randomly\n","        t = tf.random.uniform((BATCH_SIZE,),minval=0,maxval=TIME_STEPS,dtype=tf.int32)\n","        loss = custom_loss(model,x_batch,t)\n","\n","    partial_derivatives = recorder.gradient(loss, model.trainable_weights)\n","    OPTIMIZER.apply_gradients(zip(partial_derivatives, model.trainable_weights))### gradient descent\n","    return loss\n","\n","def neuralearn(EPOCHS):\n","    for epoch in range(EPOCHS):\n","        init_time = time.time()\n","        losses = []\n","        for step, x_batch in enumerate(train_dataset):\n","            loss = training_block(x_batch)\n","            losses.append(loss)\n","\n","        print(str(epoch+1)+\"/\"+str(EPOCHS)+\": Training Loss----->\", sum(losses)/len(losses))\n","        print('Time Elapsed: ---> '+str(time.time()-init_time)+' s')\n","\n","    print(\"Training Complete!!!!\")\n","\n","neuralearn(EPOCHS)"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":1000},"executionInfo":{"elapsed":288500,"status":"ok","timestamp":1705750851807,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"BS6NjWn3f2Sn","outputId":"07e309c0-1e9e-4291-f1c3-0b45d51f0e4d"},"outputs":[],"source":["sqrt_recip_alphas = tf.math.sqrt(1.0 / alphas)\n","alphas_cumprod_prev = tf.concat([tf.ones((1,)),alphas_cumprod[:-1]],axis = 0)### alpha_t-1_bar = alphas_cumprod_prev\n","\n","posterior_variance = betas * (1. - alphas_cumprod_prev) / (1. 
- alphas_cumprod)\n","\n","def p_sample(model, x, t, t_index):\n","\n","    betas_t = extract(betas, t, x.shape)### betas_t = 1-alphas_t\n","    sqrt_one_minus_alphas_cumprod_t = extract(sqrt_one_minus_alphas_cumprod, t, x.shape)### square root of 1-alpha_t_bar\n","    sqrt_recip_alphas_t = extract(sqrt_recip_alphas, t, x.shape)### 1/square root of alpha_t\n","\n","    model_mean = sqrt_recip_alphas_t * (x - betas_t * model([x, t]) / sqrt_one_minus_alphas_cumprod_t)### equation 4 of algorithm 2 above\n","\n","    if t_index == 0:\n","        return model_mean\n","    else:\n","        posterior_variance_t = extract(posterior_variance, t, x.shape)### sigma_t\n","        noise = tf.random.normal(x.shape)\n","        return model_mean + tf.math.sqrt(posterior_variance_t) * noise\n","\n","imgs = []\n","img = tf.random.normal((64,IM_SHAPE[0],IM_SHAPE[1],IM_SHAPE[2]))\n","for i in reversed(range(0, TIME_STEPS)):### we go backwards from t = 1000 to t = 0\n","    print(i)\n","    img = p_sample(model,img,tf.fill((1,),i,), i)\n","    imgs.append(img)\n","\n","plt.figure(figsize = (16,16))\n","\n","for i in range(32):\n","    ax = plt.subplot(4,8, i+1)\n","    plt.imshow(np.array(imgs[999])[i])\n","    plt.axis(\"off\")"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":1000},"executionInfo":{"elapsed":178410,"status":"ok","timestamp":1705751030229,"user":{"displayName":"qiongying fu","userId":"01979569471380266441"},"user_tz":-480},"id":"bZ7YU-LNf2Sn","outputId":"3637cd6d-26e0-4d3d-8536-d420ace5b550"},"outputs":[],"source":["random_index = 0\n","\n","fig = plt.figure()\n","ims = []\n","for i in range(TIME_STEPS):\n","    im = plt.imshow(np.array(imgs[i])[random_index], animated=True)\n","    ims.append([im])\n","    ims.append([im])\n","\n","animate = animation.ArtistAnimation(fig, ims, interval=5, blit=True, 
repeat_delay=1000)\n","animate.save('diffusion.gif')\n","plt.show()"]},{"cell_type":"markdown","metadata":{"id":"UJabZ9MEf2Sn"},"source":["## Your turn! 🚀\n","\n","Assignment - [Denoising difussion model](../assignments/deep-learning/difussion-model/denoising-difussion-model.ipynb)\n","\n","## Acknowledgments\n","\n","Thanks to [Kwangnam Yu](https://github.com/phykn) for creating the open-source project [diffusion_models_tutorial](https://github.com/phykn/diffusion_models_tutorial) and [kaggle](https://www.kaggle.com/) for creating the open-source course [Denoising Diffusion Models with TensorFlow](https://www.kaggle.com/code/folefac/denoising-diffusion-models-with-tensorflow#3.-The-Celeb-A-Dataset-%F0%9F%92%BE). They inspire the majority of the content in this chapter."]}],"metadata":{"accelerator":"GPU","colab":{"gpuType":"T4","provenance":[]},"kernelspec":{"display_name":"Python 3","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.9.18"}},"nbformat":4,"nbformat_minor":0}


Take the output push


Lola-jo · 2024-04-27T05:03:28Z

Too many files that are not part of the changes are being pushed. Make sure you pull the latest branch, then push again with only your changes.

Lola-jo reviewed Apr 27, 2024


anyiyou11 force-pushed the dm1 branch from 6ea9338 to d55411e Compare May 5, 2024 16:31

update dm1

47148b1

anyiyou11 force-pushed the dm1 branch from d55411e to 47148b1 Compare May 5, 2024 16:35
