Commit

Merge branch 'master' of github.com:farleylai/notes

Farley Lai committed Sep 15, 2022
2 parents 6bcf9c9 + 61eb407 commit 541bc9a
Showing 2 changed files with 29 additions and 196 deletions.
19 changes: 11 additions & 8 deletions _notebooks/2022-07-29-generative.ipynb
@@ -4,8 +4,7 @@
"metadata": {
"colab": {
"name": "2022-07-29-generative.ipynb",
"provenance": [],
"authorship_tag": "ABX9TyMW/NI7AfluzKqKSZtGsNEI"
"provenance": []
},
"kernelspec": {
"name": "python3",
@@ -19,7 +18,7 @@
{
"cell_type": "markdown",
"source": [
"# Generative AI with Diffusion Models\n",
"# Generative Diffusion Models for Image Synthesis\n",
"> Implication of memory mapped storage for super large data access.\n",
"\n",
"- hide: true\n",
@@ -35,10 +34,14 @@
{
"cell_type": "markdown",
"source": [
"TL; DR\t\n",
"- SLURM monitors the total resident memory (RSS) consumed by all the task processes (incl. dataloader workers)\t\n",
"- `pin_memory=True` increases RSS significantly and may cause leaks with mmap based LMDB, pushing to the memory limit sooner\n",
"- PyTorch `FastDataLoader` or `DataLoader` created with `persistent_workers=True` is going to accumulate RSS with workers that never reset MMAP based storage such as `LMDB` env across epochs"
"TL; DR\n",
"\n",
"This post compares and highlights the progress in recent generative text-image diffusion models as follows:\n",
"\n",
"- Diffusion Models Beat GANs (OpenAI)\n",
"- GLIDE (OpenAI)\n",
"- DALLE·2 (OpenAI)\n",
"- Imagen (Google Brain)"
],
"metadata": {
"id": "xbz6IUa-1749"
@@ -243,4 +246,4 @@
}
}
]
}
}
206 changes: 18 additions & 188 deletions _notebooks/2022-07-30-transformers.ipynb
@@ -4,8 +4,7 @@
"metadata": {
"colab": {
"name": "2022-07-30-transformers.ipynb",
"provenance": [],
"authorship_tag": "ABX9TyMW/NI7AfluzKqKSZtGsNEI"
"provenance": []
},
"kernelspec": {
"name": "python3",
@@ -26,7 +25,7 @@
"- toc: false\n",
"- badges: true\n",
"- comments: true\n",
"- categories: [blog, generative, diffusion, deep learning]"
"- categories: [blog, transformer, computer vision, deep learning, multimodal]"
],
"metadata": {
"id": "xBiIWit9fTB0"
@@ -35,10 +34,21 @@
{
"cell_type": "markdown",
"source": [
"TL; DR\t\n",
"- SLURM monitors the total resident memory (RSS) consumed by all the task processes (incl. dataloader workers)\t\n",
"- `pin_memory=True` increases RSS significantly and may cause leaks with mmap based LMDB, pushing to the memory limit sooner\n",
"- PyTorch `FastDataLoader` or `DataLoader` created with `persistent_workers=True` is going to accumulate RSS with workers that never reset MMAP based storage such as `LMDB` env across epochs"
"TL; DR\n",
"\n",
"\n",
"This post goes through transformer based architectures in various novel applications.\n",
"\n",
"- Transformer for End-to-End Object Detection - 🔗 Zhu et al. (2021)\n",
"- Transformer for 3D Object Detection - 🔗 Bhattacharyya et al. (2021)\n",
"- Transformer for Multi-Object Tracking - 🔗 Sun et al. (2020)\n",
"- Transformer for Lane Shape Prediction - 🔗 Liu et al. (2020)\n",
"- Transformer for Vision-Language Modeling - 🔗 Zhang et al. (2021)\n",
"- Transformer for Image Synthesis - 🔗 Esser et al. (2020)\n",
"- Transformer for Music Generation - 🔗 Hsiao et al. (2021)\n",
"- Transformer for Dance Generation with Music - 🔗 Huang et al. (2021)\n",
"- Transformer for Point-Cloud Processing - 🔗 Guo et al. (2020)\n",
"- Transformer for Time-Series Forecasting - 🔗 Lim et al. (2020)"
],
"metadata": {
"id": "xbz6IUa-1749"
@@ -61,186 +71,6 @@
"metadata": {
"id": "nVWrtfItgu6z"
}
},
{
"cell_type": "code",
"source": [
"import os\n",
"import sys\n",
"import random\n",
"import pytest\n",
"import torch\n",
"\n",
"from torch.utils.data import DataLoader\n",
"from time import time\n",
"from tqdm import tqdm\n",
"\n",
"print()\n",
"\n",
"KB = 2**10\n",
"MB = 2**10 * KB\n",
"GB = 2**10 * MB\n",
"\n",
"def rss_usage(breakdown=False):\n",
" import psutil\n",
" proc = psutil.Process(os.getpid())\n",
" RSS = []\n",
" RSS.append((os.getpid(), proc.memory_info().rss))\n",
" for child in proc.children(recursive=True):\n",
" RSS.append((child.pid, child.memory_info().rss))\n",
" \n",
" rss = sum(mem for _, mem in RSS)\n",
" return (rss, RSS) if breakdown else rss\n",
"\n",
"def test_rss():\n",
" print(sys.argv)\n",
" argv = sys.argv\n",
" sys.argv = [argv[1]]\n",
" import lmdb\n",
" from utils.utils import FastDataLoader\n",
" from dataset.lmdb_dataset import UCF101LMDB_2CLIP\n",
" from main_nce import parse_args, get_transform\n",
" args = parse_args()\n",
" sys.argv = argv\n",
" lmdb_root = \"/mnt/ssd/dataset/ucf101/lmdb\"\n",
" lmdb_path = f\"{lmdb_root}/UCF101/ucf101_frame.lmdb\"\n",
" trans = get_transform('train', args)\n",
" ucf101 = UCF101LMDB_2CLIP(db_path=lmdb_path, mode='train', transform=trans, num_frames=32, ds=1, return_label=True)\n",
" print(f\"Created UCF101 2clip dataset of size {len(ucf101)}\")\n",
"\n",
" dataloader = FastDataLoader(ucf101, \n",
" batch_size=32, shuffle=True,\n",
" num_workers=4, persistent_workers=False, \n",
" pin_memory=not True, sampler=None, drop_last=True)\n",
" batches = 8\n",
" for epoch in range(3):\n",
" rss = rss_usage()\n",
" print(f\"[e{epoch:02d}] RSS: {rss / GB:.2f} GB\")\n",
" for idx, (input_seq, label) in tqdm(enumerate(dataloader), total=len(dataloader), disable=True):\n",
" if idx % 4 == 0:\n",
" rss, RSS = rss_usage(True)\n",
" for pid, mem in RSS:\n",
" print(f\"[e{epoch:02d}][b{idx:02d}][{pid}] consumes {mem / GB:.2f} GB\")\n",
" print(f\"[e{epoch:02d}][b{idx:02d}] RSS: {rss / GB:.2f} GB\")\n",
" if idx == batches:\n",
" break"
],
"metadata": {
"id": "DZvUrWjofNHQ"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"[e00] RSS: 2.56 GB\n",
"[e00][b00][14023] consumes 1.08 GB\n",
"[e00][b00][14055] consumes 0.49 GB\n",
"[e00][b00][14071] consumes 0.81 GB\n",
"[e00][b00][14087] consumes 0.83 GB\n",
"[e00][b00][14103] consumes 0.84 GB\n",
"[e00][b00] RSS: 4.07 GB\n",
"[e00][b04][14023] consumes 1.08 GB\n",
"[e00][b04][14055] consumes 0.78 GB\n",
"[e00][b04][14071] consumes 0.90 GB\n",
"[e00][b04][14087] consumes 0.64 GB\n",
"[e00][b04][14103] consumes 0.80 GB\n",
"[e00][b04] RSS: 4.20 GB\n",
"[e00][b08][14023] consumes 1.08 GB\n",
"[e00][b08][14055] consumes 0.97 GB\n",
"[e00][b08][14071] consumes 1.00 GB\n",
"[e00][b08][14087] consumes 0.66 GB\n",
"[e00][b08][14103] consumes 1.24 GB\n",
"[e00][b08] RSS: 4.95 GB\n",
"[e01] RSS: 4.97 GB\n",
"[e01][b00][14023] consumes 1.08 GB\n",
"[e01][b00][14055] consumes 0.66 GB\n",
"[e01][b00][14071] consumes 0.92 GB\n",
"[e01][b00][14087] consumes 0.66 GB\n",
"[e01][b00][14103] consumes 0.80 GB\n",
"[e01][b00] RSS: 4.12 GB\n",
"[e01][b04][14023] consumes 1.08 GB\n",
"[e01][b04][14055] consumes 0.80 GB\n",
"[e01][b04][14071] consumes 0.99 GB\n",
"[e01][b04][14087] consumes 1.04 GB\n",
"[e01][b04][14103] consumes 1.03 GB\n",
"[e01][b04] RSS: 4.93 GB\n",
"[e01][b08][14023] consumes 1.08 GB\n",
"[e01][b08][14055] consumes 0.87 GB\n",
"[e01][b08][14071] consumes 1.05 GB\n",
"[e01][b08][14087] consumes 1.19 GB\n",
"[e01][b08][14103] consumes 1.07 GB\n",
"[e01][b08] RSS: 5.26 GB\n",
"[e02] RSS: 5.29 GB\n",
"[e02][b00][14023] consumes 1.08 GB\n",
"[e02][b00][14055] consumes 0.85 GB\n",
"[e02][b00][14071] consumes 1.06 GB\n",
"[e02][b00][14087] consumes 1.09 GB\n",
"[e02][b00][14103] consumes 1.09 GB\n",
"[e02][b00] RSS: 5.17 GB\n",
"[e02][b04][14023] consumes 1.08 GB\n",
"[e02][b04][14055] consumes 0.92 GB\n",
"[e02][b04][14071] consumes 1.12 GB\n",
"[e02][b04][14087] consumes 0.86 GB\n",
"[e02][b04][14103] consumes 1.14 GB\n",
"[e02][b04] RSS: 5.12 GB\n",
"[e02][b08][14023] consumes 1.08 GB\n",
"[e02][b08][14055] consumes 0.97 GB\n",
"[e02][b08][14071] consumes 1.19 GB\n",
"[e02][b08][14087] consumes 0.93 GB\n",
"[e02][b08][14103] consumes 1.23 GB\n",
"[e02][b08] RSS: 5.39 GB"
],
"metadata": {
"id": "EM7WJme3qSgF"
},
"execution_count": null,
"outputs": []
},
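The log above comes from the repo-specific `test_rss`. As a minimal, self-contained sketch of the same measurement pattern, assuming only `torch` and `psutil`, with a hypothetical `FakeClips` dataset standing in for the LMDB-backed one:

```python
# Hypothetical, self-contained repro of the RSS measurement above.
# FakeClips is a stand-in for the repo's UCF101LMDB_2CLIP dataset.
import os

import psutil
import torch
from torch.utils.data import DataLoader, Dataset

GB = 2**30

class FakeClips(Dataset):
    """Returns random clip tensors shaped like short video batches."""
    def __len__(self):
        return 1024

    def __getitem__(self, index):
        return torch.randn(3, 32, 128, 128), 0  # (clip, label)

def rss_usage():
    # Sum RSS over the main process and all children (dataloader workers).
    proc = psutil.Process(os.getpid())
    return proc.memory_info().rss + sum(
        c.memory_info().rss for c in proc.children(recursive=True))

if __name__ == "__main__":
    loader = DataLoader(FakeClips(), batch_size=32, num_workers=4,
                        persistent_workers=True,  # toggle to compare growth
                        pin_memory=True)
    for epoch in range(3):
        for idx, (clips, labels) in enumerate(loader):
            if idx == 8:
                break
        print(f"[e{epoch:02d}] RSS: {rss_usage() / GB:.2f} GB")
```

Toggling `persistent_workers` and `pin_memory` here makes the RSS difference across epochs directly observable.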
{
"cell_type": "markdown",
"source": [
"The root cause is the `LMDB` Python API to access database records as follows may not release the mapped memory timely on completion to reduce the runtime RSS."
],
"metadata": {
"id": "7wVw3tkF_Jro"
}
},
{
"cell_type": "code",
"source": [
"class UCF101LMDB_2CLIP(object):\n",
" ...\n",
" print('Loading LMDB from %s, split:%d' % (self.db_path, self.which_split))\n",
" self.env = lmdb.open(self.db_path, subdir=os.path.isdir(self.db_path),\n",
" readonly=True, lock=False,\n",
" readahead=False, meminit=False)\n",
" ...\n",
" \n",
" def __getitem__(self, index):\n",
" vpath, vlen, vlabel, vname = self.video_subset.iloc[index]\n",
" env = self.env\n",
" with env.begin(write=False) as txn:\n",
" raw = msgpack.loads(txn.get(self.get_video_id[vname].encode('ascii')))"
],
"metadata": {
"id": "tZIeNR97_4ky"
},
"execution_count": null,
"outputs": []
},
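The commit itself does not change this code, but one common mitigation is to close and reopen the env at epoch boundaries so the mapped pages are returned to the OS. A hypothetical sketch, with a `ResettableLMDB` wrapper reusing the same `lmdb.open` flags:

```python
# Hypothetical mitigation (not in the commit): reset the LMDB env between
# epochs so its mapped pages are unmapped and their RSS share is released.
import lmdb

class ResettableLMDB:
    def __init__(self, db_path):
        self.db_path = db_path
        self.env = None
        self.reset()

    def reset(self):
        # close() unmaps the database file; reopening maps it lazily again.
        if self.env is not None:
            self.env.close()
        self.env = lmdb.open(self.db_path, readonly=True, lock=False,
                             readahead=False, meminit=False)

    def get(self, key):
        with self.env.begin(write=False) as txn:
            return txn.get(key)
```

Since an open env must not be shared across `fork()`, such a reset only helps when workers are recreated each epoch or when `num_workers=0`.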
{
"cell_type": "markdown",
"source": [
"Worse, the [FastLoader](https://bit.ly/3vvTXtj) never recreates dataset iterator workers that involes the `LMDB` env and will grow RSS over epochs due to increasing MMAP access.\n",
"If using the vanilla `DataLoader`, make sure to set `persistent_workers=False` in case of a similar memory leak.\n",
"Nonetheless, sufficient memory must be allocated at least for peak usage in one epoch.\n",
"This serves as the workaround."
],
"metadata": {
"id": "9DiPzglnAA7j"
}
}
]
}
}
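A minimal sketch of this workaround, where `ucf101`, `num_epochs`, and `train_step` are hypothetical placeholders for the setup in the post:

```python
# Sketch of the stated workaround: a vanilla DataLoader with
# persistent_workers=False tears down workers (and their touched LMDB
# mmap pages) at the end of every epoch.
from torch.utils.data import DataLoader

# ucf101, num_epochs, train_step: placeholders for the post's setup.
loader = DataLoader(ucf101,
                    batch_size=32, shuffle=True,
                    num_workers=4,
                    persistent_workers=False,  # workers exit after each epoch
                    pin_memory=False,          # avoid the extra RSS of pinning
                    drop_last=True)

for epoch in range(num_epochs):
    # Each epoch forks fresh workers, so RSS resets to the baseline and
    # only needs to cover the peak usage within a single epoch.
    for input_seq, label in loader:
        train_step(input_seq, label)
```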
