From 8ace775a8409f074fbfdd852f57b6de6f6440191 Mon Sep 17 00:00:00 2001
From: Katherine Crowson
Date: Mon, 22 Jan 2024 23:53:19 +0000
Subject: [PATCH] Update README for transformer-model-v2

---
 README.md | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 0007c4d..76ad91c 100644
--- a/README.md
+++ b/README.md
@@ -4,11 +4,9 @@
 
 An implementation of [Elucidating the Design Space of Diffusion-Based Generative Models](https://arxiv.org/abs/2206.00364) (Karras et al., 2022) for PyTorch, with enhancements and additional features, such as improved sampling algorithms and transformer-based diffusion models.
 
-## Hourglass transformer experimental branch
+## Hourglass diffusion transformer
 
-**This branch is under active development. Models of the new type that are trained with it may stop working due to backward incompatible changes.**
-
-This branch of `k-diffusion` is for testing an experimental model type, `image_transformer_v2`, that uses ideas from [Hourglass Transformer](https://arxiv.org/abs/2110.13711) and [DiT](https://arxiv.org/abs/2212.09748).
+`k-diffusion` contains a new model type, `image_transformer_v2`, that uses ideas from [Hourglass Transformer](https://arxiv.org/abs/2110.13711) and [DiT](https://arxiv.org/abs/2212.09748).
 
 ### Requirements
 
@@ -18,7 +16,7 @@ To use the new model type you will need to install custom CUDA kernels:
 
 * [FlashAttention-2](https://github.com/Dao-AILab/flash-attention) for global attention. It will fall back to plain PyTorch if it is not installed.
 
-Also, you should make sure your PyTorch installation is capable of using `torch.compile()` (for instance, if you are using Python 3.11, you should use a PyTorch nightly build instead of 2.0). It will fall back to eager mode if `torch.compile()` is not available, but it will be slower and use more memory in training.
+Also, you should make sure your PyTorch installation is capable of using `torch.compile()`. It will fall back to eager mode if `torch.compile()` is not available, but it will be slower and use more memory in training.
 
 ### Usage
 
@@ -76,6 +74,10 @@ In the `"model"` key of the config file:
 
 The window size at each level must evenly divide the image size at that level. Models trained with one attention type must be fine-tuned to be used with a different type.
 
+#### Inference
+
+TODO: write this section
+
 ## Installation
 
 `k-diffusion` can be installed via PyPI (`pip install k-diffusion`) but it will not include training and inference scripts, only library code that others can depend on. To run the training and inference scripts, clone this repository and run `pip install -e `.