Vladimir Mandic edited this page Feb 14, 2023 · 4 revisions

LoRA: Low-rank Adaptation

LoRA is a method for quick fine-tuning of diffusion models: what you end up with after training is a much smaller 'model' that works alongside your other models
LoRA injects small trainable layers that steer the cross-attention layers in multiple parts of the original network

Originally introduced by Microsoft as a way of tuning Large Language Models, LoRA was later adopted for other use-cases
In the case of Stable Diffusion, current LoRA implementations are capable of creating layers for the actual diffusion model as well as for the UNet de-noiser and the text encoder
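The low-rank idea itself is simple: instead of fine-tuning a full weight matrix, LoRA trains two small matrices whose product forms a low-rank update to it. A minimal NumPy sketch (all dimensions here are illustrative, not taken from any actual Stable Diffusion layer):

```python
import numpy as np

# Frozen base weight of one projection layer (illustrative size)
d_out, d_in, rank = 768, 768, 8
W = np.random.randn(d_out, d_in)

# LoRA trains only the low-rank factors B (d_out x r) and A (r x d_in)
A = np.random.randn(rank, d_in)
B = np.zeros((d_out, rank))    # B starts at zero, so the update starts as a no-op

alpha = 8                      # scaling hyper-parameter
W_adapted = W + (alpha / rank) * (B @ A)

# Trainable parameters: full fine-tune vs. LoRA factors only
full_params = W.size           # 768 * 768 = 589,824
lora_params = A.size + B.size  # 8 * 768 + 768 * 8 = 12,288
```

This is why the resulting LoRA file is so much smaller than a full checkpoint: only the factor matrices are trained and saved, not the base weights.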

The most popular way to train a LoRA is currently with Kohya's scripts, which is the method used in this repository (see Train below)

Use

How do you use a LoRA? Simply add it to your prompt, optionally together with any activation tags/keywords:

photo of "sara" in the city <lora:lora-sara:1.0>

Where:

  • sara is the activation tag set during model training
  • lora-sara is the name of the LoRA model
  • 1.0 is the activation strength
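The activation strength scales the low-rank delta before it is applied to the base weights, so 0.0 disables the LoRA entirely and values between 0 and 1 blend it in partially. A hedged sketch of that merge (function name and shapes are illustrative, not this repository's API):

```python
import numpy as np

def apply_lora(W, A, B, alpha, rank, strength):
    """Merge a LoRA update into a base weight matrix, scaled by prompt strength."""
    return W + strength * (alpha / rank) * (B @ A)

rank, alpha = 4, 4
W = np.random.randn(320, 320)       # frozen base weight (illustrative size)
A = np.random.randn(rank, 320)      # trained LoRA factors
B = np.random.randn(320, rank)

merged_full = apply_lora(W, A, B, alpha, rank, strength=1.0)  # <lora:...:1.0>
merged_off  = apply_lora(W, A, B, alpha, rank, strength=0.0)  # LoRA disabled
```

With strength 0.0 the merged weights are identical to the base model; raising the strength moves the output progressively toward the trained adaptation.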

Train

The method chosen for this repository is Kohya's, and that repository is registered as a submodule in /modules/lora
However, the solution is heavily wrapped in custom pre-processing and post-processing scripts to make it work with the existing training workflow

cli/train-lora.py
Steps:

  • Pre-process input images
  • Prepare captions and tags
  • Create metadata file
  • Create VAE normalization latents
  • Run actual training

Processing and training can be split into separate steps, allowing you to batch-process and prepare multiple datasets before the actual training. For example:

  • run train-lora.py --notrain to run only the processing steps
  • run train-lora.py --noprocess to run only the training steps
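The two-phase split can be sketched as simple command construction: process every dataset first, then run the training passes. Dataset names below are hypothetical; only the --notrain/--noprocess flags come from this page.

```python
# Sketch: prepare several datasets up front, then train each one.
datasets = ["sara", "mike"]  # hypothetical dataset/LoRA names

def process_cmd(name):
    # Phase 1: pre-processing only (--notrain skips training)
    return ["python", "cli/train-lora.py", "--notrain",
            "--input", f"{name}-images/", "--output", f"lora-{name}", "--tag", name]

def train_cmd(name):
    # Phase 2: training only (--noprocess re-uses the prepared dataset)
    return ["python", "cli/train-lora.py", "--noprocess",
            "--input", f"{name}-images/", "--output", f"lora-{name}", "--tag", name]

# All processing runs first, then all training runs
commands = [process_cmd(n) for n in datasets] + [train_cmd(n) for n in datasets]
```

Each command list could then be handed to `subprocess.run`; building them as lists (rather than shell strings) avoids quoting issues with paths.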

Pre-processing is a highly complex and customizable process performed by cli/modules/process.py and includes a number of optional operations (details)

Note that pre-processing requires the WebUI server to be running, as it uses existing models for captioning, face restoration, etc.
However, keeping the server up can cause memory issues, as LoRA training is memory-intensive:

  • run train-lora.py --shutdown to auto-shutdown the WebUI server after processing and before training

There is a large number of additional tunable parameters (although far fewer than in the underlying solution, as many values are predetermined based on best practices), but the minimum that should be provided is:

  • --model MODEL: original model to use as a base for training
  • --input INPUT: input folder with training images
  • --output OUTPUT: name of the resulting LoRA
  • --tag TAG: primary tag word(s) that can be used for model activation in prompts
  • --dir DIR: folder containing LoRA checkpoints

Additionally, depending on your training dataset, you may want to adjust:

  • --steps STEPS: total number of training steps
    adjust based on the size of your dataset, as a larger dataset requires more steps to train
  • --dim DIM: network dimension, which is the actual size of the created LoRA
    this determines its capacity to learn and should be proportional to the size and complexity of the training dataset

Example:

train-lora.py --model /models/stable-diffusion/sd-v15-runwayml.ckpt --dir /models/lora --tag sara --output lora-sara --input sara-images/ --dim 192 --steps 20000
