Replies: 6 comments 1 reply
-
Manual installation guide. OneTrainer installation on AMD GPU on Debian 12
Prerequisites
-
Looks like I've either run into a GPU hardware problem (overheating? power?) or I'm hitting this or a similar error on every use of the compute (inference or training): https://gitlab.freedesktop.org/mesa/mesa/-/issues/7504 The GPU was stable for over a month (mostly inference, only just getting into training), so I suspect it's more of a hardware problem.
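When it gets into that state, a quick check I'd run (my own snippet, not something from the linked issue) to see whether the GPU is still usable from PyTorch at all:

# Not from the linked Mesa issue: a minimal probe to see whether the ROCm build of
# PyTorch still sees the GPU and can run a trivial kernel after a hang/reset.
import torch

print("HIP runtime:", torch.version.hip)        # None on CUDA-only or CPU-only builds
print("GPU available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    x = torch.ones(1024, 1024, device="cuda")
    print("Simple kernel result:", (x @ x).sum().item())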
-
Ok, got it working again, this time on Fedora 39. The procedure was basically the same, but instead of venv I had to use conda because the system Python is 3.12. Ran into one issue with the Tkinter GUI: it was basically unusable because no system fonts were available due to conda packaging. Fortunately this solution worked: ContinuumIO/anaconda-issues#6833 (comment) Otherwise it looks like everything works. The low GPU utilization is related to the GUI too; when the exported script is run it utilizes the GPU at close to 100%.
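As a quick way to confirm that font symptom from inside the conda environment (my own snippet, not part of the linked fix):

# Not part of the linked fix: just counts the font families Tk can see from inside
# the conda environment. A near-empty list reproduces the "unusable GUI" symptom.
import tkinter as tk
from tkinter import font

root = tk.Tk()
root.withdraw()  # no visible window needed, we only want the font list
print(len(font.families()), "font families visible to Tk")
root.destroy()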
-
One more note: I had to do this to get training running. Not sure if it's related to Linux, ROCm, the specific dataset, or the training options. I got errors that ScaleImage is not defined and replaced it with Upscale:
--- a/modules/dataLoader/StableDiffusionBaseDataLoader.py
+++ b/modules/dataLoader/StableDiffusionBaseDataLoader.py
@@ -215 +215 @@ class StableDiffusionBaseDataLoader(BaseDataLoader):
- downscale_mask = ScaleImage(in_name='mask', out_name='latent_mask', factor=0.125)
+ downscale_mask = Upscale(in_name='mask', out_name='latent_mask', factor=0.125)
@@ -217 +217 @@ class StableDiffusionBaseDataLoader(BaseDataLoader):
- downscale_depth = ScaleImage(in_name='depth', out_name='latent_depth', factor=0.125)
+ downscale_depth = Upscale(in_name='depth', out_name='latent_depth', factor=0.125)
@@ -321 +321 @@ class StableDiffusionBaseDataLoader(BaseDataLoader):
- upscale_mask = ScaleImage(in_name='latent_mask', out_name='decoded_mask', factor=8)
+ upscale_mask = Upscale(in_name='latent_mask', out_name='decoded_mask', factor=8) Looks like the reason is here: So needed to upgrade pip packages. |
-
@aa956 How does VRAM usage compare to kohya's sd-scripts? My experience with sd-scripts was that SDP attention would OOM while the plain PyTorch implementation of flash attention (
-
If I recall correctly, there were no significant differences in VRAM usage between kohya's scripts and OneTrainer. I've jumped ship (replaced the RX 6700 XT 12 GB with an RTX 4060 Ti 16 GB) so unfortunately I can't re-test, but there were no OOMs with SD1.5 LoRA training at 512px resolution and batch sizes 1, 2, and 4, if I recall correctly. But do sd-scripts support SDP at all? I only see --xformers or --mem_eff_attn in this doc: So I've used --mem_eff_attn with kohya's scripts.
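For reference, the backend distinction being discussed can be poked at directly through PyTorch's own scaled_dot_product_attention; this is just an illustrative sketch of the mechanism, not code from OneTrainer or sd-scripts, and whether the memory-efficient kernel is actually available on a given ROCm build varies:

# Illustrative only (not from either trainer): restrict PyTorch's SDPA to its
# memory-efficient kernel, the kind of trade-off the --mem_eff_attn vs. SDP
# discussion above is about. Raises an error if that kernel isn't available.
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

with torch.backends.cuda.sdp_kernel(enable_flash=False,
                                    enable_math=False,
                                    enable_mem_efficient=True):
    out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)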
-
Tried to install OneTrainer on a Debian 12 desktop with an AMD RX 6700 XT GPU.
Got it working, and now that the first SD1.5 LoRA training is running I have a few questions: