Steps to run SD.Next with Intel Arc GPU on native Windows (IPEX) #2023

Nuullll · 2023-08-17T14:09:46Z

Preparations

Install the Intel GPU driver (but not 4885 or 4887; see "Drivers from 4885 and newer break IPEX for native windows", intel/intel-extension-for-pytorch#442).
Disable your iGPU (if any, e.g. UHD or Iris Xe) in Device Manager.
Install git.
Install Python 3.10 (I only built wheels for Python 3.10).

Starting from scratch

The latest IPEX wheels (which I built from source myself; they are NOT an official release from Intel) have been added to the dev branch by @Disty0.

Arc users should be able to launch SD.Next directly without needing to manually install any extra dependencies!!

git clone https://github.com/vladmandic/automatic.git -b dev
cd automatic
.\webui.bat --use-ipex

Using python: "C:\Users\vfirs\AppData\Local\Programs\Python\Python310\python.exe"
Creating VENV: C:\Users\vfirs\projects\automatic\venv
Using VENV: C:\Users\vfirs\projects\automatic\venv
16:18:39-690418 INFO     Starting SD.Next
16:18:39-694466 INFO     Python 3.10.6 on Windows
16:18:39-793142 INFO     Version: app=sd.next updated=2023-10-14 hash=fa86cc0a
                         url=https://github.com/vladmandic/automatic.git/tree/dev
16:18:39-828555 INFO     Platform: arch=AMD64 cpu=Intel64 Family 6 Model 183 Stepping 1, GenuineIntel system=Windows
                         release=Windows-10-10.0.22621-SP0 python=3.10.6
16:18:39-830556 INFO     Intel OneAPI Toolkit detected
16:18:39-831556 INFO     Installing package:
                         https://github.com/Nuullll/intel-extension-for-pytorch/releases/download/v2.0.110%2Bxpu-master%
                         2Bdll-bundle/torch-2.0.0a0+gite9ebda2-cp310-cp310-win_amd64.whl
                         https://github.com/Nuullll/intel-extension-for-pytorch/releases/download/v2.0.110%2Bxpu-master%
                         2Bdll-bundle/torchvision-0.15.2a0+fa99a53-cp310-cp310-win_amd64.whl
                         https://github.com/Nuullll/intel-extension-for-pytorch/releases/download/v2.0.110%2Bxpu-master%
                         2Bdll-bundle/intel_extension_for_pytorch-2.0.110+gitc6ea20b-cp310-cp310-win_amd64.whl
...
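Once the install step finishes, one way to confirm that the bundled wheels can see the Arc GPU is to query `torch.xpu` from the venv's Python. This is only a sketch: the helper name `xpu_status` is made up for illustration and is not part of SD.Next, and it assumes that importing `intel_extension_for_pytorch` registers the `torch.xpu` namespace, as it does for the IPEX 2.0 XPU builds discussed here.

```python
def xpu_status():
    """Return a one-line description of whether an XPU device is usable."""
    try:
        import torch
        # Assumption: importing IPEX is what makes torch.xpu available.
        import intel_extension_for_pytorch as ipex
    except ImportError as exc:
        return f"not installed: {exc.name}"
    except OSError as exc:
        # DLL load failures (e.g. WinError 126) surface as OSError.
        return f"failed to load: {exc}"
    if not torch.xpu.is_available():
        return "installed, but no XPU device is available"
    return f"ok: {torch.xpu.get_device_name(0)} (ipex {ipex.__version__})"


if __name__ == "__main__":
    print(xpu_status())
```

Run it with `venv\Scripts\python.exe` so that it uses the same packages as SD.Next. On a working setup the reported device name should match the `Device:` line that SD.Next logs at startup.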

Performance

IPEX on native Windows is ~20% slower than on WSL or Linux at low resolutions. See details by filtering "ipex" at https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html.

Why did I build IPEX from source (TL;DR)

Issue of current official IPEX wheel

Intel released IPEX wheels for native Windows in Aug '23. However, the official IPEX wheel was built without AOT (Ahead-Of-Time) compilation support, which dramatically degrades the user experience on Windows: there is a JIT (Just-In-Time) compilation overhead on the order of 10 minutes (depending on your CPU) each time you start the Web UI, load model weights or change an image generation parameter. See intel/intel-extension-for-pytorch#399 for details. In my opinion, this makes IPEX unusable on native Windows.

Building IPEX wheel from source with AOT support

To overcome the above issue, community members (including me) started building the IPEX wheel from source, since Intel didn't provide a clear ETA for an official AOT wheel release.
The IPEX AOT wheel reduces the warmup overhead from ~10 minutes to ~10 seconds, as expected! So I'd like to share the wheels that I built with Python 3.10 and Nuullll/intel-extension-for-pytorch@c6ea20b (the build took ~4 hours on an i9-13900 with 32 GB RAM).

oneAPI dependencies

Previously, the common assumption was that one needed to install the oneAPI components (DPCPP compiler and MKL) system-wide to use IPEX. Recently I found out that we can bake all DLL dependencies into the Python wheel, so we no longer have to install oneAPI manually. See repack_wheels.bat for how I repackage the torch and ipex wheels.

Below is outdated

Install Miniconda (or Anaconda)

Download Miniconda3 - Python 3.10 installer for Windows: https://docs.conda.io/en/latest/miniconda.html#windows-installers
Run the installer; the default settings are good enough.

It's highly recommended to use a conda virtual environment instead of SD.Next's default virtualenv. We will need to install some dll dependencies (e.g., oneAPI components, libuv) and conda will automatically put the corresponding dlls into the library path.

Prepare conda virtual environment for SD.Next

Launch Anaconda Prompt (miniconda3). At this time you will be in the base environment.
Create a virtual environment (for example, named sdnext)

conda create -n sdnext python=3.10

Activate the new environment

conda activate sdnext

Install dll dependencies in the conda environment (`sdnext`)

Install libuv

conda install -c conda-forge libuv=1.39

Otherwise, you may encounter the following error when trying to import torch later:

OSError: [WinError 126] The specified module could not be found.
Error loading "backend_with_compiler.dll" or one of its dependencies.

Install oneAPI dependencies from Intel channel

conda install -c intel dpcpp-cpp-rt mkl-dpcpp

These are the minimal requirements for running IPEX. We don't even have to activate the oneAPI environment with setvars.bat: all required DLLs are put into the library path when we activate the sdnext conda environment.

Prepare SD.Next folder

git clone https://github.com/vladmandic/automatic.git
cd automatic

Then put all the downloaded wheels into the SD.Next root folder automatic\

dir *.whl

2023/09/19  13:13       289,085,958 intel_extension_for_pytorch-2.0.110+git0f2597b-cp310-cp310-win_amd64.whl
2023/09/19  10:58       195,693,642 torch-2.0.0a0+gite9ebda2-cp310-cp310-win_amd64.whl
2023/09/19  10:58           767,270 torchvision-0.15.2a0+fa99a53-cp310-cp310-win_amd64.whl

Launch SD.Next with `launch.py` directly

Make sure you're in the sdnext conda environment.
Override the TORCH_COMMAND environment variable to let SD.Next install torch, torchvision and ipex from the local wheels, then we're all set!

set TORCH_COMMAND=intel_extension_for_pytorch-2.0.110+git0f2597b-cp310-cp310-win_amd64.whl torch-2.0.0a0+gite9ebda2-cp310-cp310-win_amd64.whl torchvision-0.15.2a0+fa99a53-cp310-cp310-win_amd64.whl
python launch.py --use-ipex
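Note that `set NAME=value` is cmd.exe syntax, which is what Anaconda Prompt uses. In PowerShell, `set` is an alias for `Set-Variable`, and environment variables are assigned with `$env:NAME = "value"` instead. As a shell-independent alternative, the wheel list can be assembled and passed from Python. The sketch below is an illustration only: `build_torch_command` and `launch_sdnext` are hypothetical helper names, and it assumes exactly one wheel per package sits in the SD.Next root folder.

```python
import os
import subprocess
import sys
from pathlib import Path

# Package file-name prefixes, in the same order as the command above.
PREFIXES = ("intel_extension_for_pytorch-", "torch-", "torchvision-")


def build_torch_command(wheel_dir="."):
    """Return the space-separated wheel file names for TORCH_COMMAND."""
    wheels = []
    for prefix in PREFIXES:
        matches = sorted(p.name for p in Path(wheel_dir).glob(prefix + "*.whl"))
        if len(matches) != 1:
            raise FileNotFoundError(
                f"expected exactly one {prefix}*.whl in {wheel_dir}, "
                f"found {len(matches)}"
            )
        wheels.append(matches[0])
    return " ".join(wheels)


def launch_sdnext(root="."):
    """Run launch.py with TORCH_COMMAND set only for the child process."""
    env = dict(os.environ, TORCH_COMMAND=build_torch_command(root))
    return subprocess.run(
        [sys.executable, "launch.py", "--use-ipex"],
        cwd=root, env=env, check=True,
    )
```

Calling `launch_sdnext()` from the activated sdnext environment, inside the `automatic` folder, is meant to be equivalent to the two lines above. Because the variable is passed through `env=`, no shell quoting is involved.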

Logs:

14:27:00-037421 INFO     Starting SD.Next
14:27:00-040421 INFO     Python 3.10.13 on Windows
14:27:00-120083 INFO     Version: app=sd.next updated=2023-09-20 hash=89ba8e3c
                         url=https://github.com/vladmandic/automatic.git/tree/master
14:27:00-150084 INFO     Platform: arch=AMD64 cpu=Intel64 Family 6 Model 183 Stepping 1, GenuineIntel system=Windows
                         release=Windows-10-10.0.22621-SP0 python=3.10.13
14:27:00-152591 INFO     Installing package: intel_extension_for_pytorch-2.0.110+git0f2597b-cp310-cp310-win_amd64.whl
                         torch-2.0.0a0+gite9ebda2-cp310-cp310-win_amd64.whl
                         torchvision-0.15.2a0+fa99a53-cp310-cp310-win_amd64.whl
14:27:40-092035 WARNING  Modified files: ['intel_extension_for_pytorch-2.0.110+git0f2597b-cp310-cp310-win_amd64.whl',
                         'torch-2.0.0a0+gite9ebda2-cp310-cp310-win_amd64.whl',
                         'torchvision-0.15.2a0+fa99a53-cp310-cp310-win_amd64.whl']
14:27:40-125035 INFO     Verifying requirements
14:27:40-136035 INFO     Installing package: addict
...
14:32:54-311604 INFO     Installing package: pydantic==1.10.11
14:32:57-556134 INFO     Verifying packages
14:32:57-557195 INFO     Installing package: git+https://github.com/openai/CLIP.git
14:33:08-962406 INFO     Installing package:
                         git+https://github.com/patrickvonplaten/invisible-watermark.git@remove_onnxruntime_depedency
14:33:23-234289 INFO     Installing package: onnxruntime==1.15.1
14:33:28-254064 INFO     Installing package: pi-heif
14:33:31-154576 INFO     Installing package: tensorflow==2.13.0
14:33:54-196918 INFO     Installing package: git+https://github.com/google-research/torchsde
14:34:09-675035 INFO     Verifying repositories
14:34:09-681033 INFO     Cloning repository: https://github.com/Stability-AI/stablediffusion.git
14:35:10-745741 INFO     Cloning repository: https://github.com/CompVis/taming-transformers.git
14:44:37-205830 INFO     Cloning repository: https://github.com/crowsonkb/k-diffusion.git
14:44:39-612037 INFO     Cloning repository: https://github.com/sczhou/CodeFormer.git
14:44:48-312887 INFO     Cloning repository: https://github.com/salesforce/BLIP.git
14:44:53-258388 INFO     Verifying submodules
14:46:57-260705 INFO     Extension installed packages: clip-interrogator-ext ['clip-interrogator==0.6.0']
14:47:06-490790 INFO     Extension installed packages: sd-webui-agent-scheduler ['SQLAlchemy==2.0.21',
                         'greenlet==2.0.2']
14:47:59-711550 INFO     Extension installed packages: sd-webui-controlnet ['cffi==1.15.1', 'cssselect2==0.7.0',
                         'fvcore==0.1.5.post20221221', 'iopath==0.1.9', 'lxml==4.9.3', 'mediapipe==0.10.5',
                         'opencv-contrib-python==4.8.0.76', 'portalocker==2.8.2', 'pycparser==2.21', 'reportlab==4.0.5',
                         'sounddevice==0.4.6', 'svglib==1.5.1', 'tabulate==0.9.0', 'tinycss2==1.2.1',
                         'webencodings==0.5.1', 'yacs==0.1.8']
14:48:02-553208 INFO     Extension installed packages: stable-diffusion-webui-images-browser ['Send2Trash==1.8.2']
14:48:12-526532 INFO     Extension installed packages: stable-diffusion-webui-rembg ['PyMatting==1.1.8', 'pooch==1.7.0',
                         'rembg==2.0.50']
14:48:12-574501 INFO     Extensions enabled: ['a1111-sd-webui-lycoris', 'clip-interrogator-ext', 'LDSR', 'Lora',
                         'multidiffusion-upscaler-for-automatic1111', 'ScuNET', 'sd-extension-system-info',
                         'sd-webui-agent-scheduler', 'sd-webui-controlnet', 'stable-diffusion-webui-images-browser',
                         'stable-diffusion-webui-rembg', 'SwinIR']
14:48:12-575500 INFO     Verifying packages
14:48:12-641880 INFO     Extension preload: {'extensions-builtin': 0.04, 'extensions': 0.0}
14:48:12-643830 INFO     Command line args: ['--use-ipex'] use_ipex=True
C:\Users\vfirs\miniconda3\envs\sdnext-test\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
14:48:19-854631 INFO     Engine: backend=Backend.ORIGINAL compute=ipex mode=no_grad device=xpu
14:48:19-857630 INFO     Device: device=Intel(R) Arc(TM) A770 Graphics n=1 ipex=2.0.110+git0f2597b
14:48:20-644974 INFO     Available VAEs: models\VAE items=0
14:48:20-646974 INFO     Available models: models\Stable-diffusion items=0 time=0.00s
Download the default model? (y/N) n
14:48:28-905775 INFO     Extension: script='extensions-builtin\sd-webui-controlnet\scripts\controlnet.py' Warning:
                         ControlNet failed to load SGM - will use LDM instead.
14:48:28-906775 INFO     Extension: script='extensions-builtin\sd-webui-controlnet\scripts\controlnet.py' ControlNet
                         preprocessor location:
                         C:\Users\vfirs\projects\automatic\extensions-builtin\sd-webui-controlnet\annotator\downloads
14:48:28-914797 INFO     Extension: script='extensions-builtin\sd-webui-controlnet\scripts\hook.py' Warning: ControlNet
                         failed to load SGM - will use LDM instead.
14:48:29-046922 INFO     Extension:
                         script='extensions-builtin\stable-diffusion-webui-images-browser\scripts\image_browser.py'
                         Image Browser: Creating database
14:48:29-048923 INFO     Extension:
                         script='extensions-builtin\stable-diffusion-webui-images-browser\scripts\image_browser.py'
                         Image Browser: Database created
14:48:45-849947 INFO     Extensions time: 19.01s { automatic=0.06s a1111-sd-webui-lycoris=0.84s LDSR=0.05s Lora=0.18s
                         sd-webui-agent-scheduler=0.35s sd-webui-controlnet=0.47s
                         stable-diffusion-webui-images-browser=0.12s stable-diffusion-webui-rembg=16.70s }
14:48:51-994103 INFO     Loading UI theme: name=black-teal style=Auto
14:48:53-634943 INFO     Themes: builtin=6 default=5 external=54
14:48:53-635942 INFO     Themes: builtin=6 default=5 external=54
14:48:54-925924 INFO     Local URL: http://127.0.0.1:7860/
14:48:54-928924 INFO     Initializing middleware
14:48:55-062924 INFO     [AgentScheduler] Task queue is empty
14:48:55-064924 INFO     [AgentScheduler] Registering APIs
14:48:55-158939 ERROR    Cannot run without a checkpoint
14:48:55-162938 ERROR    Use --ckpt <path-to-checkpoint> to force using existing checkpoint
14:48:55-403138 INFO     Startup time: 42.70s { torch=5.00s gradio=1.15s diffusers=0.35s libraries=1.45s
                         extensions=19.01s models=5.95s codeformer=0.23s onchange=6.12s ui-txt2img=0.09s
                         ui-img2img=0.08s ui-settings=1.71s ui-extensions=0.76s launch=0.19s api=0.05s app-started=0.17s
                         checkpoint=0.25s }

Re-launch SD.Next later

Simply launch an Anaconda Prompt (miniconda3). Then activate the sdnext conda environment and execute launch.py with --use-ipex.

conda activate sdnext
python launch.py --use-ipex

You may notice the following false alarm:

14:52:04-805276 INFO     Intel OneAPI Toolkit detected
14:52:04-809277 ERROR    Intel OneAPI Toolkit is not activated! Activate OneAPI manually!

The reason is that there's no sycl-ls executable (it's not necessary at all) inside the conda environment, but we indeed have all the required oneAPI libraries. So just ignore it. I'll file a PR to remove this false alarm.
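To check by hand whether `sycl-ls` is reachable from the current environment, a standard-library lookup is enough. This is an illustration of the distinction between a missing tool and missing libraries, not SD.Next's actual detection code, and `sycl_ls_on_path` is a made-up helper name.

```python
import shutil


def sycl_ls_on_path():
    """Return the full path of sycl-ls if it is on PATH, otherwise None."""
    return shutil.which("sycl-ls")


if __name__ == "__main__":
    found = sycl_ls_on_path()
    print(found if found else "sycl-ls not found on PATH")
```

Inside the conda environment described above, this is expected to report that `sycl-ls` is not found, even though the oneAPI runtime DLLs are present.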

vladmandic · 2023-08-17T17:30:25Z

Maintainer

Thanks! Btw, this should probably be a Wiki page?

2 replies

Nuullll Aug 18, 2023
Author

Sure, will move it to wiki!

vladmandic Aug 18, 2023
Maintainer

btw, ping me or @Disty0 to commit to wiki - i had to limit write access to collaborators only as there were some bad updates by anonymous users.

kriegmaster56 · 2023-08-21T12:25:17Z

To tackle the tbb version problem, instead of opening any cmd shell you can open the terminal named "Intel oneAPI command prompt for Intel 64 for Visual Studio 2022". I think it is created when you install the VS integration part of oneAPI. It will automatically source setvars and apparently does a little more than that.

Now I'm facing no errors or warnings in the SD.Next log, but it takes an awful lot of time to load a model, and more specifically the embedding part of the checkpoint: up to 100 seconds just for that. On WSL2 it only takes a few seconds.

After it ultimately loads and is finally ready, it won't start generating anything. It just hangs there, writing nothing to the log file.

here is the sdnext logs and the python packages installed.
sdnext.log
python_packages.txt

3 replies

Nuullll Aug 21, 2023
Author

To tackle the tbb version problem, instead of opening any cmd shell you can open the terminal named "Intel oneAPI command prompt for Intel 64 for Visual Studio 2022". I think it is created when you install the VS integration part of oneAPI. It will automatically source setvars and apparently does a little more than that.

Thanks for the info. I'll check your approach.

Now I'm facing no errors or warnings in the SD.Next log, but it takes an awful lot of time to load a model, and more specifically the embedding part of the checkpoint: up to 100 seconds just for that. On WSL2 it only takes a few seconds.
After it ultimately loads and is finally ready, it won't start generating anything. It just hangs there, writing nothing to the log file.

Yes, startup and first-time inference (whenever your generation parameters change) are dramatically slow for IPEX on native Windows, because the Intel-released IPEX Windows wheels require JIT (Just-In-Time) compilation before the actual computation. The Linux wheels are pre-built AOT (Ahead-Of-Time) binaries. See intel/intel-extension-for-pytorch#399

You might need to wait a little longer to see the first image output. For your reference, with Arc A770 + i9-13900:

Web UI startup + model loading took 211 seconds
The first inference (512x512) took ~6 minutes
Afterwards inference (same settings) speed is around 6it/s
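To see the same split between one-off warm-up cost and steady-state speed on your own machine, the first call can be timed separately from the later ones. This is a generic sketch, not tied to SD.Next: `time_calls` and `make_fake_workload` are hypothetical names, and the stand-in workload only sleeps to imitate a slow first call.

```python
import time


def time_calls(fn, repeats=3):
    """Return (first_call_seconds, mean_seconds_of_later_calls)."""
    start = time.perf_counter()
    fn()
    first = time.perf_counter() - start

    later = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        later.append(time.perf_counter() - start)
    return first, sum(later) / len(later)


def make_fake_workload(warmup_seconds=0.05):
    """Build a stand-in function whose first call is artificially slow."""
    state = {"warm": False}

    def workload():
        if not state["warm"]:
            time.sleep(warmup_seconds)  # imitates JIT compilation
            state["warm"] = True

    return workload


if __name__ == "__main__":
    first, later = time_calls(make_fake_workload())
    print(f"first call: {first:.3f}s, later calls: {later:.6f}s on average")
```

Replacing the stand-in with a function that runs one real generation would give figures comparable to the ones listed above.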

farrukhpitafi Aug 21, 2023

Good to know, I thought I was the only one with this issue. I'll stick to WSL2 for now. Thanks for your hard work, keep it up!

kriegmaster56 Aug 21, 2023

Wow, 6 minutes to start. I never would have found out, because I never would have waited that long without anything showing that it's working. Thanks for the intel.

Nuullll · 2023-09-20T01:17:26Z

Author

I'm going to rewrite this tutorial (probably this week).

Fun facts:

One can install all required oneAPI dependencies via intel conda channel instead, so that we don't have to install oneAPI for the whole system.
I've built the IPEX AOT wheel from source by myself (since there's no clear ETA from Intel: [IPEX][XPU][Windows 11] It takes forever to run the first pass intel/intel-extension-for-pytorch#399). The AOT wheel reduces the warmup overhead from ~10 minutes to ~10 seconds on native windows. However, the IPEX native windows performance is still ~20% slower than WSL/linux (see details by filtering "ipex" at https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html). I'll upload the built AOT wheel when I rewrite the tutorial, and please use it at your own risk since it is NOT an official release from Intel.

0 replies

Nuullll · 2023-09-24T07:07:48Z

Author

Updated (rewrote) the tutorial. I think this is a much simpler setup than the old one.
I'd like to hear feedback from Arc users (whether it works or not) for some time.
If most of the responses are positive, I'd bother @Disty0 or @vladmandic to add this to wiki then :-P

7 replies

vladmandic Sep 25, 2023
Maintainer

@kunumigab create an issue, don't comment on unrelated post (just because it talks about ipex).

Disty0 Sep 26, 2023
Collaborator

@vladmandic what do you think about using @Nuullll's IPEX build instead of Intel's?
We can also remove the IPEX workaround in installer.py like this.
I can upload @Nuullll's builds on my fork as a release and use Github as hosting.

vladmandic Sep 26, 2023
Maintainer

totally not against it

Disty0 Sep 26, 2023
Collaborator

Added them in dev branch: 2f17cef

vladmandic Sep 26, 2023
Maintainer

i guess the wiki should be simpler then :)

Troncomus · 2023-09-26T01:35:48Z

I'm sorry if I'm asking something quite basic, but I've not found a clear answer elsewhere. I've followed the directions as closely as possible, but I don't know what to do in this step: "Override the TORCH_COMMAND environment to let SD.Next install torch, torchvision and ipex from local wheels".

I've downloaded the files to the correct folder, but I don't know which file I have to modify to override the settings.

5 replies

Nuullll Sep 26, 2023
Author

Just execute the following line in the command line (with sdnext conda environment activated):

set TORCH_COMMAND=intel_extension_for_pytorch-2.0.110+git0f2597b-cp310-cp310-win_amd64.whl torch-2.0.0a0+gite9ebda2-cp310-cp310-win_amd64.whl torchvision-0.15.2a0+fa99a53-cp310-cp310-win_amd64.whl

which will set the environment variable TORCH_COMMAND. Then

python launch.py --use-ipex

Troncomus Sep 26, 2023

Thank you for the response, that was the first thing I tried, but I got an error message with the following:

"(sdnext) PS C:\WINDOWS\system32\automatic> set TORCH_COMMAND=intel_extension_for_pytorch-2.0.110+git0f2597b-cp310-cp310-win_amd64.whl torch-2.0.0a0+gite9ebda2-cp310-cp310-win_amd64.whl torchvision-0.15.2a0+fa99a53-cp310-cp310-win_amd64.whl
Set-Variable : A positional parameter cannot be found that accepts argument
'torchvision-0.15.2a0+fa99a53-cp310-cp310-win_amd64.whl'.
At line:1 char:1

set TORCH_COMMAND=intel_extension_for_pytorch-2.0.110+git0f2597b-cp31 ...

  + CategoryInfo          : InvalidArgument: (:) [Set-Variable], ParameterBindingException
  + FullyQualifiedErrorId : PositionalParameterNotFound,Microsoft.PowerShell.Commands.SetVariableCommand"

Any idea of what might be wrong? I put the downloaded files in the right folder, and I executed the command within the conda environment, every other step went smoothly.

Troncomus Sep 26, 2023

I finally managed to get it right, I needed to use QUOTATION MARKS ->

set "TORCH_COMMAND=intel_extension_for_pytorch-2.0.110+git0f2597b-cp310-cp310-win_amd64.whl torch-2.0.0a0+gite9ebda2-cp310-cp310-win_amd64.whl torchvision-0.15.2a0+fa99a53-cp310-cp310-win_amd64.whl"

And it worked just as told.

TotalDay Sep 29, 2023

Can you share tutorial for Windows user?
Intel ARC A750

Disty0 Sep 29, 2023
Collaborator

@TotalDay this thread is the tutorial, just scroll up.

Whackjob · 2023-10-07T17:10:52Z

I'm in a weird place. I can't get this to work, though I suspect user error somewhere. I did follow the instructions step-by-step. It installed everything without error. I'm using Linux Mint with an Arc770 16GB by the way. I've slapped that button to make sure the settings are all default, just so we can start with a baseline. Now, if I load an XL model and try to run one 1024x1024 image, the error I get is "RuntimeError: could not create an engine". The traceback call above it gives me this:

/media/whackjob/16Tons/STABLE DIFFUSION/automatic/modules/call_queue.py:34 in f

    33             try:
❱   34                 res = func(*args, **kwargs)
    35                 progress.record_results(id_task, res)

/media/whackjob/16Tons/STABLE DIFFUSION/automatic/modules/txt2img.py:66 in txt2img

    65     if processed is None:
❱   66         processed = processing.process_images(p)
    67     p.close()

/media/whackjob/16Tons/STABLE DIFFUSION/automatic/modules/processing.py:626 in process_images

   625         else:
❱  626             res = process_images_inner(p)
   627     finally:

/media/whackjob/16Tons/STABLE DIFFUSION/automatic/modules/processing.py:785 in process_images_inner

   784                 from modules.processing_diffusers import process_diff
❱  785                 x_samples_ddim = process_diffusers(p, p.seeds, p.prom
   786

/media/whackjob/16Tons/STABLE DIFFUSION/automatic/modules/processing_diffusers.py:358 in process_diffusers

   357     try:
❱  358         output = shared.sd_model(**base_args) # pylint: disable=not-ca
   359     except AssertionError as e:

... 13 frames hidden ...

/home/whackjob/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1501 in _call_impl

  1500                 or _global_forward_hooks or _global_forward_pre_hooks
❱ 1501             return forward_call(*args, **kwargs)
  1502         # Do not call functions when jit is used

/media/whackjob/16Tons/STABLE DIFFUSION/automatic/extensions-builtin/Lora/lora.py:416 in lora_Linear_forward

   415
❱  416     return torch.nn.Linear_forward_before_lora(self, input)
   417

/media/whackjob/16Tons/STABLE DIFFUSION/automatic/modules/sd_hijack_utils.py:17 in

    16             orig_func = getattr(resolved_obj, func_path[-1])
❱   17             setattr(resolved_obj, func_path[-1], lambda *args, **kwargs
    18         self.__init__(orig_func, sub_func, cond_func)

/media/whackjob/16Tons/STABLE DIFFUSION/automatic/modules/sd_hijack_utils.py:28 in __call__

    27         else:
❱   28             return self.__orig_func(*args, **kwargs)
    29

/home/whackjob/.local/lib/python3.10/site-packages/torch/nn/modules/linear.py:114 in forward

   113     def forward(self, input: Tensor) -> Tensor:
❱  114         return F.linear(input, self.weight, self.bias)
   115
I would be very grateful for any assistance. I've tried googling and fiddling with things for a day or two now, without success.

36 replies

Disty0 Oct 17, 2023
Collaborator

You've installed IPEX for CPU. Use IPEX XPU for GPU support:

pip install --upgrade torch==2.0.1a0 torchvision==0.15.2a0 intel_extension_for_pytorch==2.0.110+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

Removing a docker container doesn't remove the passed folders or volumes.
Otherwise removing it will also remove the model folder you've passed.
Manually delete everything after removing it if you want to wipe everything.

Disty0 Oct 17, 2023
Collaborator

Also how did you end up with IPEX 1.13?
SDNext will install IPEX 2.0 XPU, not something else.

Whackjob Oct 18, 2023

Brother, I wish I knew. I remember installing 2.0. Well, trying it again. Getting an error.

Traceback (most recent call last):
  File "/home/whackjob/.local/lib/python3.9/site-packages/pip/_internal/cli/base_command.py", line 180, in exc_logging_wrapper
    status = run_func(*args)
  File "/home/whackjob/.local/lib/python3.9/site-packages/pip/_internal/cli/req_command.py", line 248, in wrapper
    return func(self, options, args)
  File "/home/whackjob/.local/lib/python3.9/site-packages/pip/_internal/commands/install.py", line 452, in run
    installed = install_given_reqs(
  File "/home/whackjob/.local/lib/python3.9/site-packages/pip/_internal/req/__init__.py", line 72, in install_given_reqs
    requirement.install(
  File "/home/whackjob/.local/lib/python3.9/site-packages/pip/_internal/req/req_install.py", line 807, in install
    install_wheel(
  File "/home/whackjob/.local/lib/python3.9/site-packages/pip/_internal/operations/install/wheel.py", line 731, in install_wheel
    _install_wheel(
  File "/home/whackjob/.local/lib/python3.9/site-packages/pip/_internal/operations/install/wheel.py", line 591, in _install_wheel
    file.save()
  File "/home/whackjob/.local/lib/python3.9/site-packages/pip/_internal/operations/install/wheel.py", line 390, in save
    shutil.copyfileobj(f, dest)
  File "/opt/intel/oneapi/intelpython/latest/lib/python3.9/shutil.py", line 205, in copyfileobj
    buf = fsrc_read(length)
  File "/opt/intel/oneapi/intelpython/latest/lib/python3.9/zipfile.py", line 924, in read
    data = self._read1(n)
  File "/opt/intel/oneapi/intelpython/latest/lib/python3.9/zipfile.py", line 1014, in _read1
    self._update_crc(data)
  File "/opt/intel/oneapi/intelpython/latest/lib/python3.9/zipfile.py", line 942, in _update_crc
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file 'intel_extension_for_pytorch/lib/libintel-ext-pt-gpu.so'
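A `Bad CRC-32` error from `zipfile` means that the bytes read for that archive member did not match the checksum stored in the wheel, which can happen after a corrupted download or cache entry. As a rough sketch, a wheel that has already been saved to disk can be checked before installing it. The helper name `first_corrupt_member` is made up for illustration; `ZipFile.testzip()` is the standard-library call doing the work.

```python
import zipfile


def first_corrupt_member(wheel_path):
    """Return the name of the first member that fails its CRC check, or None.

    A wheel is a zip archive, so the standard zipfile module can read it.
    zipfile.BadZipFile is raised if the file is not a zip archive at all.
    """
    with zipfile.ZipFile(wheel_path) as wheel:
        return wheel.testzip()


def describe_wheel(wheel_path):
    """Return a short human-readable verdict for the given wheel file."""
    bad = first_corrupt_member(wheel_path)
    if bad is None:
        return f"{wheel_path}: all members passed the CRC check"
    return f"{wheel_path}: corrupt member {bad!r}, download the wheel again"
```

If a member is reported as corrupt, downloading the wheel again (and bypassing pip's cache, for example with `pip install --no-cache-dir`) is a reasonable first thing to try.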

I tried this:

whackjob@WhackjobONE:/media/whackjob/16Tons/stable-diffusion/automatic$ pip uninstall intel_extension_for_pytorch
Found existing installation: intel-extension-for-pytorch 2.0.100
Uninstalling intel-extension-for-pytorch-2.0.100:
  Would remove:
    /home/whackjob/.local/bin/ipexrun
    /home/whackjob/.local/lib/python3.9/site-packages/intel_extension_for_pytorch-2.0.100.dist-info/*
    /home/whackjob/.local/lib/python3.9/site-packages/intel_extension_for_pytorch/*
Proceed (Y/n)? Y
  Successfully uninstalled intel-extension-for-pytorch-2.0.100

Uh, okay. It was 2.0? So I try running that install command again, this one:

pip install --upgrade torch==2.0.1a0 torchvision==0.15.2a0 intel_extension_for_pytorch==2.0.110+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

And I get the same error as that first block above. So I try to run SD again to confirm I'm screwed before I give up for the night and go do my college schoolwork. And... it runs O_o

Still a few errors as it goes:

ModuleNotFoundError: No module named 'clip_interrogator'
ModuleNotFoundError: No module named 'sqlalchemy'
ModuleNotFoundError: No module named 'rembg'

A lot in between of course, but I think that's the important bits. So I pip install 'em. Weirdly, I still get them. And if I try to generate a 1024x1024 image, I get

RuntimeError: Native API failed. Native API returns: -6 (PI_ERROR_OUT_OF_HOST_MEMORY) -6 (PI_ERROR_OUT_OF_HOST_MEMORY)

I'm so close...! I'll keep working on it... if you've got advice, please let me know. I'm also waiting on an ETH delivery for your tip. I doubt we'd ever meet, but I'll buy you a cup of coffee. What's that saying, in Turkiye? One cup of coffee, friend for forty years?

Whackjob Oct 18, 2023

I have now eliminated all errors when SD.Next starts up. I can get in and start generating stuff. Yay! But then it fails at the end of the first generation and then immediately every attempt thereafter. Hrm? Giant wall of code, but it ends with this:

│ /home/whackjob/.local/lib/python3.10/site-packages/diffusers/models/resnet.py:639 in forward                                 │
│                                                                                                                              │
│   638 │   │                                                                                                                  │
│ ❱ 639 │   │   output_tensor = (input_tensor + hidden_states) / self.output_scale_factor                                      │
│   640                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Native API failed. Native API returns: -6 (PI_ERROR_OUT_OF_HOST_MEMORY) -6 (PI_ERROR_OUT_OF_HOST_MEMORY)

I figure, maybe it's a settings thing. I'm googling that error and finding almost nothing. In the UI it's showing this:

Native API failed. Native API returns: -6 (PI_ERROR_OUT_OF_HOST_MEMORY) -6 (PI_ERROR_OUT_OF_HOST_MEMORY)
Time: 42.36s | **GPU active 17040 MB reserved 17040** | used 4543 MB free 11746 MB total 16288 MB | retries 0 oom 0

So why does one image at 1024x1024 immediately eat all my GPU memory? I do not understand. Still working on it... advice welcome of course.

Disty0 Oct 18, 2023
Collaborator

Remove config.json. It's probably using CPU settings right now.

Also;
IPEX 2.0.100 = CPU
IPEX 2.0.110+xpu = GPU
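The mapping above can be turned into a quick check of what is installed. This sketch relies only on the version strings quoted in this thread and is an assumption rather than an official rule: a local label containing `xpu` is treated as a GPU build, no local label (or one containing `cpu`) as a CPU build, and anything else, such as the `+git...` labels of the custom Windows wheels, is reported as unknown. The function names are hypothetical.

```python
from importlib import metadata


def ipex_flavor(version):
    """Classify an IPEX version string as 'xpu', 'cpu' or 'unknown'."""
    _, _, local = version.partition("+")  # text after '+' is the local label
    if "xpu" in local:
        return "xpu"
    if not local or "cpu" in local:
        return "cpu"
    return "unknown"


def installed_ipex_version():
    """Return the installed IPEX version string, or None if not installed."""
    try:
        return metadata.version("intel_extension_for_pytorch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_ipex_version()
    if version is None:
        print("intel_extension_for_pytorch is not installed")
    else:
        print(f"{version}: {ipex_flavor(version)}")
```

Run it with the same Python interpreter that SD.Next uses, since a system-wide install and a venv can hold different builds.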

Nuullll · 2023-10-13T12:03:59Z

Author

Updated (edited the description, just scroll up):

I baked all the dll dependencies (uv.dll and oneAPI dlls) into the torch and intel_extension_for_pytorch wheels: https://github.com/Nuullll/intel-extension-for-pytorch/releases/tag/v2.0.110%2Bxpu-master%2Bdll-bundle

Users should be able to simply launch SD.Next (dev branch for now) with --use-ipex. All IPEX dependencies and oneAPI dll files would be installed automatically into the venv folder.

3 replies

Disty0 Oct 13, 2023
Collaborator

Updated wheels in dev branch with these, thank you for the guide and wheels.
I will copy this guide to wiki after dev merges to master.

Nuullll Oct 14, 2023
Author

@Disty0 Actually I don't think we need to add the guide to wiki any more, as it's simple enough (.\webui.bat --use-ipex).

Disty0 Oct 17, 2023
Collaborator

Added to wiki:
https://github.com/vladmandic/automatic/wiki/Intel-ARC#windows-installation

Whackjob · 2023-10-31T11:43:22Z

Well, I'm back to being down. Now I get this:

ImportError: libze_loader.so.1: cannot open shared object file: No such file or directory

3 replies

Disty0 Oct 31, 2023
Collaborator

That should have been installed with the Intel Level Zero GPU drivers. Install the necessary packages listed on the wiki page for Intel ARC.

Whackjob Nov 1, 2023

And here it is! Trying to run
sudo apt-get install intel-opencl-icd intel-level-zero-gpu level-zero intel-media-va-driver-non-free libmfx1 libgl-dev intel-oneapi-compiler-dpcpp-cpp intel-oneapi-mkl python3-pip python3-venv git unzip libjemalloc-dev

gives me this:

Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 intel-level-zero-gpu : Depends: libigdgmm12 (>= 22.3.7) but 22.1.2+ds1-1 is to be installed
 intel-media-va-driver-non-free : Depends: libva-driver-abi-1.19
                                  Depends: libigdgmm12 (>= 22.3.7) but 22.1.2+ds1-1 is to be installed
 intel-opencl-icd : Depends: libigdgmm12 (>= 22.3.7) but 22.1.2+ds1-1 is to be installed
 libmfx1 : Depends: libva2 (>= 2.17) but 2.14.0-1 is to be installed
E: Unable to correct problems, you have held broken packages.

There's the culprit. However, I can't seem to find any broken packages. Synaptic package manager filtered for broken packages finds nothing. "sudo apt update --fix-missing" seems to do nothing. I'm really at a loss, now!
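
Reading the apt output above, every failure traces back to a library whose candidate version is older than what the Intel packages require. A small sketch that pulls those pairs out of the message, assuming the line format shown above; `summarize` and `SAMPLE` are names made up for this example:

```python
import re

# Matches lines such as:
#  intel-opencl-icd : Depends: libigdgmm12 (>= 22.3.7) but 22.1.2+ds1-1 is to be installed
DEP_RE = re.compile(
    r"Depends: (?P<dep>\S+) \((?P<op>[<>=]+) (?P<need>[^)]+)\) "
    r"but (?P<have>\S+) is to be installed"
)

# Lines copied from the apt output above.
SAMPLE = """\
 intel-level-zero-gpu : Depends: libigdgmm12 (>= 22.3.7) but 22.1.2+ds1-1 is to be installed
 intel-media-va-driver-non-free : Depends: libva-driver-abi-1.19
                                  Depends: libigdgmm12 (>= 22.3.7) but 22.1.2+ds1-1 is to be installed
 intel-opencl-icd : Depends: libigdgmm12 (>= 22.3.7) but 22.1.2+ds1-1 is to be installed
 libmfx1 : Depends: libva2 (>= 2.17) but 2.14.0-1 is to be installed
"""


def summarize(apt_output: str) -> dict:
    """Map each blocking dependency to (required version, candidate version)."""
    blockers = {}
    for line in apt_output.splitlines():
        match = DEP_RE.search(line)
        if match:  # lines without a version constraint are skipped
            required = f"{match['op']} {match['need']}"
            blockers[match["dep"]] = (required, match["have"])
    return blockers


if __name__ == "__main__":
    for dep, (required, candidate) in summarize(SAMPLE).items():
        print(f"{dep}: need {required}, apt would install {candidate}")
```

Here that leaves two blockers, libigdgmm12 and libva2. The `libva-driver-abi-1.19` line has no version constraint, so this sketch skips it.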

EDIT: I may have blundered my way into a fix.
EDIT2: Nope, fixed all the startup errors, now I get this.
AttributeError: 'StableDiffusionXLPipeline' object has no attribute 'alphas_cumprod'

EDIT3: Figured it might just be a setting. Tweaked that around a bit, now I get this:

00:14:29-488978 INFO     Available models: path="/media/whackjob/16Tons/stable-diffusion/models/Stable-diffusion" items=31 time=0.00s
Segmentation fault (core dumped)
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/whackjob/.local/bin/ipexrun:8 in <module>                                                  │
│                                                                                                  │
│   5 from intel_extension_for_pytorch.launcher import main                                        │
│   6 if __name__ == '__main__':                                                                   │
│   7 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])                         │
│ ❱ 8 │   sys.exit(main())                                                                         │
│   9                                                                                              │
│                                                                                                  │
│ /home/whackjob/.local/lib/python3.9/site-packages/intel_extension_for_pytorch/launcher.py:109 in │
│ main                                                                                             │
│                                                                                                  │
│   106 │   │   sys.argv.insert(1, "cpu")                                                          │
│   107 │   args = parser.parse_args()                                                             │
│   108 │   if args.backend == "cpu":                                                              │
│ ❱ 109 │   │   cpu_run_main_with_args(args)                                                       │
│   110 │   elif args.backend == "xpu":                                                            │
│   111 │   │   xpu_run_main_with_args(args)                                                       │
│   112 │   else:                                                                                  │
│                                                                                                  │
│ /home/whackjob/.local/lib/python3.9/site-packages/intel_extension_for_pytorch/cpu/launch/launch. │
│ py:453 in run_main_with_args                                                                     │
│                                                                                                  │
│   450 │   else:                                                                                  │
│   451 │   │   launcher = launcher_multi_instances                                                │
│   452 │                                                                                          │
│ ❱ 453 │   launcher.launch(args)                                                                  │
│   454 │   for x in sorted(set(os.environ.keys()) - env_before):                                  │
│   455 │   │   logger.debug(f"{x}={os.environ[x]}")                                               │
│   456                                                                                            │
│                                                                                                  │
│ /home/whackjob/.local/lib/python3.9/site-packages/intel_extension_for_pytorch/cpu/launch/launche │
│ r_multi_instances.py:316 in launch                                                               │
│                                                                                                  │
│   313 │   │   │   │   p = process["process"]                                                     │
│   314 │   │   │   │   p.wait()                                                                   │
│   315 │   │   │   │   if p.returncode != 0:                                                      │
│ ❱ 316 │   │   │   │   │   raise subprocess.CalledProcessError(                                   │
│   317 │   │   │   │   │   │   returncode=p.returncode, cmd=process["cmd"]                        │
│   318 │   │   │   │   │   )                                                                      │
│   319 │   │   finally:                                                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command 'taskset -c 0-5 /opt/intel/oneapi/intelpython/latest/bin/python3 -u launch.py --use-ipex' returned non-zero exit status 139.
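
For anyone decoding that last line: an exit status above 128 reported through a shell usually means the process was killed by signal number (status - 128). 139 - 128 = 11, which is SIGSEGV on Linux and matches the "Segmentation fault (core dumped)" line above. A minimal sketch of that arithmetic; `signal_from_exit_status` is a hypothetical helper name:

```python
import signal
from typing import Optional


def signal_from_exit_status(status: int) -> Optional[str]:
    """Return the signal name for a shell-style exit status above 128, else None."""
    if status <= 128:
        return None  # ordinary exit code, not a signal
    try:
        return signal.Signals(status - 128).name
    except ValueError:
        return None  # no signal with that number on this platform


if __name__ == "__main__":
    print(signal_from_exit_status(139))
```

The traceback also shows the launcher going through `cpu_run_main_with_args`, so this particular run took ipexrun's "cpu" backend path.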

Frustrating!! I feel like I've been fighting for weeks to get this working and stable. Can't do it here. Can't keep your docker version up either, because when it crashes and I need to recreate the container, it tells me the '.' container isn't empty. Even after I delete every folder the container used to be associated with. Even tried lazydocker to see what I was doing wrong. I've done memory tests and pulled some RAM that did have bad addresses. The rest is OK. I've reinstalled my OS. I'm even trying a different Linux OS.

I've spent so much time.

Whackjob Nov 2, 2023

Never mind! Disty0, I figured out why I was getting that error with your docker image. And I've fixed it!

Long story short: I had a set of folders carried over from an old version of Automatic1111 with my models in it, and I was mounting that folder in place of the set of folders that automatic puts in place for models. So it would work the first time, and then fail every time after, because that folder now wasn't empty! So instead of mounting it at sd-server/models, now I have it going to sd-server/MODEL. Completely different!

Now it all spools up beautifully. Oh, and the hilariously named alphas_cumprod error? Well, it was trying to shove an XL model through a regular SD pipeline, even though I had XL selected. I had to switch it to something else, and then back again, in order for it to "take". So, unless my curse is refreshed, I'm good for a bit!

Next up, lol, kohya_ss, for training. Eventually.
