
Slow speed on Windows #10
Open
SoftologyPro opened this issue Nov 18, 2024 · 14 comments

@SoftologyPro

How fast is this supposed to generate the OBJ vertex points? I have it installed locally (Windows with a 24 GB 4090), the Gradio UI starts, and when I prompt it the vertex generation takes around 10 seconds per line/vertex.

Is this normal? Any tips to speed it up?

Thanks.

@oursland
Contributor

That's very slow, and I suspect it is not using your GPU. On my system (Apple MBP M2 Max with 96 GiB RAM), memory usage at a 4096-token context length is 15.16 GiB, which would fit entirely within your 24 GiB 4090.

@SoftologyPro
Author

SoftologyPro commented Nov 18, 2024

I did install the appropriate GPU torch, and Task Manager shows it is the GPU and not the CPU being used. Task Manager also shows dedicated GPU memory at 21.9/24.0 GB, so it is not maxed out there.

For the install I basically use these pip commands to get the requirements and gradio, and then swap CPU torch out for GPU torch.

pip install -r requirements.txt
pip install gradio
pip uninstall -y torch
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts torch==2.4.1+cu121 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
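
For reference, a quick sanity check that the CUDA build of torch is actually the one in use (a minimal sketch, independent of this repo):

import torch

print(torch.__version__)              # should report a +cu121 build for the wheel above
print(torch.cuda.is_available())      # should be True if the GPU build is active
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA GeForce RTX 4090"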

@SoftologyPro
Author

SoftologyPro commented Nov 18, 2024

For any other Windows users (or to help test this issue), here are an install.bat and a run.bat. Save them both to an empty directory, open a command prompt in that directory, run install.bat, then run run.bat to start it.

install.bat

@echo off

echo *** %time% *** Deleting LLaMa-Mesh directory if it exists
if exist LLaMa-Mesh\. rd /S /Q LLaMa-Mesh

echo *** %time% *** Cloning LLaMa-Mesh repository
git clone https://github.com/nv-tlabs/LLaMa-Mesh
cd LLaMa-Mesh

echo *** %time% *** Creating venv
python -m venv venv

echo *** %time% *** Activating venv
call venv\scripts\activate.bat

echo *** %time% *** Installing requirements
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install gradio

echo *** %time% *** Installing GPU torch
pip uninstall -y torch
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts torch==2.4.1+cu121 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

call venv\scripts\deactivate.bat
cd ..
echo *** %time% *** Finished LLaMa-Mesh install
echo.
echo Check the stats for any errors.  Do not assume it worked.
pause

run.bat

@echo off
cd LLaMa-Mesh
call venv\scripts\activate.bat
python app.py
call venv\scripts\deactivate.bat
cd ..

@SoftologyPro
Author

After well over an hour of processing it did finish, but this was the result for "Create a 3D mesh of a ginger and white kitten dancing wearing a tutu":

[screenshot]

@SoftologyPro
Author

SoftologyPro commented Nov 19, 2024

Testing the first example prompt gives this error after clicking it:

Traceback (most recent call last):
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\venv\lib\site-packages\gradio\queueing.py", line 624, in process_events
    response = await route_utils.call_process_api(
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\venv\lib\site-packages\gradio\route_utils.py", line 323, in call_process_api
    output = await app.get_blocks().process_api(
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\venv\lib\site-packages\gradio\blocks.py", line 2015, in process_api
    result = await self.call_function(
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\venv\lib\site-packages\gradio\blocks.py", line 1574, in call_function
    prediction = await utils.async_iteration(iterator)
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\venv\lib\site-packages\gradio\utils.py", line 710, in async_iteration
    return await anext(iterator)
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\venv\lib\site-packages\gradio\utils.py", line 815, in asyncgen_wrapper
    response = await iterator.__anext__()
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\venv\lib\site-packages\gradio\chat_interface.py", line 678, in _stream_fn
    first_response = await async_iteration(generator)
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\venv\lib\site-packages\gradio\utils.py", line 710, in async_iteration
    return await anext(iterator)
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\venv\lib\site-packages\gradio\utils.py", line 704, in __anext__
    return await anyio.to_thread.run_sync(
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\venv\lib\site-packages\gradio\utils.py", line 687, in run_sync_iterator_async
    return next(iterator)
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\app.py", line 158, in chat_llama3_8b
    for text in streamer:
  File "D:\Tests\LLaMA-Mesh\LLaMa-Mesh\venv\lib\site-packages\transformers\generation\streamers.py", line 223, in __next__
    value = self.text_queue.get(timeout=self.timeout)
  File "D:\Python\lib\queue.py", line 179, in get
    raise Empty
_queue.Empty

Is this because it does not put the prompt text into the "Type a message" field?

[screenshot]

If I reload the UI and manually type the prompt "Create a 3D model of a wooden hammer" into the "Type a message" field it does then start without error.

@oursland
Contributor

There are two sets of pre-written prompts, one above the entry box and one below. The ones above give me an error, but the ones below seem to work.

@SoftologyPro
Author

> There are two sets of pre-written prompts, one above the entry box and one below. The ones above give me an error, but the ones below seem to work.

I only see the example buttons, and I clicked the first of those, i.e.:

[screenshot]

@SoftologyPro changed the title from "Slow speed" to "Slow speed on Windows" on Nov 19, 2024
@oursland
Contributor

oursland commented Nov 21, 2024

Here's what I see on my machine.

[screenshot]

The buttons in the upper box ("Gradio ChatInterface") do not seem to work, but the buttons below ("Examples") do.

@SoftologyPro
Author

> Here's what I see on my machine.
>
> [screenshot]
>
> The buttons in the upper box ("Gradio ChatInterface") do not seem to work, but the buttons below ("Examples") do.

Anyway, you are on a Mac, and this has nothing to do with the issue I am trying to get an answer to. You should start your own issue.

@SoftologyPro
Author

Someone posted, then deleted, a suggestion to try flash-attn.
I tried that; it is not any faster. Any other ideas? Thanks.
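
For reference, flash attention is typically enabled at model load time in transformers (a minimal sketch, assuming transformers >= 4.36 with the flash-attn package installed; note that flash-attn has limited prebuilt wheel support on Windows, and the checkpoint id below is an assumption based on app.py):

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Zhengyi/LLaMA-Mesh",                    # assumed checkpoint id; match app.py
    torch_dtype=torch.bfloat16,              # flash-attn requires fp16/bf16, not fp32
    attn_implementation="flash_attention_2",
    device_map="cuda",
)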

@thuwzy
Collaborator

thuwzy commented Nov 21, 2024

Are you using bf16? It's much faster than fp32.

@SoftologyPro
Author

> Are you using bf16? It's much faster than fp32.

How do I set that? I do not see either in app.py.
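
For context, if app.py does not pass a torch_dtype to from_pretrained, transformers loads the weights in float32 by default. A one-line check of what is actually in use (a sketch, assuming model is the loaded transformers model):

print(next(model.parameters()).dtype)  # torch.float32 means the slow full-precision path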

@jpschw

jpschw commented Nov 24, 2024

Try this:

model = model.to(torch.bfloat16)

Unfortunately I get an OOM error after switching to bf16 on a 4070 Ti =(
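
If bf16 alone still OOMs on a smaller card, one possible fallback (an assumption, not something tried in this thread) is loading the weights 4-bit quantized via bitsandbytes:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # store weights in 4-bit, compute in bf16
)
model = AutoModelForCausalLM.from_pretrained(
    "Zhengyi/LLaMA-Mesh",        # assumed checkpoint id; match app.py
    quantization_config=bnb,
    device_map="auto",
)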

@Kingdroper

> Testing the first example prompt gives this error after clicking it:
>
> [same traceback and screenshot as in the earlier comment]
>
> If I reload the UI and manually type the prompt "Create a 3D model of a wooden hammer" into the "Type a message" field it does then start without error.

I run this on NVIDIA L20 GPUs, and it does not work!
