Hello, I want to run Qwen/Qwen2-VL-72B-Instruct-AWQ on my local computer. I currently have 2x RTX 3090, but I run into OOM errors. I then noticed the --max-memory option in vision.py for offloading to CPU. Could you please implement it for Qwen/Qwen2-VL-72B-Instruct-AWQ as well?
I'd like to do this, and for other models as well, but as far as I know it's awkward and vastly complicates the loading. I'd like to enable it and have had it in the back of my mind for a while.
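For reference, here is a minimal sketch of what a max_memory-style CPU offload could look like for this checkpoint using transformers + accelerate. This is not the project's actual vision.py loader, and the 22GiB/96GiB split is only an illustrative assumption for a 2x 3090 box:

```python
# Hypothetical sketch: offloading an AWQ Qwen2-VL checkpoint across two GPUs
# and system RAM via accelerate's device_map + max_memory.
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor

model_id = "Qwen/Qwen2-VL-72B-Instruct-AWQ"

# Example budget: cap each 3090 a bit under 24GiB and spill the rest to CPU RAM.
# These numbers are assumptions; tune them to your machine.
max_memory = {0: "22GiB", 1: "22GiB", "cpu": "96GiB"}

model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",      # let accelerate place layers across GPUs and CPU
    max_memory=max_memory,  # layers that don't fit on the GPUs go to CPU
)
processor = AutoProcessor.from_pretrained(model_id)
```

Note that CPU-offloaded layers are much slower than GPU-resident ones, so generation speed drops noticeably once a 72B model spills into system RAM.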