
Extremely slow inference time when deployed with Docker #267

Open
wuxx0620 opened this issue Dec 24, 2024 · 1 comment

Comments

@wuxx0620

Downloading [config/path.yaml]: 100%|██████████| 309/309 [00:00<00:00, 981B/s]
Downloading [README.md]: 100%|██████████| 1.88k/1.88k [00:00<00:00, 5.70kB/s]
Downloading [asset/DVAE.safetensors]: 100%|██████████| 57.6M/57.6M [00:02<00:00, 28.5MB/s]
Downloading [asset/tokenizer/special_tokens_map.json]: 100%|██████████| 7.66k/7.66k [00:00<00:00, 24.8kB/s]
Downloading [asset/DVAE_full.pt]: 100%|██████████| 57.6M/57.6M [00:02<00:00, 28.6MB/s]
Downloading [asset/spk_stat.pt]: 100%|██████████| 4.16k/4.16k [00:00<00:00, 14.1kB/s]
Downloading [asset/tokenizer/tokenizer.json]: 100%|██████████| 438k/438k [00:00<00:00, 940kB/s]
Downloading [asset/tokenizer/tokenizer_config.json]: 100%|██████████| 10.8k/10.8k [00:00<00:00, 34.1kB/s]
Downloading [asset/tokenizer.pt]: 100%|██████████| 329k/329k [00:00<00:00, 655kB/s]
Downloading [config/vocos.yaml]: 100%|██████████| 460/460 [00:00<00:00, 1.46kB/s]
Downloading [asset/Decoder.safetensors]: 100%|██████████| 98.9M/98.9M [00:03<00:00, 29.8MB/s]
Downloading [asset/Vocos.pt]: 100%|██████████| 51.8M/51.8M [00:01<00:00, 33.8MB/s]
Downloading [asset/Vocos.safetensors]: 100%|██████████| 51.8M/51.8M [00:01<00:00, 31.6MB/s]
Downloading [asset/Embed.safetensors]: 100%|██████████| 139M/139M [00:04<00:00, 33.1MB/s]
Downloading [asset/Decoder.pt]: 100%|██████████| 98.9M/98.9M [00:05<00:00, 20.3MB/s]
Downloading [asset/GPT.pt]: 100%|██████████| 859M/859M [00:10<00:00, 85.4MB/s]
Downloading [asset/gpt/model.safetensors]: 100%|██████████| 814M/814M [00:16<00:00, 51.6MB/s]
Fetching 23 files: 100%|██████████| 23/23 [00:19<00:00, 1.18it/s]
2024-12-24 09:48:35,835 - modelscope - INFO - Download model 'AI-ModelScope/ChatTTS' successfully.
2024-12-24 09:48:35,856 - INFO - current device type: cuda
2024-12-24 09:48:35,856 - INFO - try to load from local: /app/asset/AI-ModelScope/ChatTTS
2024-12-24 09:48:35,856 - INFO - checking assets...
2024-12-24 09:48:36,938 - INFO - all assets are already latest.
2024-12-24 09:48:37,357 - INFO - vocos loaded.
2024-12-24 09:48:37,559 - INFO - dvae loaded.
2024-12-24 09:48:41,170 - WARNING - use default LlamaModel for importing TELlamaModel error: Command 'ldconfig -p | grep 'libnvrtc'' returned non-zero exit status 1.
2024-12-24 09:48:42,056 - INFO - gpt loaded.
2024-12-24 09:48:42,381 - INFO - decoder loaded.
2024-12-24 09:48:42,396 - INFO - tokenizer loaded.
2024-12-24 09:48:42,396 - INFO - all models has been initialized.
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
2024-12-24 09:50:11,078 - INFO - input:text='12344' custom_voice=1983 voice='2222' temperature=0.3 top_p=0.7 top_k=20 skip_refine=0 speed=4 text_seed=42 refine_max_new_token=384 infer_max_new_token=2048 is_stream=0 prompt='' audio_format='wav'
2024-12-24 09:50:11,140 - INFO - all models has been initialized.
text: 2%|▏ | 9/384(max) [01:50, 12.23s/it]
code: 2%|▏ | 51/2048(max) [00:35, 1.43it/s]
/home/venv/lib/python3.9/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
return F.conv1d(input, weight, bias, self.stride,
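
The refine step running at over 12 s/it, together with the TELlamaModel fallback and the cuDNN "plan failed" warning above, suggests the CUDA/cuDNN stack inside the container may not be set up the way PyTorch expects. A minimal diagnostic sketch like the one below (not part of ChatTTS; an assumption is that it is run inside the container, in the same Python environment as the server) can confirm what torch actually sees:

```python
# Minimal diagnostic sketch (assumption: run inside the container, same venv as the API).
import subprocess

import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU           :", torch.cuda.get_device_name(0))
    print("torch CUDA    :", torch.version.cuda)
    print("cuDNN version :", torch.backends.cudnn.version())

# Re-run the exact check that failed in the log; an empty result means libnvrtc
# is not visible to the dynamic linker inside the image, which is why the code
# fell back to the default LlamaModel.
out = subprocess.run("ldconfig -p | grep libnvrtc", shell=True,
                     capture_output=True, text=True)
print(out.stdout.strip() or "libnvrtc not found in the ldconfig cache")
```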

@wuxx0620
Author

wuxx0620 commented Dec 24, 2024

Host machine driver version:
NVIDIA-SMI 535.171.04 Driver Version: 535.171.04 CUDA Version: 12.2
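
Driver 535.171.04 / CUDA 12.2 on the host should be recent enough for current PyTorch builds; the question is whether the container actually sees that driver and which CUDA build of torch is installed. A quick check, assuming nvidia-smi is exposed inside the container by the NVIDIA container toolkit:

```python
# Compare the driver visible inside the container with the CUDA version torch was built for.
# Assumption: nvidia-smi is available in the container (normally mounted by the NVIDIA container toolkit).
import subprocess

import torch

driver = subprocess.run(
    ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
    capture_output=True, text=True,
).stdout.strip()
print("Driver seen in container:", driver or "nvidia-smi not available")
print("torch built against CUDA:", torch.version.cuda)
```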
