
[Bug] #945

Open · 2 of 6 tasks

thiagoldaniel opened this issue Nov 24, 2024 · 1 comment
Priority

P1-Stopper

OS type

Ubuntu

Hardware type

Xeon-other (Please let us know in description)

Installation method

  • Pull docker images from hub.docker.com
  • Build docker images from source

Deploy method

  • Docker compose
  • Docker
  • Kubernetes
  • Helm

Running nodes

Single Node

What's the version?

When running the container, I receive this error message while validating the service:
2024-11-24T02:42:50.232633Z INFO download: text_generation_launcher: Starting check and download process for Intel/neural-chat-7b-v3-3
2024-11-24T02:43:02.184459Z ERROR download: text_generation_launcher: Download process was signaled to shutdown with signal 4:

Description

Error when starting the Docker container.

Reproduce steps

Follow the Getting Started guide: https://opea-project.github.io/latest/getting-started/README.html

Raw log

2024-11-24T02:42:33.812733Z  INFO hf_hub: Token file not found "/root/.cache/huggingface/token"
2024-11-24T02:42:33.831721Z  INFO text_generation_launcher: Model supports up to 32768 but tgi will now set its default to 4096 instead. This is to save VRAM by refusing large prompts in order to allow more users on the same hardware. You can increase that size using `--max-batch-prefill-tokens=32818 --max-total-tokens=32768 --max-input-tokens=32767`.
2024-11-24T02:42:50.231954Z  WARN text_generation_launcher::gpu: Cannot determine GPU compute capability: AssertionError: Torch not compiled with CUDA enabled
2024-11-24T02:42:50.232133Z  INFO text_generation_launcher: Using attention paged - Prefix caching 0
2024-11-24T02:42:50.232276Z  INFO text_generation_launcher: Default `max_input_tokens` to 4095
2024-11-24T02:42:50.232332Z  INFO text_generation_launcher: Default `max_total_tokens` to 4096
2024-11-24T02:42:50.232360Z  INFO text_generation_launcher: Default `max_batch_prefill_tokens` to 4145
2024-11-24T02:42:50.232633Z  INFO download: text_generation_launcher: Starting check and download process for Intel/neural-chat-7b-v3-3
2024-11-24T02:43:02.184459Z ERROR download: text_generation_launcher: Download process was signaled to shutdown with signal 4:
Error: DownloadError
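
Note: signal 4 is SIGILL (illegal instruction). On CPU-only Xeon hosts this often indicates that the prebuilt PyTorch/TGI binaries use instruction-set extensions (e.g. AVX2/AVX-512) that the processor does not support; that is an assumption about the likely cause here, not something confirmed in this issue. A quick way to check which AVX variants the host CPU advertises:

  # List the AVX instruction-set extensions reported by the CPU
  grep -o -E 'avx512[a-z_]*|avx2|avx' /proc/cpuinfo | sort -u

If avx2 (or the avx512* family) is missing from the output, images built for newer Xeons can fail in exactly this way during the model download/warmup phase.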

Attachments

No response

@wangkl2 wangkl2 self-assigned this Nov 25, 2024
@wangkl2 wangkl2 added the aitce label Nov 25, 2024
@wangkl2 (Collaborator) commented Nov 25, 2024

@thiagoldaniel I cannot reproduce this issue on my end. May I ask which Xeon product/SKU you are using?
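
For reference, the exact SKU can be read from the host with a standard Linux command (generic diagnostics, not specific to this project):

  # Print the processor model string, e.g. "Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz"
  lscpu | grep -m1 'Model name'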
