[BUG] "topk_cpu" not implemented for 'Half' #31

Lyaaaaaaaaaaaaaaa · 2024-05-08T12:59:39Z

Describe the bug
The server crashes during generation when a model is loaded with float16 and CUDA is not available.

To Reproduce
Steps to reproduce the behavior:

1. Run without CUDA
2. Load an AI with float16
3. Generate something
4. See error

Expected behavior
It should fall back to float32 when CUDA is not available, instead of crashing.

UserWarning: You are calling .generate() with the input_ids being on a device type different than your model's device. input_ids is on cpu, whereas the model is on cuda. You may experience unexpected behaviors or slower generation. Please make sure that you have put input_ids to the correct device by calling for example input_ids = input_ids.to('cuda') before running .generate().
warnings.warn(
"topk_cpu" not implemented for 'Half'


config.py:
- Added TORCH_DTYPE_SAFETY.
model.py:
- Updated _load_model to force torch_dtype to float32 when CUDA isn't available (only if config.TORCH_DTYPE_SAFETY is True). Otherwise, generation fails with the error above. See #31
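
The fallback described in this commit message could look roughly like the sketch below. It is not the code from the commit: the function name `resolve_torch_dtype` and its parameters are hypothetical, and dtypes are passed as plain strings so the decision logic can be checked without a GPU or even PyTorch installed.

```python
def resolve_torch_dtype(requested: str, cuda_available: bool,
                        dtype_safety: bool = True) -> str:
    """Return the name of the dtype the model should be loaded with.

    Hypothetical helper: `dtype_safety` stands in for the
    TORCH_DTYPE_SAFETY option mentioned in the commit message.
    """
    # Without CUDA, some half-precision kernels (such as topk) may be
    # missing on CPU, so force float32 when the safety switch is on.
    if dtype_safety and not cuda_available:
        return "float32"
    # Otherwise keep whatever the user asked for.
    return requested


# Example decisions for a float16 request:
print(resolve_torch_dtype("float16", cuda_available=False))  # float32
print(resolve_torch_dtype("float16", cuda_available=True))   # float16
```

In a real loader, `cuda_available` would presumably come from `torch.cuda.is_available()`, and the returned name could be turned into a dtype with `getattr(torch, name)` before being passed as `torch_dtype` when loading the model.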

Lyaaaaaaaaaaaaaaa · 2024-05-10T07:48:22Z

"Fixed" by commit 6b76d8d

Lyaaaaaaaaaaaaaaa added the bug Something isn't working label May 8, 2024

Lyaaaaaaaaaaaaaaa self-assigned this May 8, 2024

Lyaaaaaaaaaaaaaaa mentioned this issue May 10, 2024

Added a fallback to change torch_dtype if cuda isn't available. #32

Merged
