Partial fix phi3 device mapping #1002

cdoko · 2024-12-24T09:32:11Z

This is a partial fix for device mapping on Phi3. Previously, device mapping failed on several models, including Phi2, Phi3, Mistral, and Llama.

The fix moves tensors that are used together in an operation onto the same device. I chose the device that holds the cache, on the assumption that moving the cache would be slower. With this change Phi3 can be loaded across devices; I tested it with 2 GPUs and with 1 GPU + 1 CPU.
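A minimal sketch of the idea, not the actual diff: the `Device` and `Tensor` types and the `append_to_cache` helper below are hypothetical stand-ins written for illustration, not the types used in this repository. The sketch only shows the placement rule, where the incoming tensor is copied to the cache's device and the cache itself is never moved.

```rust
/// Toy device handle (hypothetical; stands in for a real backend device type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Device {
    Cpu,
    Cuda(usize),
}

/// Toy tensor: flat data plus the device it is assumed to live on.
#[derive(Debug, Clone, PartialEq)]
struct Tensor {
    data: Vec<f32>,
    device: Device,
}

impl Tensor {
    fn new(data: Vec<f32>, device: Device) -> Self {
        Tensor { data, device }
    }

    /// Return a copy of this tensor placed on `target`.
    fn to_device(&self, target: Device) -> Tensor {
        Tensor {
            data: self.data.clone(),
            device: target,
        }
    }
}

/// Append `incoming` to `cache`, moving `incoming` to the cache's device first.
/// The cache stays where it is, on the assumption that it is the larger tensor.
fn append_to_cache(cache: &Tensor, incoming: &Tensor) -> Tensor {
    let incoming = if incoming.device == cache.device {
        incoming.clone()
    } else {
        incoming.to_device(cache.device)
    };
    let mut data = cache.data.clone();
    data.extend_from_slice(&incoming.data);
    Tensor {
        data,
        device: cache.device,
    }
}
```

For example, appending a CPU tensor to a cache on `Device::Cuda(0)` yields a result on `Device::Cuda(0)`, while the original CPU tensor is left untouched.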

The fix is only partial: Phi3 now loads without issues, but the other models still hit a CUDA_ERROR_ILLEGAL_ADDRESS error that prevents them from loading.

The CUDA_ERROR_ILLEGAL_ADDRESS error occurs at a different point for each model. In Mistral, for example, calling contiguous() on a tensor triggers it, as does moving a tensor across devices. I find it odd that Phi3 is the only model that works with this fix, and that operations like contiguous() work on Phi3 but not on the other models.

One thing is still broken: sending a second request with the same prompt produces gibberish output. This matches the current behavior when running with --no-paged-attn on a single device, so the issue is not introduced by this fix. I suspect a bug in the cache manager. Paged attention (PA) does not have this issue.

I appreciate any feedback you can provide. I look forward to your review!

github-actions · 2024-12-24T09:33:15Z

Code Metrics Report

===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 C Header                2           35           28            0            7
 Dockerfile              1           41           22           10            9
 JSON                   12          105          104            0            1
 Python                 63         2706         2338           71          297
 Shell                   1           57           22           18           17
 Plain Text              3         3723            0         2413         1310
 TOML                   18          603          536            2           65
 YAML                    2           21           19            2            0
-------------------------------------------------------------------------------
 Jupyter Notebooks       4            0            0            0            0
 |- Markdown             2           77           32           31           14
 |- Python               2          205          178            1           26
 (Total)                            282          210           32           40
-------------------------------------------------------------------------------
 Markdown               43         3324            0         2520          804
 |- BASH                 6          101           98            0            3
 |- JSON                 1           12           12            0            0
 |- Python               7          121          109            0           12
 |- Rust                12          406          344            0           62
 |- TOML                 2           75           63            0           12
 (Total)                           4039          626         2520          893
-------------------------------------------------------------------------------
 Rust                  289        87939        78969         1804         7166
 |- Markdown           139         1534           25         1395          114
 (Total)                          89473        78994         3199         7280
===============================================================================
 Total                 438        98554        82038         6840         9676
===============================================================================

cdoko added 2 commits December 24, 2024 04:53

move tensors to the same device

377b109

move tensors to the same device

aab4285

cdoko added 2 commits December 24, 2024 05:38

formatting

ce02cdd

Update cache_manager.rs

64e7ac7

cdoko closed this Dec 26, 2024
