This repository has been archived by the owner on Sep 27, 2024. It is now read-only.

"llama.cpp error: 'error loading model vocabulary: unknown pre-tokenizer type: 'command-r''" #44

Closed
Iory1998 opened this issue May 8, 2024 · 2 comments

Comments


Iory1998 commented May 8, 2024

Hi,
I hope this message finds you well.

I recently ran into a persistent problem: each time I try to run a Command-R-based model, I get the error message below and the model just doesn't load. The same models load just fine in the Oobabooga web UI.

The complete error message is as follows:

"llama.cpp error: 'error loading model vocabulary: unknown pre-tokenizer type: 'command-r''"
Diagnostics info:

```
{
  "memory": {
    "ram_capacity": "31.75 GB",
    "ram_unused": "22.74 GB"
  },
  "gpu": {
    "type": "NvidiaCuda",
    "vram_recommended_capacity": "24.00 GB",
    "vram_unused": "22.76 GB"
  },
  "os": {
    "platform": "win32",
    "version": "10.0.22631",
    "supports_avx2": true
  },
  "app": {
    "version": "0.2.22",
    "downloadsDir": "D:\\LM Studio\\models"
  },
  "model": {}
}
```
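
For anyone debugging this: the pre-tokenizer type named in the error is ordinary metadata stored inside the GGUF file itself, under the key `tokenizer.ggml.pre`. Here is a minimal sketch for inspecting it, assuming the `gguf` Python package from llama.cpp's gguf-py; the parts/data indexing below is an assumption about its reader layout, not a documented API:

```
# Sketch: print the pre-tokenizer type a GGUF file declares.
# Assumes the `gguf` package from llama.cpp's gguf-py (pip install gguf).
from gguf import GGUFReader

def print_pre_tokenizer(path: str) -> None:
    # Memory-maps the file and parses its metadata fields.
    reader = GGUFReader(path)
    field = reader.fields.get("tokenizer.ggml.pre")
    if field is None:
        print("no tokenizer.ggml.pre field (likely an older GGUF)")
        return
    # Assumption: for string-valued fields, the value bytes live in the
    # part indexed by field.data[0].
    value = bytes(field.parts[field.data[0]]).decode("utf-8")
    print("tokenizer.ggml.pre =", value)

print_pre_tokenizer("35b-beta-long-Q4_K_M.gguf")  # hypothetical local path
```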

Models I tried:
https://huggingface.co/bartowski/35b-beta-long-GGUF/blob/main/35b-beta-long-Q4_K_M.gguf
https://huggingface.co/MarsupialAI/Coomand-R-35B-v1_iMatrix_GGUF/blob/main/Coomand-R-35B-v1_iQ3m.gguf
https://huggingface.co/TheDrummer/Coomand-R-35B-v1-GGUF/blob/main/Coomand-R-35B-v1-Q3_K_M.gguf

Each time I try to load these models, I get the same error.

Could you please shed some light on the issue and provide a fix?

Thank you in advance :)

@constantOut

I've got a similar error with a different model: https://huggingface.co/YorkieOH10/granite-8b-code-instruct-Q8_0-GGUF

```
{
  "title": "Failed to load model",
  "cause": "llama.cpp error: 'error loading model vocabulary: unknown pre-tokenizer type: 'refact''",
  "errorData": {
    "n_ctx": 1500,
    "n_batch": 512,
    "n_gpu_layers": 37
  },
  "data": {
    "memory": {
      "ram_capacity": "31.70 GB",
      "ram_unused": "9.69 GB"
    },
    "gpu": {
      "type": "NvidiaCuda",
      "vram_recommended_capacity": "16.00 GB",
      "vram_unused": "14.87 GB"
    },
    "os": {
      "platform": "win32",
      "version": "10.0.22631",
      "supports_avx2": true
    },
    "app": {
      "version": "0.2.22",
      "downloadsDir": "C:\\Users\\acc4k\\.cache\\lm-studio\\models"
    },
    "model": {}
  }
}
```


danton721 commented May 12, 2024

Pending support from llama.cpp; ongoing discussion here:

ggerganov/llama.cpp#7116
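
For context: llama.cpp's vocabulary loader dispatches on that `tokenizer.ggml.pre` string against a hard-coded list of pre-tokenizer names it knows how to handle, and it throws when a GGUF declares one the build doesn't know yet, which is why an updated llama.cpp (and an LM Studio release bundling it) resolves this. A rough sketch of the shape of that check; the real code is C++ inside llama.cpp, and the name list here is illustrative, not the actual list:

```
# Illustration only: mirrors the shape of llama.cpp's vocab-loading
# check, not its actual code or its real list of known names.
KNOWN_PRE_TOKENIZERS = {"default", "llama3", "falcon", "mpt", "starcoder", "gpt-2"}

def load_vocab(pre_tokenizer: str) -> None:
    if pre_tokenizer not in KNOWN_PRE_TOKENIZERS:
        raise RuntimeError(
            "error loading model vocabulary: "
            f"unknown pre-tokenizer type: '{pre_tokenizer}'"
        )
    # ... tokenizer construction would continue here ...

load_vocab("command-r")  # raises the error reported in this issue
```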
