
tool-call: fix type promotion typo causing crashes w/ --jinja w/o tools #11880

Merged
merged 1 commit into from
Feb 15, 2025

Conversation

ochafik
Collaborator

@ochafik ochafik commented Feb 15, 2025

Fixes (at least part of) #11866

Somehow a bool can be assigned to an std::string 🤯: new fear unlocked 😅

This introduced varying degrees of soft or hard crashes, as the grammar string got a true (byte 1) followed by... something random / platform-specific.

Different compilers seem to let this through to different degrees:

bug.cc

// g++ -Wall -Wextra -Wpedantic bug.cc -o out
#include <string>
int main() {
  std::string s;
  // s = false; // this is caught by Apple clang version 15.0.0 but not GCC 14
  s = s.empty(); // caught by neither, even w/ -Wall -Wextra -Wpedantic 
}

@ochafik ochafik changed the title tool-call: fix no-tools, jinja case tool-call: fix --jinja w/o tools case Feb 15, 2025
@ochafik ochafik marked this pull request as ready for review February 15, 2025 01:01
@ochafik ochafik requested a review from ggerganov February 15, 2025 01:05
@ochafik ochafik changed the title tool-call: fix --jinja w/o tools case tool-call: fix --jinja w/o tools (type promotion typo caused crashes) Feb 15, 2025
@henryclw

Now the fear is unlocked for me as well. I can't assume my compiler will catch all the type errors.

@MoonRide303
Contributor

@ochafik eafd957 helped for Gemma 2 (the parse: error parsing grammar: expecting name at message is gone), but both Llama 3.2 and Qwen2.5 now fail to start when trying to load the original jinja template from a file. There are no errors; both just silently quit after printing the device info:

llama-server.exe -ngl 99 -m Qwen2.5-1.5B-Instruct-Q8_0.gguf --jinja --chat-template-file qwen2.5.jinja -c 8192
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 4080, compute capability 8.9, VMM: yes

In the case of Gemma I also noticed a "srv params_from_: Chat format: Content-only" message in the server log, which wasn't printed when I didn't use a jinja template.

@ggerganov
Member

clang-tidy detects these kind of errors:

[screenshot: clang-tidy warning flagging the bool-to-string assignment]

Overall the chat.cpp could use some tidy-ing, as it currently "lights up" in my editor and it's difficult to filter the relevant warnings from the less-relevant:

[screenshot: editor showing numerous clang-tidy warnings in chat.cpp]

@ochafik ochafik changed the title tool-call: fix --jinja w/o tools (type promotion typo caused crashes) tool-call: fix type promotion typo causing crashes w/ --jinja w/o tools Feb 15, 2025
@ochafik
Collaborator Author

ochafik commented Feb 15, 2025

clang-tidy detects these kind of errors:
...
Overall the chat.cpp could use some tidy-ing, as it currently "lights up" in my editor and it's difficult to filter the relevant warnings from the less-relevant:
...

I'll do a pass separately / see whether we can run checks in CI. In some of the cases above, clang-tidy and -Wunused-parameter disagree on whether an unused parameter should have a name (I'll try to use // NOLINT sparingly :-))

@ochafik
Collaborator Author

ochafik commented Feb 15, 2025

@ochafik eafd957 helped for Gemma 2 (the parse: error parsing grammar: expecting name at message is gone), but both Llama 3.2 and Qwen2.5 now fail to start when trying to load the original jinja template from a file. There are no errors; both just silently quit after printing the device info:

llama-server.exe -ngl 99 -m Qwen2.5-1.5B-Instruct-Q8_0.gguf --jinja --chat-template-file qwen2.5.jinja -c 8192
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 4080, compute capability 8.9, VMM: yes

@henryclw Thanks for checking! I can't repro on Mac, will spin up a Windows VM & track this on #11866

In case of Gemma I also noticed "srv params_from_: Chat format: Content-only" message in the server log, which wasn't printed when I didn't use jinja template.

This is expected 👍

@ochafik ochafik merged commit f355229 into ggml-org:master Feb 15, 2025
46 checks passed