Commit
- Upgrade to cosmocc 3.3.6
- Remove warnings from cuda build
- Fix bug in llamafile_trapping_enabled
- Refactor the new vectorized expf() code
- iqk_mul_mat() only needs codegen for AVX2
- Be less gung ho about the -ngl flag in README
- Restore shell scriptability fix for new tokenizer
- Suppress divide by zero errors in llama_print_timings()
- Cut back on tinyBLAS CPU multiple output type kernels
- Cut back NVIDIA fat binary releases to -arch=all-major
- Remove GA (won't rely on slow broken irregular cloud dev tools)
- Cut flash_attn_ext from release binaries (use --recompile to have it)
Showing 24 changed files with 242 additions and 467 deletions.