Croco.Cpp_FrankenFork_v1.80002_b4229

@Nexesenex Nexesenex released this 08 Dec 19:36
· 181 commits to croco_exp_0 since this release

Ikawrakow's new IQ_K quants are now available for inference on CUDA.

  • IQ2_K, IQ3_K, IQ4_K, IQ5_K and IQ6_K.
    Almost no models, if any, are quantized with them and shared on HF yet.
    But it's a step ahead.
    IK's newer quants are a bit harder for me to implement: I can't use the .c files of Llama.CPP and need to integrate IK's work (in C++) directly, so it'll take a bit longer. I'm basically learning as I go.

It works when launched from Python. I'm compiling an .exe for Pascal, Turing, and newer GPUs right now.

Edit: I can't make a working .exe right now. I'll see what's up later.

What you can try if you don't know a better way:
Download the source, put the .dll in the repository folder, install the requirements with Install requirements.bat, then launch with Croco.Cpp_python_launch.bat.

Non-CUDA users: use the previous version. It has no IQ_K quants yet, though.

I've attached a compiled version of IK_LLAMA_CPP, with some edits of mine. Credits go to Ikawrakow.