Release Croco.Cpp_FrankenFork_v1.80002_b4229 · Nexesenex/croco.cpp

New IQ_K quants by Ikawrakow are now available for inference on CUDA.

IQ2_K, IQ3_K, IQ4_K, IQ5_K and IQ6_K.
Few models, if any, are quantized with them and shared on HF so far.
But it's a step forward.
IK's newer quants are a bit harder for me to implement: I can't use the .c files of Llama.CPP and have to integrate IK's work (in C++) directly, so it'll take a bit longer. I'm basically learning as I go.

It works when launched from Python. I'm compiling an .exe for Pascal, Turing, and newer GPUs right now.

Edit: I can't make a working .exe right now. I'll see what's up later.

What you can try if you don't have a better way:
Download the source, put the dll in the repository folder, install the requirements with Install requirements.bat, then launch with Croco.Cpp_python_launch.bat.
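
If the launch fails, a quick pre-flight check can tell you whether a file is simply missing. This is only a sketch of the steps above, not part of the release: the function name `preflight` is made up for illustration, the two .bat names are the ones mentioned above, and since the dll's name isn't given here it only checks that some .dll sits in the top level of the source folder (that location is an assumption).

```python
from pathlib import Path

# Script names as written in these release notes.
REQUIRED_SCRIPTS = ("Install requirements.bat", "Croco.Cpp_python_launch.bat")


def preflight(repo_dir):
    """Return a list of problems found; an empty list means nothing obvious is missing."""
    root = Path(repo_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]

    problems = []
    for name in REQUIRED_SCRIPTS:
        if not (root / name).is_file():
            problems.append(f"missing script: {name}")

    # Assumption: the dll is dropped in the top level of the source folder.
    # Its exact file name is not stated, so any *.dll counts.
    if not any(root.glob("*.dll")):
        problems.append("no .dll found in the source folder")

    return problems
```

Call `preflight(r"path\to\source")` from a Python prompt and read the returned list. It does not check that the requirements actually installed or that the dll matches your GPU.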

Non-CUDA users should use the previous version. It doesn't have the IQ_K quants yet, though.

I attached a compiled build of IK_LLAMA_CPP, with some edits of mine. Credits go to Ikawrakow.
