Releases: Nexesenex/croco.cpp
Croco.Cpp_FrankenFork_v1.80002_b4229
Ikawrakow's new IQ_K quants are available for inference on Cuda.
- IQ2_K, IQ3_K, IQ4_K, IQ5_K and IQ6_K.
Almost no models, if any, are quantized with them and shared on HF yet.
But it's one step ahead.
The newer quants of IK are a bit harder for me to implement: I can't use the .c files of Llama.CPP and need to integrate IK's work directly (in C++), so it'll take a bit longer. I'm basically learning as I go.
It works when launched from Python; I'm compiling an .exe for Pascal, Turing, and beyond right now.
Edit : I can't make a working .exe right now. I'll see what's up later.
What you can try in the meantime, if you don't know a better way :
Download the source, put the dll in the repository, install the requirements with the Install requirements.bat, then launch with Croco.Cpp_python_launch.bat
Non Cuda users, use the previous version. No IQ_K quants there yet, though.
I attached a compiled version of IK_LLAMA_CPP, with some edits of mine. Credits go to Ikawrakow.
Croco.Cpp_FrankenFork_v1.80001_b4229
The usual, plus :
- Q6_0 quants supported, including the MMQ mode in Cuda (thanks Ikawrakow)
- KV cache (Flash attention) mode K q6_0 / V q5_0 warmly recommended: very close to q8_0/q5_1 in terms of quality, and vastly superior to the previous best compromise, q5_1/q5_0. (thanks Ikawrakow)
- Image generation works again (Cuda and Vulkan tested); it was broken on previous Croco versions.
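To put the KV cache recommendation above in numbers, here is a rough sketch of the storage cost of each K/V pair in bits per cached value. It assumes the usual ggml-style layout of 32 values per block with the block byte sizes listed in the code (my understanding of the formats, not something stated in these notes), and it assumes K and V hold the same number of values, which is true for most models but not all. The helper names are made up for this example.

```python
# Bytes per 32-value block for each quant (assumed ggml / ik_llama.cpp layouts).
BLOCK_BYTES = {"q4_0": 18, "q5_0": 22, "q5_1": 24, "q6_0": 26, "q8_0": 34}
VALUES_PER_BLOCK = 32


def bpw(quant):
    """Bits stored per value for one cache type."""
    if quant == "f16":
        return 16.0
    return BLOCK_BYTES[quant] * 8 / VALUES_PER_BLOCK


def kv_pair_bpw(k_quant, v_quant):
    """Average bits per value when K and V have the same number of values."""
    return (bpw(k_quant) + bpw(v_quant)) / 2


if __name__ == "__main__":
    for k, v in [("f16", "f16"), ("q8_0", "q5_1"), ("q6_0", "q5_0"), ("q5_1", "q5_0")]:
        pair = kv_pair_bpw(k, v)
        # Size is shown relative to an unquantized f16/f16 cache.
        print(f"K {k} / V {v}: {pair:.2f} bpw, {pair / 16:.0%} of f16")
```

Under these assumptions, K q6_0 / V q5_0 averages 6.0 bpw, which sits between q8_0/q5_1 (7.25 bpw) and q5_1/q5_0 (5.75 bpw). The quality comparison itself comes from the release note, not from this arithmetic.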
Full Changelog: v1.78003_b4067...v1.80001_b4229
Most credits go to Concedo, for KoboldCPP, and to the LlamaCPP team.
Croco.Cpp_FrankenFork_v1.78003_b4067
Test release, for the adventurous minds.
Croco.Cpp_FrankenFork_v1.77009_b3972
Contextshift doesn't depend on Noshift anymore; that parameter is gone.
Just tick the button in the GUI like before, or try --contextshift in the CLI.
Full Changelog: v1.77008_b3962...v1.77009_b3962
Croco.Cpp_FrankenFork_v1.77008_b3972
Lazy release with Concedo's merge ordeal included.
Cuda backend problems with K cache non-FA are fixed, but beware of certain K non-FA / KV FA quants with Qwen 2.5 models (at least the 1.5b); see the GUI's KV slider for details.
Full Changelog: v1.77005_b3962...v1.77008_b3962
Croco.Cpp_FrankenFork_v1.77006_b3972
Test release for Cuda bugfix.
Needs more testing; feedback appreciated. Please include :
-> Your CPU, GPU, RAM, OS, and the model tested.
If you're motivated, here's my checklist, from which you can pick :
-> In full Cuda offload : does it work?
-> In full Cuda offload with lowvram argument : does it work?
-> In full Cuda offload with mmq argument : does it work?
-> In full Cuda offload with mmq AND lowvram argument : does it work?
-> In partial offload : does it work?
-> In Cuda mode but without layers on the GPU : does it work?
-> Each with KV quant mode 0 (FA, non FA), 1, 9, 14 (FA), 16 and 17 (No FA).
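For anyone who wants to work through the whole list, the combinations above can be enumerated with a short script. This is only a sketch: the scenario labels are copied from the checklist, and the pairing of KV quant modes with FA / no FA is my reading of the last line (mode 0 both ways, modes 1, 9 and 14 with FA, modes 16 and 17 without), so adjust it if that reading is wrong. It prints a checklist and does not launch anything.

```python
from itertools import product

# Offload scenarios, copied from the checklist.
OFFLOAD_SCENARIOS = [
    "full Cuda offload",
    "full Cuda offload + lowvram",
    "full Cuda offload + mmq",
    "full Cuda offload + mmq + lowvram",
    "partial offload",
    "Cuda mode, no layers on the GPU",
]

# (KV quant mode, flash attention enabled) pairs. Assumed reading of the note.
KV_CASES = [
    (0, True), (0, False),
    (1, True), (9, True), (14, True),
    (16, False), (17, False),
]


def build_checklist():
    """Return one line per (scenario, KV case) combination to test."""
    lines = []
    for scenario, (mode, fa) in product(OFFLOAD_SCENARIOS, KV_CASES):
        fa_label = "FA" if fa else "no FA"
        lines.append(f"[ ] {scenario} | KV quant mode {mode} ({fa_label})")
    return lines


if __name__ == "__main__":
    checklist = build_checklist()
    print(f"{len(checklist)} combinations to test")
    print("\n".join(checklist))
```

With six scenarios and seven KV cases, that gives 42 combinations, so picking a handful and reporting them with your hardware details is already useful.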
Turing, Ampere, and Ada supported for now, Pascal and Maxwell to come later.
Croco.Cpp_FrankenFork_v1.77005_b3962
Bugfix for 1.77004, with :
- K q8_0 V F16 FA quant dropped.
- Non FA Quants dropped for now, only K q6_0 V F16 works (and it's the best anyway).
- Fixed the logic that passes the quant, FA, and no-shift settings.
- Updated KLite to 182.
Croco.Cpp_FrankenFork_v1.77004_b3962
New lazy release with :
- @ikawrakow's recent work on KV quants integrated: new KV quant IQ4_NL to replace Q4_0 (with -1% PPL when used for both K and V), and Q6_0, which is close to Q8_0 when used as the pair K Q6_0 / V Q5_0.
- Use the GUI to discover the new modes.
- Some of Ikawrakow's work on Cuda (on top of some of his work for CPU inference).
- Cuda Graph caching PR of Agray3.
- Some bugfixes (aka, the bugs created by yours truly). ^^
- Note : Llava users, be careful, it might not work or simply crash.
Second release : with the help fixed.
I'll make a longer readme when motivated.
Full Changelog: v1.76005_b3906...v1.77004_b3962
Croco.Cpp_FrankenFork_v1.77002_b3934
KVQ27 (iq4_nl) doesn't work; I'm leaving it for further testing. I mapped the deleted KV quants to the closest remaining quant of equal or lower bpw, so the config files keep working as they are until a stable KVQ cocktail of quants is chosen.
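The "closest equal or lower bpw" rule can be sketched as a small lookup. This is an illustration only, not the code used in Croco.Cpp: the function name is hypothetical, the bpw figures are the usual ggml block sizes as I understand them, and the quants in the usage lines are arbitrary examples, since the note does not say which ones were deleted.

```python
# Bits per value for common KV cache types (assumed from ggml-style block layouts).
BPW = {
    "f16": 16.0,
    "q8_0": 8.5,
    "q6_0": 6.5,
    "q5_1": 6.0,
    "q5_0": 5.5,
    "q4_1": 5.0,
    "q4_0": 4.5,
}


def fallback_quant(deleted, available):
    """Pick the available quant with the highest bpw that does not exceed the deleted one's."""
    limit = BPW[deleted]
    candidates = [q for q in available if BPW[q] <= limit]
    if not candidates:
        raise ValueError(f"no available quant at or below {limit} bpw")
    return max(candidates, key=lambda q: BPW[q])


if __name__ == "__main__":
    # Hypothetical examples: the remaining quants are arbitrary.
    print(fallback_quant("q8_0", ["f16", "q6_0", "q5_0"]))
    print(fallback_quant("q4_1", ["q5_0", "q4_0"]))
```

Rounding down rather than up means an old config never ends up using more memory than it did before, which is presumably why the equivalences were set that way.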
Croco.Cpp_FrankenFork_v1.76007_b3917
v1.76007_b3917