Releases · Nexesenex/croco.cpp

08 Dec 19:36

Nexesenex

v1.80002_b4229

fab59c8

Croco.Cpp_FrankenFork_v1.80002_b4229 Latest

New IQ_K quants by Ikawrakow are now available for inference on Cuda.

IQ2_K, IQ3_K, IQ4_K, IQ5_K and IQ6_K.
Few models, if any, are quantized with them and shared on HF yet.
But it's one step ahead.
IK's newer quants are a bit harder for me to implement: I can't use the .c files of Llama.CPP and need to integrate IK's work (in C++) directly, so it'll take a bit longer. Basically, I learn as I go.

It works when run from Python. I'm compiling an .exe for Pascal, Turing, and beyond right now.

Edit : I can't make a working .exe right now. I'll see what's up later.

What you can try if you don't know a better way :
Download the source, put the dll in the repository folder, install the requirements with Install requirements.bat, then launch with Croco.Cpp_python_launch.bat

Non Cuda users, use the previous version. No IQ_K quants there yet, though.

I attached a compiled version of IK_LLAMA_CPP, with some edits of mine. Credits go to Ikawrakow.

Assets 4

04 Dec 18:14

Nexesenex

v1.80001_b4229

22d017e

Croco.Cpp_FrankenFork_v1.80001_b4229

The usual, plus :

Q6_0 quants supported, including the MMQ mode in Cuda (thanks Ikawrakow)
KV cache (Flash attention) mode K q6_0 / V q5_0 warmly recommended, very close to q8_0/q5_1 in terms of quality, and vastly superior to the previous best compromise q5_1/q5_0. (thanks Ikawrakow)
Image generation works again (Cuda and Vulkan tested), it was broken on previous Croco versions.
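
To put the K/V pairings mentioned above in perspective, here is a rough size comparison. It is a minimal sketch, not taken from Croco.Cpp's code: the bits-per-weight figures are an assumption derived from the usual ggml block layouts (32 values per block with an fp16 scale, plus an fp16 minimum for the `_1` types), and the helper name `kv_bpw` is made up for illustration. It assumes the K and V caches hold the same number of elements, and it says nothing about quality.

```python
# Approximate bits per weight (bpw) of some cache types.
# Assumption: 32-value blocks with an fp16 scale (+ fp16 minimum for *_1 types).
BPW = {"f16": 16.0, "q8_0": 8.5, "q6_0": 6.5, "q5_1": 6.0, "q5_0": 5.5}


def kv_bpw(k_type: str, v_type: str) -> float:
    """Average bpw of a K/V pair, assuming K and V hold equally many elements."""
    return (BPW[k_type] + BPW[v_type]) / 2


if __name__ == "__main__":
    for k_type, v_type in [("q8_0", "q5_1"), ("q6_0", "q5_0"), ("q5_1", "q5_0")]:
        bpw = kv_bpw(k_type, v_type)
        # Size relative to an unquantized F16/F16 cache (16 bpw on both sides).
        print(f"K {k_type} / V {v_type}: {bpw:.2f} bpw, {bpw / 16:.0%} of an F16 cache")
```

Under these assumptions, K q6_0 / V q5_0 averages 6.0 bpw, against 7.25 bpw for q8_0/q5_1 and 5.75 bpw for q5_1/q5_0.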

Full Changelog: v1.78003_b4067...v1.80001_b4229

Most credits go to Concedo, for KoboldCPP, and to the LlamaCPP team.

Assets 5

13 Nov 09:20

Nexesenex

v1.78003_b4067

2d0ca08

Croco.Cpp_FrankenFork_v1.78003_b4067

Test release, for the adventurous minds.

Assets 3

24 Oct 22:20

Nexesenex

v1.77009_b3962

49c358a

Croco.Cpp_FrankenFork_v1.77009_b3972

Contextshift doesn't depend on Noshift anymore, that parameter is gone.
Just tick the box in the GUI like before, or try --contextshift on the CLI.

Full Changelog: v1.77008_b3962...v1.77009_b3962

Assets 3

24 Oct 20:57

Nexesenex

v1.77008_b3962

6bac9f3

Croco.Cpp_FrankenFork_v1.77008_b3972

Lazy release with Concedo's merge ordeal included.

Cuda backend problems with the non-FA K cache are fixed, but beware of certain K non-FA / KV FA quants with Qwen 2.5 models (at least the 1.5b); see the GUI's KV slider for details.

Full Changelog: v1.77005_b3962...v1.77008_b3962

Assets 3

24 Oct 16:46

Nexesenex

v.77006_b3972

9fbd023

Croco.Cpp_FrankenFork_v1.77006_b3972

Test release for Cuda bugfix.

Needs more testing, feedback appreciated.

-> Please report your CPU, GPU, RAM, OS, and the model tested.

If you're motivated, here's my checklist, from which you can pick :

-> In full Cuda offload : does it work?
-> In full Cuda offload with lowvram argument : does it work?
-> In full Cuda offload with mmq argument : does it work?
-> In full Cuda offload with mmq AND lowvram argument : does it work?
-> In partial offload : does it work?
-> In cuda mode but without layers on the GPU : does it work?

-> Each with KV quant mode 0 (FA, non FA), 1, 9, 14 (FA), 16 and 17 (No FA).
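
The test list above can be expanded into a flat checklist to tick off. This is only a sketch: the scenario labels are paraphrased from the list, the function name `build_checklist` is hypothetical, and the FA / non-FA pairing of the KV quant modes is my reading of the last line (mode 0 with and without FA; modes 1, 9 and 14 with FA; modes 16 and 17 without FA).

```python
from itertools import product

# Offload scenarios, paraphrased from the list above.
OFFLOAD_SCENARIOS = [
    "full Cuda offload",
    "full Cuda offload + lowvram",
    "full Cuda offload + mmq",
    "full Cuda offload + mmq + lowvram",
    "partial offload",
    "Cuda mode, no layers on the GPU",
]

# (KV quant mode, attention mode) pairs. Assumption: this is how the
# "0 (FA, non FA), 1, 9, 14 (FA), 16 and 17 (No FA)" line is meant to be read.
KV_MODES = [
    (0, "FA"), (0, "non FA"),
    (1, "FA"), (9, "FA"), (14, "FA"),
    (16, "non FA"), (17, "non FA"),
]


def build_checklist() -> list[str]:
    """Return one line per (offload scenario, KV quant mode) combination."""
    return [
        f"[ ] {scenario} | KV quant mode {mode} ({attention}) : does it work?"
        for scenario, (mode, attention) in product(OFFLOAD_SCENARIOS, KV_MODES)
    ]


if __name__ == "__main__":
    for line in build_checklist():
        print(line)
```

With this reading, the full matrix has 6 x 7 = 42 combinations, so picking a few rows that match your hardware is already useful feedback.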

Turing, Ampere, and Ada supported for now, Pascal and Maxwell to come later.

Assets 3

23 Oct 07:37

Nexesenex

v1.77005_b3962

70df4d7

Croco.Cpp_FrankenFork_v1.77005_b3962 Pre-release

Bugfix for 1.77004, with :

K q8_0 V F16 FA quant dropped.
Non FA Quants dropped for now, only K q6_0 V F16 works (and it's the best anyway).
Logic for passing the quant, FA, and no-shift settings fixed.
Updated KLite to 182.

Assets 6

23 Oct 00:28

Nexesenex

v1.77004_b3962

6d818bd

Croco.Cpp_FrankenFork_v1.77004_b3962 Pre-release

New lazy release with :

@ikawrakow's recent work on KV Quants integrated: new KV quant IQ4_NL to replace Q4_0, with about 1% lower PPL when used for both K and V, and Q6_0, which is close to Q8_0 when used as the pair K Q6_0 / V Q5_0.
Use the GUI to discover the new modes.
Some of Ikawrakow's work on Cuda (on top of some of his work for CPU inference).
Cuda Graph caching PR of Agray3.
Some bugfixes (aka, the bugs created by yours truly). ^^
Note : Llava users, be careful, it might not work or simply crash.

Second release : with the help fixed.

I'll make a longer readme when motivated.

Full Changelog: v1.76005_b3906...v1.77004_b3962

Contributors

ikawrakow

Assets 5

20 Oct 04:15

Nexesenex

v1.77002_b3934

df98d3f

Croco.Cpp_FrankenFork_v1.77002_b3934 Pre-release

KVQ27 (iq4_nl) doesn't work; I'm leaving it in for further testing.
I mapped the deleted KV quants to the closest remaining quant of equal or lower bpw, so existing config files keep working as they are until a stable KVQ cocktail of quants is chosen.
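
The "closest equal or inferior bpw" rule can be sketched as follows. This is an illustration, not the fork's actual code: the function name `fallback_quant`, the bpw table (derived from the usual 32-value ggml block layouts) and the example lists of kept and deleted quants are all assumptions.

```python
# Approximate bits per weight per cache type (assumption, see lead-in).
BPW = {
    "f16": 16.0, "q8_0": 8.5, "q6_0": 6.5, "q5_1": 6.0,
    "q5_0": 5.5, "q4_1": 5.0, "q4_0": 4.5, "iq4_nl": 4.5,
}


def fallback_quant(deleted: str, kept: list[str]) -> str:
    """Pick the kept quant with the highest bpw that does not exceed `deleted`."""
    candidates = [q for q in kept if BPW[q] <= BPW[deleted]]
    if not candidates:
        raise ValueError(f"no kept quant at or below {BPW[deleted]} bpw")
    # max() returns the first maximum, so ties are resolved by order in `kept`.
    return max(candidates, key=lambda q: BPW[q])


if __name__ == "__main__":
    kept = ["f16", "q8_0", "q6_0", "q5_0", "iq4_nl"]  # hypothetical selection
    for deleted in ["q5_1", "q4_1", "q4_0"]:          # hypothetical deletions
        print(f"{deleted} -> {fallback_quant(deleted, kept)}")
```

Never rounding up keeps an old config from using more VRAM than it did before the quant was removed, which is presumably why the lower side is preferred.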

Assets 4

14 Oct 22:29

Nexesenex

v1.76007_b3917

4cdccce

Croco.Cpp_FrankenFork_v1.76007_b3917

v1.76007_b3917

Assets 5
