Supporting for Ternary DiT #470

Lucky-Lance · 2024-11-20T07:52:07Z

Hi,

Ternary quantization has become popular and has shown computational speedups and power reductions in projects such as llama.cpp and bitnet.cpp. We trained the first ternary DiT network; DiT is currently a popular architecture for text-to-image generation. We would like to know whether we could get help deploying it on stable-diffusion.cpp.

We asked the llama.cpp maintainers for help, and they advised us to come here for guidance (link).

stduhpf · 2024-11-20T11:38:21Z

I think just updating the ggml submodule to a more recent version should be most of the work.

Lucky-Lance · 2024-11-20T14:15:58Z

Thank you for your suggestion. Updating the ggml submodule to a more recent version sounds like a good starting point. However, I must admit that I have very limited experience with writing kernel code 😵.

Green-Sky · 2024-11-20T14:23:35Z

> We trained the first ternary DiT network

There has been one for a while that uses a categorical classifier. Do you mean embedding based?

Here: #331

Edit: oh, it's you. Haha

Lucky-Lance · 2024-11-20T14:36:38Z

😇👀

Green-Sky · 2024-11-20T15:01:38Z

@stduhpf I will try to make a PR to update to the latest, or at least a newer, ggml. We can then try to do some stuff based on that.

@Lucky-Lance Why did you use labels and not embedding(s) for the classifier? This makes it somewhat unusable for text-to-image.
I love your work, however <3.

Are there any plans to "distil" something like Flux schnell, i.e. training a new TerDiT on its outputs?
Or something embedding-based?

Lucky-Lance · 2024-11-20T15:10:44Z

Label-based generation was just an attempt I made previously. In fact, I've always wanted to work on a text-to-image model, but the actual deployment only resulted in reduced memory usage without improving inference speed. This has made me less confident about further pursuing text-to-image models. If I receive support, I would certainly train a text-to-image model afterwards.

Thanks a lot for your support 🤩🥳.

Lucky-Lance · 2024-11-22T08:00:08Z

I noticed you're facing some problems while upgrading ggml. :( Just checking in to see if you're still planning to support it, and if so, can it be completed within one or two months..?

Green-Sky · 2024-11-22T10:23:07Z

Well, it all depends on the individual's motivation and time, so no promises. 😅

That being said, after updating ggml, I ran a test where I quantized Flux to tq1_0/tq2_0 (5 weights/byte and 4 weights/byte), and it runs. On CPU only. And it produces noise. So it might or might not work.
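For readers wondering where the 5 and 4 weights per byte come from, here is a minimal Python sketch. It is illustrative only and is not the actual ggml TQ1_0/TQ2_0 block layout (the real formats also store a scale per block and order values differently); the function names are hypothetical. The idea: a ternary weight in {-1, 0, +1} fits in 2 bits, so 4 fit in a byte, and since 3**5 = 243 is at most 256, 5 of them can be stored in one byte as a base-3 number.

```python
# Ternary weights take values in {-1, 0, +1}.
# Illustrative packing only; NOT the real ggml TQ1_0/TQ2_0 layout.

def pack_2bit(weights):
    """Pack 4 ternary weights per byte, 2 bits each (0 -> -1, 1 -> 0, 2 -> +1)."""
    assert len(weights) % 4 == 0
    out = bytearray()
    for i in range(0, len(weights), 4):
        byte = 0
        for j, w in enumerate(weights[i:i + 4]):
            byte |= (w + 1) << (2 * j)  # shift each weight into its own 2-bit slot
        out.append(byte)
    return bytes(out)

def unpack_2bit(data):
    """Inverse of pack_2bit."""
    return [((b >> (2 * j)) & 0b11) - 1 for b in data for j in range(4)]

def pack_base3(weights):
    """Pack 5 ternary weights per byte as one base-3 number (max 242)."""
    assert len(weights) % 5 == 0
    out = bytearray()
    for i in range(0, len(weights), 5):
        value = 0
        # Process in reverse so the first weight ends up as the lowest digit.
        for w in reversed(weights[i:i + 5]):
            value = value * 3 + (w + 1)
        out.append(value)
    return bytes(out)

def unpack_base3(data):
    """Inverse of pack_base3."""
    weights = []
    for b in data:
        for _ in range(5):
            weights.append(b % 3 - 1)  # lowest base-3 digit first
            b //= 3
    return weights
```

With 20 weights, `pack_2bit` produces 5 bytes (2 bits per weight) and `pack_base3` produces 4 bytes (1.6 bits per weight), before any per-block scale is added.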

I will probably continue updating ggml and adapting sd.cpp to the code changes, before trying any architectural stuff.
Maybe @stduhpf wants to take a stab at it, while I do that?

Green-Sky · 2024-11-22T10:45:04Z

This is what flux schnell with tq1_0/tq2_0 looks like:

(both are identical, which is a good sign)

Lucky-Lance · 2024-11-22T10:45:04Z

Oh, truly grateful for your efforts! 😆 Hoping everything goes smoothly.

Green-Sky · 2024-11-22T10:48:40Z

Link to the "quantization" PR in llama.cpp that added tq1/2: ggerganov/llama.cpp#8151

Green-Sky · 2024-11-22T11:04:29Z

Another thing that I'll leave for the future is looking into ik's fork, which has better BitNet support: https://github.com/ikawrakow/ik_llama.cpp

Lucky-Lance · 2024-12-20T04:49:09Z

Hi, a month has slipped away, and I was wondering if the support is still part of the plan 😌

stduhpf · 2024-12-20T15:46:42Z

Ternary data types are now supported, which means that, in theory, any model with the same overall architecture as a supported model such as SD3 or Flux, but trained in ternary, would work.
If the architecture is different, then more work is required.

Green-Sky · 2024-12-20T19:43:41Z

Haven't had time to work on sd.cpp this month, sorry.

Yeah, the BitNets have extra normalization layers in places.
So the real pain point here is that the model is conditioned on labels rather than on a text embedding, ideally one that is already implemented.

Lucky-Lance · 2024-12-21T00:47:26Z

OK I will give it a try 😆
