Sana checkpoint trained with SD-3 VAE #70

srikarym · 2024-12-04T23:03:12Z

Hi,
Thank you for open-sourcing your code and trained models. Could you release the Sana text2image model trained with either SD-XL or SD-3 VAE?

recoilme · 2024-12-05T11:58:49Z

Yes, DC-AE is fast, but it ruins image details. We also tried f64, with the same result :( We tested realistic and anime images: https://imgsli.com/MzI0MDg3

Maybe take a look at the AuraFlow VAE? It's open source as far as I know, comparable to the Flux/SD3 VAEs, and significantly better than DC-AE.

Auraflow vae

DC-AE

Or maybe train a small model that converts DC-AE latents to AuraFlow latents directly in latent space. What do you think, is this possible?

srikarym · 2024-12-05T16:04:40Z

Makes sense. It's surprising that Sana obtains better FID scores with this VAE, despite worse reconstruction results.

From the paper:
although AE-F8C16 exhibits the best reconstruction ability (rFID: F8C16<F16C32<F32C32), we empirically find that the generation results of F32C32 are superior

I wish they'd release checkpoints trained with other VAEs, allowing users to choose the one that works best for their specific dataset when fine-tuning.
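For anyone reading along, here is a rough sketch of what the F/C naming in the quoted passage means in terms of latent size. It assumes F is the spatial downsampling factor and C is the number of latent channels; the helper names `latent_shape` and `compression_ratio` and the 1024x1024 input are just for illustration, not anything from the Sana codebase.

```python
def latent_shape(height, width, factor, channels):
    """Return the (C, H/F, W/F) latent shape for an autoencoder named F{factor}C{channels}."""
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by the downsampling factor")
    return (channels, height // factor, width // factor)


def compression_ratio(height, width, factor, channels):
    """Ratio of RGB pixel values to latent values (higher means more compression)."""
    c, h, w = latent_shape(height, width, factor, channels)
    return (3 * height * width) / (c * h * w)


if __name__ == "__main__":
    # The three autoencoder configs named in the quoted passage.
    configs = {"F8C16": (8, 16), "F16C32": (16, 32), "F32C32": (32, 32)}
    for name, (factor, channels) in configs.items():
        shape = latent_shape(1024, 1024, factor, channels)
        ratio = compression_ratio(1024, 1024, factor, channels)
        print(f"{name}: latent shape {shape}, compression {ratio:.0f}x")
```

Under these assumptions F32C32 squeezes the image much harder than F8C16, which fits with it having the worst reconstruction of the three.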
