Sana checkpoint trained with SD-3 VAE #70

Open
srikarym opened this issue Dec 4, 2024 · 2 comments

Comments

srikarym commented Dec 4, 2024

Hi,
Thank you for open-sourcing your code and trained models. Could you release the Sana text2image model trained with either SD-XL or SD-3 VAE?

recoilme commented Dec 5, 2024

Yes, DC-AE is fast, but it ruins image details. We also tried f64, with the same result. We tested on realistic and anime images: https://imgsli.com/MzI0MDg3

Maybe take a look at the AuraFlow VAE? As far as I know it's open source, comparable to the FLUX/SD3 VAEs, and significantly better than DC-AE.

AuraFlow VAE: [image _auraflow]

DC-AE: [image _auraf64]

Or maybe train a small model that converts DC-AE latents to AuraFlow latents directly in latent space. What do you think, is this possible?

srikarym (Author) commented Dec 5, 2024

Makes sense. It's surprising that Sana obtains better FID scores with this VAE, despite worse reconstruction results.

From the paper:
although AE-F8C16 exhibits the best reconstruction ability (rFID: F8C16<F16C32<F32C32), we empirically find that the generation results of F32C32 are superior

I wish they'd release checkpoints trained with other VAEs, allowing users to choose the one that works best for their specific dataset when fine-tuning.
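
For anyone comparing the autoencoders named in the quote, here is a minimal sketch of what the names imply for latent size. It assumes the usual reading of the convention, where `F` is the spatial downsampling factor and `C` is the number of latent channels. The helper names `latent_shape` and `compression_ratio` are hypothetical, made up for this illustration, and are not part of the Sana or DC-AE code.

```python
import re
from typing import NamedTuple


class LatentShape(NamedTuple):
    channels: int
    height: int
    width: int


def latent_shape(ae_name: str, image_height: int, image_width: int) -> LatentShape:
    """Parse a name such as 'F32C32' and return the implied latent shape.

    Assumption: F = spatial downsampling factor, C = latent channels.
    """
    match = re.fullmatch(r"F(\d+)C(\d+)", ae_name)
    if match is None:
        raise ValueError(f"unrecognized autoencoder name: {ae_name!r}")
    factor, channels = int(match.group(1)), int(match.group(2))
    if image_height % factor or image_width % factor:
        raise ValueError("image size must be divisible by the downsampling factor")
    return LatentShape(channels, image_height // factor, image_width // factor)


def compression_ratio(ae_name: str, image_height: int, image_width: int,
                      image_channels: int = 3) -> float:
    """Ratio of input image values to latent values."""
    shape = latent_shape(ae_name, image_height, image_width)
    image_values = image_channels * image_height * image_width
    latent_values = shape.channels * shape.height * shape.width
    return image_values / latent_values


if __name__ == "__main__":
    # Compare the three configurations from the quote at 1024x1024.
    for name in ("F8C16", "F16C32", "F32C32"):
        shape = latent_shape(name, 1024, 1024)
        positions = shape.height * shape.width
        ratio = compression_ratio(name, 1024, 1024)
        print(f"{name}: latent={tuple(shape)} positions={positions} ratio={ratio:.0f}x")
```

Under these assumptions, at 1024x1024 the F32C32 latent has 16 times fewer spatial positions than F8C16 (32x32 versus 128x128), which is the trade-off the quoted passage weighs against reconstruction quality.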
