Replies: 4 comments 3 replies
-
author of this post is @Beinsezii |
Beta Was this translation helpful? Give feedback.
-
cc @pcuenca here too! I think he initially added |
Beta Was this translation helpful? Give feedback.
-
The first good step, to me, seems to be documentation refactoring. I believe that will clear out a lot of confusion. |
Beta Was this translation helpful? Give feedback.
-
Is I could just |
Beta Was this translation helpful? Give feedback.
-
1.
The official SAI config for SDXL has
set_alpha_to_one: False
despite using EulerDiscreteSchedulerSo if you inherit the config such as
It'll inherit the
set_alpha_to_one=False
value which is normally enabled by default in Diffusers.2.
The actual diffusers documentation incorrectly specifies an
set_alpha_to_one
value for Euler, DDPM...And probably more, as the documentation for the
step_offset
kwarg is blindly copy-pasted across all the schedulers that have the option, explicitly stating thatHowever, a cursory glance shows that this basically only applies to DDIM and maybe a few others. Euler, DDPM, DPM Multistep, etc. all do not contain a
set_alpha_to_one
kwarg so the value is silently dropped until someone inherits the configuration later.3.
Why recommend
steps_offset=1
andset_alpha_to_one=False
?The documentation implies that
step_offset=1
andset_alpha_to_one=True
are contradictory solutions to a final timestep problem, however (on SDXL) it seems to not be the case at all.steps_offset=1
changes the image very slightly andset_alpha_to_one=False
just adds residual noise regardless of the offset as the final cumprod is clamped down to the maximum index which is something like 0.997 instead of 1.0Additionally,
steps_offset=1
doesn't even apply to thetrailing
timestep spacing that most of the major UIs use nowadays, including ComfyUI which is more or less SAI's reference implementation based on their usage in Discord. Withleading
the difference is still so small I'm wondering what the point even is. Maybe it's only useful for the older SD pipelines?The following figure of SDXL-Base images contains two rows:
leading
andtrailing
timesteps with the following columnsDDIMScheduler(steps_offset=0, set_alpha_to_one=False)
DDIMScheduler(steps_offset=1, set_alpha_to_one=False)
DDIMScheduler(steps_offset=0, set_alpha_to_one=True)
DDIMScheduler(steps_offset=1, set_alpha_to_one=True)
EulerDiscreteScheduler(steps_offset=0)
EulerDiscreteScheduler(steps_offset=1)
It's extremely obvious that the issue plaguing @bghira and myself is the
set_alpha_to_one=False
being inherited from the XL base config. Additionally, I'm not convinced thatsteps_offset=1
is ever exclusively preferable as a replacement forset_alpha_to_one=True
(or even at all, really), as it performs identically regardless of whether the final cumrpod is overridden or not.How did we get here?
My theory based on the above is that SAI initially opted to use
DDIMScheduler
for SDXL on Huggingface, as that's the scheduler in their paper. They setsteps_offset=1
andset_alpha_to_one=False
per recommendation from the Diffusers documentation as that's the setup "like in Stable Diffusion", but after dealing with horribly noisy images they just changed the scheduler toEulerDiscreteScheduler
in their config and left the rest as-is because it seemed to work (and the documentation impliesset_alpha_to_one=False
should still be used on Euler despite not existing), so now theset_alpha_to_one=False
left over in their scheduler config will pollute downstream uses that inherit from the XL base config.So, how do we fix? To be honest, I'm not 100% sure.
Recommend
timestep_spacing="trailing"
instead ofsteps_offset=1, set_alpha_to_one=False
?Maybe @patrickvonplaten or @yiyixuxu have a good solution?
No matter what happens with
set_alpha_to_one
andsteps_offset
, the documentation on all the schedulers needs some refactoring. People shouldn't be setting kwargs that don't exist.Beta Was this translation helpful? Give feedback.
All reactions