[bug]: SDXL based V-pred models are treated as epsilon prediction - result noisy images #7495

bWm-nubby · 2024-12-24T01:11:28Z

Is there an existing issue for this problem?

I have searched the existing issues

Operating system

Windows

GPU vendor

Nvidia (CUDA)

GPU model

RTX 3080 12GB

GPU VRAM

12GB

Version number

5.5

Browser

Invoke Community Edition v5.5 launcher / MS Edge 131.0.2903.99 (Official build) (64-bit)

Python dependencies

accelerate==1.0.1

compel==2.0.2

cuda==12.4

diffusers==0.31.0

numpy==1.26.3

opencv==4.9.0.80

onnx==1.16.1

pillow==10.2.0

python==3.11.11

torch==2.4.1+cu124

torchvision==0.19.1+cu124

transformers==4.46.3

xformers==Not Installed

What happened

SDXL-based v_prediction models are automatically assigned the epsilon prediction type, even when the vpred and zsnr state_dict keys are present in the model. Manually changing the prediction type to v_prediction is not respected either. The result is unusably noisy output from these models.
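To illustrate why a mismatched prediction type produces noise, here is a minimal scalar sketch using the standard v-prediction definition (v = alpha * eps - sigma * x0, with alpha**2 + sigma**2 == 1). The numbers are made up for illustration and nothing here is taken from Invoke's code.

```python
# Toy scalar example; all values are illustrative, not taken from any model.
alpha, sigma = 0.8, 0.6          # signal/noise scales, alpha**2 + sigma**2 == 1
x0, eps = 1.0, 0.5               # clean sample and the noise added to it

x_t = alpha * x0 + sigma * eps   # noisy sample the model sees
v = alpha * eps - sigma * x0     # what a v-prediction model outputs

# Correct decoding of a v-prediction output.
x0_from_v = alpha * x_t - sigma * v

# Incorrect decoding: the same output treated as predicted noise (epsilon).
x0_as_eps = (x_t - sigma * v) / alpha

print(x0_from_v, x0_as_eps)
```

Decoding the output as v recovers the clean sample, while decoding the same output as epsilon gives a noticeably different value. Repeated over every denoising step, that error would plausibly show up as the noisy images described above.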

What you expected to happen

v_prediction-based models should have the correct prediction type detected from the state_dict keys within the model metadata. If detection fails because of missing keys or for any other reason, the prediction type the user selects manually under model settings should be respected, producing normal-quality output.
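The expected detection could look roughly like the sketch below. The marker key names `v_pred` and `ztsnr` are assumptions about how such checkpoints are tagged and should be verified against the actual model file. The function name and return shape are hypothetical and not part of Invoke's API.

```python
from typing import Dict, Iterable, Union

# Assumed marker key names; verify against the actual checkpoint.
V_PRED_KEY = "v_pred"
ZSNR_KEY = "ztsnr"


def detect_prediction_settings(state_dict_keys: Iterable[str]) -> Dict[str, Union[str, bool]]:
    """Guess prediction settings from the key names of a checkpoint."""
    keys = set(state_dict_keys)
    is_v_pred = V_PRED_KEY in keys
    return {
        # Fall back to epsilon when no v-prediction marker is found.
        "prediction_type": "v_prediction" if is_v_pred else "epsilon",
        # Only report zsnr for v-prediction checkpoints that carry the marker.
        "zsnr": is_v_pred and ZSNR_KEY in keys,
    }


print(detect_prediction_settings(["v_pred", "ztsnr", "some.unet.weight"]))
print(detect_prediction_settings(["some.unet.weight"]))
```

For a `.safetensors` file, the key names can be listed without loading the tensors by opening it with `safetensors.safe_open(path, framework="pt")` and calling `.keys()`. A user-selected prediction type would then override whatever this returns.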

How to reproduce the problem

1. Download the V-Pred-1.0-Version of the noobai-xl-nai-xl model.
2. Add the model through Invoke-AI's model management UI.
3. The model will be detected as an epsilon prediction model.
4. Change the prediction type manually to v_prediction.
5. Generate an image with the downloaded model.
6. The output will be extremely noisy and low quality.

Additional context

I also tried converting the model to Diffusers format, both before and after manually setting the prediction type, with no change in results. Additionally, Invoke-AI does not appear to have an option to enable zsnr, though that seems to be a missing feature rather than a bug.
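For comparison outside Invoke, the scheduler settings a v-prediction model with zsnr typically needs in the diffusers library can be sketched as a set of config overrides. The helper name is hypothetical. The option names `prediction_type`, `rescale_betas_zero_snr`, and `timestep_spacing` are diffusers scheduler config fields, and pairing zsnr with `"trailing"` spacing is a common recommendation rather than something confirmed for this model.

```python
from typing import Dict, Union


def vpred_scheduler_overrides(zsnr: bool) -> Dict[str, Union[str, bool]]:
    """Build scheduler config overrides for a v-prediction model (hypothetical helper)."""
    overrides: Dict[str, Union[str, bool]] = {
        "prediction_type": "v_prediction",
        # Rescale the beta schedule so the last timestep has zero terminal SNR.
        "rescale_betas_zero_snr": zsnr,
    }
    if zsnr:
        # Assumption: trailing spacing so sampling starts at the final timestep.
        overrides["timestep_spacing"] = "trailing"
    return overrides


print(vpred_scheduler_overrides(zsnr=True))
```

With diffusers installed, these could be applied to a loaded pipeline as `EulerDiscreteScheduler.from_config(pipe.scheduler.config, **vpred_scheduler_overrides(zsnr=True))`. Whether Invoke exposes an equivalent setting is exactly the gap noted above.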

Discord username

bwm_nubby

bWm-nubby added the bug label Dec 24, 2024
