
concatenate the conditional and unconditional inputs to speed inference #4

Open
jakob1519 opened this issue Jan 6, 2023 · 4 comments

@jakob1519

Hello,

I want to ask about this code in diffuser.py. Why does it speed up inference? Could you explain it to me?

nn_inputs = [np.vstack([x_t, x_t]),
             np.vstack([noise_in, noise_in]),
             np.vstack([label, label_empty_ohe])]
@apapiu (Owner) commented Jan 6, 2023

Hey! The speedup happens in the next line: x0_pred = self.denoiser.predict(nn_inputs, batch_size=self.batch_size). Here we only have to call .predict once on the concatenated matrix, which is faster than calling .predict twice, once on the conditional and once on the unconditional inputs.
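
For readers following along, here is a minimal standalone sketch of the pattern (self. is dropped; the names denoiser, x_t, noise_in, label, label_empty_ohe, and batch_size follow the snippet above and are otherwise assumptions):

import numpy as np

# Slow version: one forward pass per conditioning mode.
# x0_pred_label = denoiser.predict([x_t, noise_in, label], batch_size=batch_size)
# x0_pred_no_label = denoiser.predict([x_t, noise_in, label_empty_ohe], batch_size=batch_size)

# Fast version: stack the conditional and unconditional inputs along the
# batch axis and run a single forward pass over the doubled batch.
nn_inputs = [np.vstack([x_t, x_t]),
             np.vstack([noise_in, noise_in]),
             np.vstack([label, label_empty_ohe])]
x0_pred = denoiser.predict(nn_inputs, batch_size=batch_size)

# The first half of the rows is the conditional prediction,
# the second half the unconditional one.
n = x_t.shape[0]
x0_pred_label, x0_pred_no_label = x0_pred[:n], x0_pred[n:]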

@jakob1519 (Author)

Thank you! I got it!

I also have a question about this part of the code in diffuser.py; I don't understand what it does. What is the difference between x0_pred_label and x0_pred_no_label?

# classifier free guidance:
x0_pred = self.class_guidance * x0_pred_label + (1 - self.class_guidance) * x0_pred_no_label

if self.perc_thresholding:
    # clip the prediction using dynamic thresholding a la Imagen:
    x0_pred = dynamic_thresholding(x0_pred, perc=self.perc_thresholding)

@apapiu (Owner) commented Jan 8, 2023

x0_pred_label is the prediction conditioned on the text embedding, and x0_pred_no_label is the unconditional prediction (where the text embedding input is set to zero).
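
Continuing the sketch from above (x0_pred_label and x0_pred_no_label come from splitting the doubled batch; the zeroed label_empty_ohe and the percentile-based dynamic_thresholding are assumptions about what the repo does, in the spirit of the Imagen paper, not necessarily the exact implementation here):

import numpy as np

def dynamic_thresholding(x0_pred, perc=99.5):
    # Assumed Imagen-style implementation: clip each image to the perc-th
    # percentile of its absolute pixel values, then rescale so saturated
    # pixels are pulled back toward the valid range.
    s = np.percentile(np.abs(x0_pred), perc, axis=(1, 2, 3), keepdims=True)
    s = np.maximum(s, 1.0)
    return np.clip(x0_pred, -s, s) / s

# The "empty" label is all zeros, so the network gets no class signal:
label_empty_ohe = np.zeros_like(label)

# classifier-free guidance: with class_guidance > 1 this extrapolates away
# from the unconditional prediction, strengthening the conditioning.
x0_pred = class_guidance * x0_pred_label + (1 - class_guidance) * x0_pred_no_label
x0_pred = dynamic_thresholding(x0_pred, perc=99.5)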

@jakob1519 (Author)

Got it! Thank you!
