Return to course: Stable Diffusion – Level 2
Stable Diffusion Art
Quiz – How Stable Diffusion Works 2
What mechanism allows the text prompt to control image generation?
VAE
U-Net
Conditioning
Diffusion
Which part of the model converts the text prompt to a numerical representation?
Tokenizer
Text transformer
U-Net
VAE
What does CLIP do?
Denoises the image.
Converts the image from the pixel space to the latent space.
Converts the image from the latent space to the pixel space.
Converts the text prompt to embeddings.
What is a noise schedule?
The number of sampling steps.
The target amount of noise at each step.
The total amount of noise added in each generation.
The noise injected by the attention layers.
What is the effect of increasing the CFG scale?
Higher quality images.
Requires fewer sampling steps.
A higher initial amount of noise is injected.
The images follow the prompt more closely.
What does the SDXL model consist of?
The base model
The base and enhancer models
The base and refiner models
The base and finisher models