Quiz – text-to-image workflow in ComfyUI

Drag and drop the items into the correct order.

The CLIP model encodes the text prompts into embeddings
The VAE decoder converts the latent image into a pixel image
The random latent image is denoised, conditioned by the prompt
The checkpoint model is loaded
The image is saved to your local storage