Beginner’s Guide to ComfyUI

Updated Categorized as Tutorial Tagged , 27 Comments on Beginner’s Guide to ComfyUI
ComfyUI.
What you would look like after using ComfyUI for real.

ComfyUI is a node-based GUI for Stable Diffusion. This tutorial is for someone who hasn’t used ComfyUI before. I will covers

  • Text-to-image
  • Image-to-image
  • SDXL workflow
  • Inpainting
  • Using LoRAs
  • ComfyUI Manager – managing custom nodes in GUI.
  • Impact Pack – a collection of useful ComfyUI nodes.

Table of Contents

What is ComfyUI?

ComfyUI is a node-based GUI for Stable Diffusion. You can construct an image generation workflow by chaining different blocks (called nodes) together.

Some commonly used blocks are Loading a Checkpoint Model, entering a prompt, specifying a sampler, etc. ComfyUI breaks down a workflow into rearrangeable elements so you can easily make your own.

Installing ComfyUI

You will need a working ComfyUI to follow this guide.

See the installation guide for installing locally on Mac or Windows.

You can also use my ComfyUI notebook on Google Colab for on-demand hardware.

Check out Think Diffusion for a fully managed ComfyUI online service. They offer 20% extra credits to our readers. (and a small commission to support this site if you sign up)

ComfyUI vs AUTOMATIC1111

AUTOMATIC1111 is the de facto GUI for Stable Diffusion.

Should you use ComfyUI instead of AUTOMATIC1111? Here’s a comparison.

The benefits of using ComfyUI are:

  1. Lightweight: it runs fast.
  2. Flexible: very configurable.
  3. Transparent: The data flow is in front of you.
  4. Easy to share: Each file is a reproducible workflow.
  5. Good for prototyping: Prototyping with a graphic interface instead of coding.

The drawbacks of using ComfyUI are:

  1. Inconsistent interface: Each workflow may place the nodes differently. You need to figure out what to change.
  2. Too much detail: Average users don’t need to know how things are wired under the hood. (Isn’t it the whole point of using a GUI?)

Where to start?

The best way to learn ComfyUI is by going through examples. So, we will learn how to do things in ComfyUI in the simplest text-to-image workflow.

We will go through some basic workflow examples. After studying some essential ones, you will start to understand how to make your own.

At the end of this tutorial, you will have the opportunity to make a pretty involved one. The answer will be provided.

Basic controls

Use the mouse wheel or two-finger pinch to zoom in and out.

Drag and hold the dot of the input or output to form a connection. You can only connect between input and output of the same type.

Hold and drag with the left click to move around the workspace.

Press Ctrl-0 (Windows) or Cmd-0 (Mac) to show the Queue panel.

Text-to-image

Let’s first go through the simplest case: generating an image from text.

Classical, right?

By going through this example, you will also learn the idea before ComfyUI (It’s very different from Automatic1111 WebUI). As a bonus, you will know more about how Stable Diffusion works!

Generating your first image on ComfyUI

After starting ComfyUI for the very first time, you should see the default text-to-image workflow. It should look like this:

If this is not what you see, click Load Default on the right panel to return this default text-to-image workflow.

If you don’t see the right panel, press Ctrl-0 (Windows) or Cmd-0 (Mac).

You will see the workflow is made with two basic building blocks: Nodes and edges.

Nodes are the rectangular blocks, e.g., Load Checkpoint, Clip Text Encoder, etc. Each node executes some code. If you have some programming experience, you can think of them as functions. Each node needs three things

  • Inputs are the texts and dots on the left that the wires come in.
  • Outputs are the texts and dots on the right the wires go out.
  • Parameters are the fields at the center of the block.

Edges are the wires connecting the outputs and the inputs between nodes.

That’s the whole idea! The rest are details.

Don’t worry if the jargon on the nodes looks daunting. We will walk through a simple example of using ComfyUI, introduce some concepts, and gradually move on to more complicated workflows.

Below is the simplest way you can use ComfyUI. You should be in the default workflow.

1. Selecting a model

First, select a Stable Diffusion Checkpoint model in the Load Checkpoint node. Click on the model name to show a list of available models.

If the node is too small, you can use the mouse wheel or pinch with two fingers on the touchpad to zoom in and out.

If clicking the model name does nothing, you may not have installed a model or configured it to use your existing models in A1111. Go back to the installation guide to fix it first.

2. Enter a prompt and a negative prompt

You should see two nodes labeled CLIP Text Encode (Prompt). Enter your prompt in the top one and your negative prompt in the bottom one.

The CLIP Text Enode node first converts the prompt into tokens and then encodes them into embeddings with the text encoder.

You can use the syntax (keyword:weight) to control the weight of the keyword. E.g. (keyword:1.2) to increase its effect. (keyword:0.8) to decrease its effect.

Why is the top one the prompt? Look at the CONDITIONING output. It is connected to the positive input of the KSampler node. The bottom one is connected to the negative, so it is for the negative prompt.

3. Generate an image

Click Queue Prompt to run the workflow. After a short wait, you should see the first image generated.

What has just happened?

The advantage of using ComfyUI is that it is very configurable. It is worth learning what each node does so you can use them to suit your needs.

You can skip the rest of this section if you are not interested in the theory.

Load Checkpoint node

Use the Load Checkpoint node to select a model. A Stable Diffusion model has three main parts:

  1. MODEL: The noise predictor model in the latent space.
  2. CLIP: The language model preprocesses the positive and the negative prompts.
  3. VAE: The Variational AutoEncoder converts the image between the pixel and the latent spaces.

The MODEL output connects to the sampler, where the reverse diffusion process is done.

The CLIP output connects to the prompts because the prompts need to be processed by the CLIP model before they are useful.

In text-to-image, VAE is only used in the last step: Converting the image from the latent to the pixel space. In other words, we are only using the decoder part of the autoencoder.

CLIP Text Encode

The CLIP text encode node gets the prompt and feeds it into the CLIP language model. CLIP is OpenAI’s language model, transforming each word in a prompt into embeddings.

Empty latent image

A text-to-image process starts with a random image in the latent space.

The size of the latent image is proportional to the actual image in the pixel space. So, if you want to change the size of the image, you change the size of the latent image.

You set the height and the width to change the image size in pixel space.

Here, you can also set the batch size, which is how many images you generate in each run.

KSampler

KSampler is at the heart of image generation in Stable Diffusion. A sampler denoises a random image into one that matches your prompt.

KSampler refers to samplers implemented in this code repository.

Here are the parameters in the KSampler node.

  • Seed: The random seed value controls the initial noise of the latent image and, hence, the composition of the final image.
  • Control_after_generation: How the seed should change after each generation. It can either be getting a random value (randomize), increasing by 1 (increment), decreasing by 1 (decrement), or unchanged (fixed).
  • Step: Number of sampling steps. The higher, the fewer artifacts in the numerical process.
  • Sampler_name: Here, you can set the sampling algorithm. Read the sampler article for a primer.
  • Scheduler: Controls how the noise level should change in each step.
  • Denoise: How much of the initial noise should be erased by the denoising process. 1 means all.

Image-to-image workflow

The Img2img workflow is another staple workflow in Stable Diffusion. It generates an image based on the prompt AND an input image.

You can adjust the denoising strength to control how much Stable Diffusion should follow the base image.

Download the image-to-image workflow

Drag and drop this workflow image to ComfyUI to load.

comfyUI img2img workflow.

To use this img2img workflow:

  1. Select the checkpoint model.
  2. Revise the positive and the negative prompts.
  3. Optionally adjust the denoise (denoising strength) in the KSampler node.
  4. Press Queue Prompt to start generation.

ComfyUI Manager

ComfyUI manager is a custom node that lets you install and update other custom nodes through the ComfyUI interface.

Install ComfyUI Manager on Windows

In the File Explorer App, navigate to the folder ComfyUI_windows_portable > ComfyUI > custom_nodes.

In the address bar, type cmd and press Enter.

A command prompt terminal should come up.

Type the following command and press Enter.

git clone https://github.com/ltdrdata/ComfyUI-Manager

Wait for it to complete.

Restart ComfyUI. You should see the new Manager button on the floating panel.

Tip: If ComfyUI manager doesn’t show up, read the error message in the terminal.

A common error mode is GIT (A source code management system) not installed in your system. Installing it should resolve the issue.

Install ComfyUI Manager on Mac

To install ComfyUI Manager, go to the custom nodes folder Terminal (Mac) App:

cd ComfyUI/custom_nodes

And clone the node to your local storage.

git clone https://github.com/ltdrdata/ComfyUI-Manager

Restart ComfyUI completely.

Using ComfyUI Manager

After the installation, you should see an extra Manager button on the Queue Prompt menu.

If you don’t see

Clicking it shows a GUI that lets you

The Install Missing Nodes function is especially useful for finding what custom nodes that are required in the current workflow.

The Install Custom Nodes menu lets you manage custom nodes. You can uninstall or disable an installed node or install a new one.

ComfyUI manager.

How to install missing custom nodes

You may not have installed all the custom nodes you need in a workflow. After loading the workflow file, perform the following steps to install missing custom nodes.

  1. Click Manager in the Menu.
  2. Click Install Missing custom Nodes.
  3. Restart ComfyUI completely.

ComfyUI Update All

The simplest way to update ComfyUI is to click the Update All button in ComfyUI manager. It will update ComfyUI itself and all custom nodes installed. Restart ComfyUI to complete the update.

Follow the following update steps if you want to update ComfyUI or the custom nodes independently.

How to update ComfyUI

To update ComfyUI:

  1. Click Manager in the Menu.
  2. Click Update ComfyUI.
  3. Restart ComfyUI completely.

How to update custom nodes

You can use ComfyUI manager to update custom nodes.

  1. Click Manager in the Menu.
  2. Click Fetch Updates. It may take a while to be done.
  3. Click Install Custom nodes.
  4. If an update is available, a new Update button will appear next to an installed custom node.
  5. Click Update to update the node.

6. Restart ComfyUI.

If this update process doesn’t work, you will need to use a terminal such as the PowerShell app (Windows) or the Terminal app (Mac) to do a git pull.

  1. Open the terminal.
  2. cd to the custom node’s directory. Below is an example of going into the controlnet aux directory.
 cd ComfyUI/custom_nodes/comfyui_controlnet_aux/

3. Do a git pull.

git pull

4. Restart ComfyUI.

Search custom nodes

The Add Node menu may not be the best way to find a custom node. Things only get harder after you install many custom nodes.

You can double-click any empty area to bring up a menu to search for nodes.

Upscaling

There are several ways to upscale in Stable Diffusion. For teaching purposes, let’s go through upscaling with

  1. an AI upscaler
  2. Hi-res fix
  3. Ultimate Upscale

AI upscale

An AI upscaler is an AI model for enlarging images while filling in details. They are not Stable Diffusion models but neural networks trained for enlarging images.

Load this upscaling workflow by first downloading the image on the page. Drag and drop the image to ComfyUI.

Tip: Dragging and dropping an image made with ComfyUI loads the workflow that produces it.

AI upscaler in the upscaling workflow in ComfyUI.

In this basic example, you see the only additions to text-to-image are

  • Load Upscale Model: This is for loading an AI upscaler model. (This node is in Add node > loaders)
  • Upscale image(using Model): The node now sits between the VAE decoder and the Save image node. It takes the image and the upscaler model. And outputs an upscaled image. (This node is in Add node > Image > upscaling)

To use this upscaler workflow, you must download an upscaler model from the Upscaler Wiki, and put it in the folder models > upscale_models.

Alternatively, set up ComfyUI to use AUTOMATIC1111’s model files.

Select an upscaler and click Queue Prompt to generate an upscaled image. The image should have been upscaled 4x by the AI upscaler.

Exercise: Recreate the AI upscaler workflow from text-to-image

It is a good exercise to make your first custom workflow by adding an upscaler to the default text-to-image workflow.

  1. Get back to the basic text-to-image workflow by clicking Load Default.

2. Right-click an empty space near Save Image. Select Add Node > loaders > Load Upscale Model.

3. Click on the dot on the wire between VAE Decode and Save Image. Click Delete to delete the wire.

4. Right-click on an empty space and select Add Node > image > upscaling > Upscale Image (using Model) to add the new node.

5. Drag and hold the UPSCALE_MODEL output of Load Upscale Model. Drop it at upscale_model of the Upscale Image (using Model) node.

6. Drag and hold the IMAGE output of the VAE Decode. Drop it at the image input of the Upscale Image (using Model).

7. Drag and hold the IMAGE output of the Upscale Image (uisng Model) node. Drop it at the images input of the Save Image node.

8. Click Queue Prompt to test the workflow.

Now you know how to make a new workflow. This skill comes in handy to make your own workflows.

Hi-res fix

Download the first image on this page and drop it in ComfyUI to load the Hi-Res Fix workflow.

This is a more complex example but also shows you the power of ComfyUI. After studying the nodes and edges, you will know exactly what Hi-Res Fix is.

The first part is identical to text-to-image: You denoise a latent image using a sampler, conditioned with your positive and negative prompts.

The workflow then upscales the image in the latent space and performs a few additional sampling steps. It adds some initial noise to the image and denoises it with a certain denoising strength.

The VAE decoder then decodes the larger latent image to produce an upscaled image.

SD Ultimate upscale – ComfyUI edition

SD Ultimate upscale is a popular upscaling extension for AUTOMATIC1111 WebUI. You can use it on ComfyUI too!

Github Page of SD Ultimate upscale for ComfyUI

This is also a good exercise for installing a custom node.

Installing the SD Ultimate upscale node

To install this custom node, go to the custom nodes folder in the PowerShell (Windows) or Terminal (Mac) App:

cd ComfyUI/custom_nodes

And clone the node to your local storage.

git clone https://github.com/ssitu/ComfyUI_UltimateSDUpscale --recursive

Restart ComfyUI completely.

Using SD Ultimate upscale

A good exercise is to start with the AI upscaler workflow. Add SD Ultimate Upscale and compare the result.

Load the AI upscaler workflow by dragging and dropping the image to ComfyUI or using the Load button to load.

Right-click on an empty space. Select Add Node > image > upscaling > Ultimate SD Upscale.

You should see the new node Ultimate SD Upscale. Wire up its input as follows.

  • image to VAE Decode’s IMAGE.
  • model to Load Checkpoint’s MODEL.
  • positive to CONDITIONING of the positive prompt box.
  • negative to CONDITIONING of the negative prompt box.
  • vae to Load Checkpoint’s VAE.
  • upscale_model to Load Upscale Model’s UPSCALE_MODEL.

For the output:

  • IMAGE to Save Image’s images.

If they are wired correctly, clicking Queue Prompt should show two large images, one with the AI upscaler and the other with Ultimate Upscale.

You can download this workflow example below. Drag and drop the image to ComfyUI to load.

ComfyUI Inpainting

You can use ComfyUI for inpainting. It is a basic technique to regenerate a part of the image.

I have to admit that inpainting is not the easiest thing to do with ComfyUI. But here you go…

Step 1: Open the inpaint workflow

Download the following inpainting workflow. Drag and drop the JSON file to ComfyUI.

Step 2: Upload an image

Pick an image that you want to inpaint.

Andy Lau is ready for inpainting.

You can download the image in PNG format here.

Upload it to the workflow.