Beginner’s Guide to ComfyUI

What you would look like after using ComfyUI for real.

ComfyUI is a node-based GUI for Stable Diffusion. This tutorial is for someone who hasn’t used ComfyUI before. It covers:

  • Text-to-image
  • Image-to-image
  • SDXL workflow
  • Inpainting
  • Using LoRAs
  • ComfyUI Manager – managing custom nodes in GUI.
  • Impact Pack – a collection of useful ComfyUI nodes.

See this post for a guide to installing ComfyUI.

What is ComfyUI?

ComfyUI is a node-based GUI for Stable Diffusion. You can construct an image generation workflow by chaining different blocks (called nodes) together.

Some commonly used blocks are for loading a checkpoint model, entering a prompt, and specifying a sampler. ComfyUI breaks down a workflow into rearrangeable elements so you can easily make your own.

ComfyUI vs AUTOMATIC1111

AUTOMATIC1111 is the de facto GUI for Stable Diffusion.

Should you use ComfyUI instead of AUTOMATIC1111? Here’s a comparison.

The benefits of using ComfyUI are:

  1. Lightweight: it runs fast.
  2. Flexible: very configurable.
  3. Transparent: The data flow is in front of you.
  4. Easy to share: Each file is a reproducible workflow.
  5. Good for prototyping: Prototyping with a graphic interface instead of coding.

The drawbacks of using ComfyUI are:

  1. Inconsistent interface: Each workflow may place the nodes differently. You need to figure out what to change.
  2. Too much detail: Average users don’t need to know how things are wired under the hood. (Isn’t that the whole point of using a GUI?)
  3. Lack of inpainting tool: Inpainting must be done with an external program.

Where to start?

The best way to learn ComfyUI is by going through examples, so we will start with the simplest text-to-image workflow.

We will go through some basic workflow examples. After studying some essential ones, you will start to understand how to make your own.

At the end of this tutorial, you will have the opportunity to make a pretty involved one. The answer will be provided.

Basic controls

Use the mouse wheel or two-finger pinch to zoom in and out.

Click and drag from the dot of an input or output to form a connection. You can only connect an output to an input of the same type.

Hold and drag with the left click to move around the workspace.

Press Ctrl-0 (Windows) or Cmd-0 (Mac) to show the Queue panel.

Text-to-image

Let’s first go through the simplest case: generating an image from text.

Classical, right?

By going through this example, you will also learn the idea behind ComfyUI (it’s very different from the AUTOMATIC1111 WebUI). As a bonus, you will know more about how Stable Diffusion works!

Generating your first image on ComfyUI

After starting ComfyUI for the very first time, you should see the default text-to-image workflow. It should look like this:

If this is not what you see, click Load Default on the right panel to return to the default text-to-image workflow.

If you don’t see the right panel, press Ctrl-0 (Windows) or Cmd-0 (Mac).

You will see the workflow is made with two basic building blocks: Nodes and edges.

Nodes are the rectangular blocks, e.g., Load Checkpoint, CLIP Text Encode, etc. Each node executes some code. If you have some programming experience, you can think of them as functions. Each node has three things:

  • Inputs are the texts and dots on the left where the wires come in.
  • Outputs are the texts and dots on the right where the wires go out.
  • Parameters are the fields at the center of the block.

Edges are the wires connecting the outputs and the inputs between nodes.

That’s the whole idea! The rest are details.
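
To make the node-and-edge idea concrete, here is a rough sketch of the default text-to-image workflow written as a Python dictionary, in the style of the JSON that ComfyUI uses when a workflow is exported in API format. The node ids, the checkpoint file name, and the prompt texts are placeholders made up for illustration, so check the exact field names against your own export before relying on them.

```python
# Each key is a node id. "inputs" holds both parameters (plain values)
# and edges, written as a [source_node_id, output_index] pair.
workflow = {
    "4": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "your_model.safetensors"}},  # placeholder name
    "5": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512, "batch_size": 1}},
    "6": {"class_type": "CLIPTextEncode",  # positive prompt
          "inputs": {"text": "a photo of a cat", "clip": ["4", 1]}},
    "7": {"class_type": "CLIPTextEncode",  # negative prompt
          "inputs": {"text": "blurry", "clip": ["4", 1]}},
    "3": {"class_type": "KSampler",
          "inputs": {"seed": 42, "steps": 20, "cfg": 8.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0,
                     "model": ["4", 0], "positive": ["6", 0],
                     "negative": ["7", 0], "latent_image": ["5", 0]}},
    "8": {"class_type": "VAEDecode",
          "inputs": {"samples": ["3", 0], "vae": ["4", 2]}},
    "9": {"class_type": "SaveImage",
          "inputs": {"filename_prefix": "ComfyUI", "images": ["8", 0]}},
}


def list_edges(graph):
    """Return (source_id, output_index, target_id, input_name) for every wire."""
    edges = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            # A two-item list marks a connection rather than a parameter.
            if isinstance(value, list) and len(value) == 2:
                edges.append((value[0], value[1], node_id, name))
    return edges


for src, out, dst, name in list_edges(workflow):
    print(f"{workflow[src]['class_type']}[{out}] -> {workflow[dst]['class_type']}.{name}")
```

Reading the printed lines against the canvas is a quick way to see that every wire is just "this output feeds that input".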

Don’t worry if the jargon on the nodes looks daunting. We will walk through a simple example of using ComfyUI, introduce some concepts, and gradually move on to more complicated workflows.

Below is the simplest way you can use ComfyUI. You should be in the default workflow.

1. Selecting a model

First, select a Stable Diffusion Checkpoint model in the Load Checkpoint node. Click on the model name to show a list of available models.

If the node is too small, you can use the mouse wheel or pinch with two fingers on the touchpad to zoom in and out.

If clicking the model name does nothing, you may not have installed a model or configured it to use your existing models in A1111. Go back to the installation guide to fix it first.

2. Enter a prompt and a negative prompt

You should see two nodes labeled CLIP Text Encode (Prompt). Enter your prompt in the top one and your negative prompt in the bottom one.

The CLIP Text Encode node first converts the prompt into tokens and then encodes them into embeddings with the text encoder.

You can use the syntax (keyword:weight) to control the weight of a keyword. For example, use (keyword:1.2) to increase its effect or (keyword:0.8) to decrease it.
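
If it helps to see the weight syntax spelled out, below is a minimal sketch of how a (keyword:weight) prompt could be split into text and weight pairs. It is a simplified illustration, not ComfyUI's actual parser: the function name parse_weights is hypothetical, and nested parentheses and escaping are ignored.

```python
import re

# Matches "(keyword:1.2)" with no nested parentheses.
WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_weights(prompt):
    """Split a prompt into (text, weight) pairs; unweighted text gets 1.0."""
    parts, pos = [], 0
    for match in WEIGHTED.finditer(prompt):
        if match.start() > pos:
            parts.append((prompt[pos:match.start()], 1.0))
        parts.append((match.group(1), float(match.group(2))))
        pos = match.end()
    if pos < len(prompt):
        parts.append((prompt[pos:], 1.0))
    return parts


print(parse_weights("a (red:1.2) car, (blurry:0.8) background"))
```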

Why is the top one the prompt? Look at the CONDITIONING output. It is connected to the positive input of the KSampler node. The bottom one is connected to the negative, so it is for the negative prompt.

3. Generate an image

Click Queue Prompt to run the workflow. After a short wait, you should see the first image generated.
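
Clicking Queue Prompt sends the workflow to the ComfyUI server. As a hedged sketch, the same thing can be done from a script, assuming ComfyUI is running locally on its default address (http://127.0.0.1:8188) and accepts an API-format workflow on the /prompt endpoint. The helper names build_queue_request and queue_prompt are my own.

```python
import json
from urllib import request


def build_queue_request(workflow, server="http://127.0.0.1:8188"):
    """Build the POST request that queues an API-format workflow."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    return request.Request(
        f"{server}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )


def queue_prompt(workflow, server="http://127.0.0.1:8188"):
    """Send the workflow to a running ComfyUI server and return its reply."""
    with request.urlopen(build_queue_request(workflow, server)) as response:
        return json.loads(response.read())


# Building the request does not need a running server; sending it does.
req = build_queue_request({"9": {"class_type": "SaveImage", "inputs": {}}})
print(req.get_method(), req.full_url)
```

The workflow argument is a dictionary you would normally load from a workflow saved in API format; the tiny one above is only a stand-in.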

What has just happened?

The advantage of using ComfyUI is that it is very configurable. It is worth learning what each node does so you can use them to suit your needs.

You can skip the rest of this section if you are not interested in the theory.

Load Checkpoint node

Use the Load Checkpoint node to select a model. A Stable Diffusion model has three main parts:

  1. MODEL: The noise predictor model in the latent space.
  2. CLIP: The language model that preprocesses the positive and the negative prompts.
  3. VAE: The Variational AutoEncoder converts the image between the pixel and the latent spaces.

The MODEL output connects to the sampler, where the reverse diffusion process is done.

The CLIP output connects to the prompts because the prompts need to be processed by the CLIP model before they are useful.

In text-to-image, VAE is only used in the last step: Converting the image from the latent to the pixel space. In other words, we are only using the decoder part of the autoencoder.

CLIP Text Encode

The CLIP Text Encode node takes the prompt and feeds it into the CLIP language model. CLIP is OpenAI’s language model. It transforms each token of a prompt into an embedding.

Empty latent image

A text-to-image process starts with a random image in the latent space.

The size of the latent image is proportional to the actual image in the pixel space. So, if you want to change the size of the image, you change the size of the latent image.

You set the height and the width to change the image size in pixel space.

Here, you can also set the batch size, which is how many images you generate in each run.
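
As a rough illustration of how the pixel size maps to the latent size, the sketch below assumes the usual Stable Diffusion VAE, which shrinks each side by a factor of 8 and uses 4 latent channels. Other model families may differ, and the function name latent_shape is hypothetical.

```python
def latent_shape(width, height, batch_size=1, channels=4, downscale=8):
    """Return the (batch, channels, height, width) shape of the empty latent.

    Assumes an 8x VAE downscale and 4 latent channels.
    """
    if width % downscale or height % downscale:
        raise ValueError(f"width and height should be multiples of {downscale}")
    return (batch_size, channels, height // downscale, width // downscale)


print(latent_shape(512, 512))     # (1, 4, 64, 64)
print(latent_shape(768, 512, 2))  # (2, 4, 64, 96)
```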

KSampler

KSampler is at the heart of image generation in Stable Diffusion. A sampler denoises a random image into one that matches your prompt.

KSampler refers to samplers implemented in this code repository.

Here are the parameters in the KSampler node.

  • Seed: The random seed value controls the initial noise of the latent image and, hence, the composition of the final image.
  • Control_after_generate: How the seed should change after each generation. It can either be getting a random value (randomize), increasing by 1 (increment), decreasing by 1 (decrement), or unchanged (fixed).
  • Steps: Number of sampling steps. More steps generally leave fewer numerical artifacts.
  • Sampler_name: Here, you can set the sampling algorithm. Read the sampler article for a primer.
  • Scheduler: Controls how the noise level should change in each step.
  • Denoise: How much of the initial noise should be erased by the denoising process. 1 means all.
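
The four seed-control options can be pictured with the small sketch below. It only illustrates the behavior described in the list. The function name next_seed and the upper bound MAX_SEED are assumptions, not values taken from ComfyUI.

```python
import random

MAX_SEED = 2**64 - 1  # assumed upper bound, for illustration only


def next_seed(seed, mode):
    """Return the seed to use for the next generation."""
    if mode == "fixed":
        return seed
    if mode == "increment":
        return min(seed + 1, MAX_SEED)
    if mode == "decrement":
        return max(seed - 1, 0)
    if mode == "randomize":
        return random.randint(0, MAX_SEED)
    raise ValueError(f"unknown mode: {mode}")


seed = 100
for mode in ("fixed", "increment", "decrement"):
    print(mode, next_seed(seed, mode))
```

Keeping the seed fixed is handy when you want to change one setting at a time and compare results.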

Image-to-image workflow

The Img2img workflow is another staple workflow in Stable Diffusion. It generates an image based on the prompt AND an input image.

You can adjust the denoising strength to control how much Stable Diffusion should follow the base image.
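
One way to picture the denoising strength is as the point on the noise schedule where sampling starts. The sketch below is a toy model with a linear noise schedule, which real schedulers do not use, so read it as intuition only. The function names are hypothetical.

```python
def toy_sigmas(total_steps, sigma_max=1.0):
    """A toy linear noise schedule going from sigma_max down to 0."""
    return [sigma_max * (1 - i / total_steps) for i in range(total_steps + 1)]


def img2img_sigmas(steps, denoise):
    """Keep only the tail of a longer schedule so sampling starts partway."""
    if not 0 < denoise <= 1:
        raise ValueError("denoise should be in (0, 1]")
    total = round(steps / denoise)
    return toy_sigmas(total)[-(steps + 1):]


# With denoise 1.0 sampling starts from full noise; with 0.5 it starts
# from half the noise level, so more of the input image survives.
print(img2img_sigmas(20, 1.0)[0])
print(img2img_sigmas(20, 0.5)[0])
```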

Download the image-to-image workflow

Drag and drop this workflow image to ComfyUI to load.

comfyUI img2img workflow.

To use this img2img workflow:

  1. Select the checkpoint model.
  2. Revise the positive and the negative prompts.
  3. Optionally adjust the denoise (denoising strength) in the KSampler node.
  4. Press Queue Prompt to start generation.

ComfyUI Manager

ComfyUI manager is a custom node that lets you install and update other custom nodes through the ComfyUI interface.

Installing ComfyUI Manager

To install this custom node, go to the custom nodes folder in the PowerShell (Windows) or Terminal (Mac) App:

cd ComfyUI/custom_nodes

And clone the node to your local storage.

git clone https://github.com/ltdrdata/ComfyUI-Manager

Restart ComfyUI completely.
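
If you want to confirm the clone landed in the right place, a small script can list what is inside the custom nodes folder. This is a minimal sketch that assumes the folder layout used in the commands above. The function name installed_custom_nodes is my own.

```python
from pathlib import Path


def installed_custom_nodes(comfy_root):
    """Return the names of the sub-folders in <comfy_root>/custom_nodes."""
    folder = Path(comfy_root) / "custom_nodes"
    if not folder.is_dir():
        return []
    return sorted(
        p.name for p in folder.iterdir()
        if p.is_dir() and not p.name.startswith((".", "__"))
    )


# Adjust the path to where your ComfyUI folder lives.
print(installed_custom_nodes("ComfyUI"))
```

After a successful install, ComfyUI-Manager should appear in the printed list.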

Using ComfyUI Manager

After the installation, you should see an extra Manager button on the Queue Prompt menu. Clicking it opens a GUI for managing custom nodes.

The Install Missing Nodes function is especially useful for finding which custom nodes are required in the current workflow.

The Install Custom Nodes menu lets you manage custom nodes. You can uninstall or disable an installed node or install a new one.

ComfyUI manager.

Upscaling

There are several ways to upscale in Stable Diffusion. For teaching purposes, let’s go through upscaling with

  1. an AI upscaler
  2. Hi-res fix
  3. Ultimate Upscale

AI upscale

An AI upscaler is an AI model for enlarging images while filling in details. They are not Stable Diffusion models but neural networks trained for enlarging images.

Load this upscaling workflow by first downloading the image on the page. Drag and drop the image to ComfyUI.

Tip: Dragging and dropping an image made with ComfyUI loads the workflow that produces it.

AI upscaler in the upscaling workflow in ComfyUI.

In this basic example, you see the only additions to text-to-image are

  • Load Upscale Model: This is for loading an AI upscaler model.
  • Upscale Image (using Model): The node now sits between the VAE Decode and the Save Image nodes. It takes the image and the upscaler model and outputs an upscaled image.

To use this upscaler workflow, you must download an upscaler model from the Upscaler Wiki, and put it in the folder models > upscale_models.

Alternatively, set up ComfyUI to use AUTOMATIC1111’s model files.

Select an upscaler and click Queue Prompt to generate an upscaled image. The image should have been upscaled 4x by the AI upscaler.

Exercise: Recreate the AI upscaler workflow from text-to-image

It is a good exercise to make your first custom workflow by adding an upscaler to the default text-to-image workflow.

1. Get back to the basic text-to-image workflow by clicking Load Default.

2. Right-click an empty space near Save Image. Select Add Node > loaders > Load Upscale Model.

3. Click on the dot on the wire between VAE Decode and Save Image. Click Delete to delete the wire.

4. Right-click on an empty space and select Add Node > image > upscaling > Upscale Image (using Model) to add the new node.

5. Drag and hold the UPSCALE_MODEL output of Load Upscale Model. Drop it at upscale_model of the Upscale Image (using Model) node.

6. Drag and hold the IMAGE output of the VAE Decode. Drop it at the image input of the Upscale Image (using Model).

7. Drag and hold the IMAGE output of the Upscale Image (using Model) node. Drop it at the images input of the Save Image node.

8. Click Queue Prompt to test the workflow.

Now you know how to modify a workflow, a skill that comes in handy when you build your own.
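
The rewiring in this exercise can also be sketched in code. The example below works on a small API-format fragment (a dictionary of nodes where a connection is a [node_id, output_index] pair). The node ids and the upscaler file name are placeholders, and the class names UpscaleModelLoader and ImageUpscaleWithModel are what I believe the two nodes are called internally, so verify them against your own exported workflow.

```python
def add_upscaler(graph, model_name, decode_id="8", save_id="9"):
    """Return a copy of graph with an upscaler between VAE Decode and Save Image."""
    new = {k: {"class_type": v["class_type"], "inputs": dict(v["inputs"])}
           for k, v in graph.items()}
    next_id = max(int(k) for k in new) + 1
    loader_id, upscale_id = str(next_id), str(next_id + 1)
    # Step 2 of the exercise: add Load Upscale Model.
    new[loader_id] = {"class_type": "UpscaleModelLoader",
                      "inputs": {"model_name": model_name}}
    # Steps 4 to 6: add Upscale Image (using Model) and wire its inputs.
    new[upscale_id] = {"class_type": "ImageUpscaleWithModel",
                       "inputs": {"upscale_model": [loader_id, 0],
                                  "image": [decode_id, 0]}}
    # Steps 3 and 7: replace the old wire into Save Image.
    new[save_id]["inputs"]["images"] = [upscale_id, 0]
    return new


# Only the tail of the text-to-image graph is shown here.
fragment = {
    "8": {"class_type": "VAEDecode",
          "inputs": {"samples": ["3", 0], "vae": ["4", 2]}},
    "9": {"class_type": "SaveImage",
          "inputs": {"filename_prefix": "ComfyUI", "images": ["8", 0]}},
}
upscaled = add_upscaler(fragment, "your_upscaler.pth")  # placeholder file name
print(upscaled["9"]["inputs"]["images"])
```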

Hi-res fix

Download the first image on this page and drop it in ComfyUI to load the Hi-Res Fix workflow.

This is a more complex example but also shows you the power of ComfyUI. After studying the nodes and edges, you will know exactly what Hi-Res Fix is.

The first part is identical to text-to-image: You denoise a latent image using a sampler, conditioned with your positive and negative prompts.

The workflow then upscales the image in the latent space and performs a few additional sampling steps. It adds some initial noise to the image and denoises it with a certain denoising strength.

The VAE decoder then decodes the larger latent image to produce an upscaled image.
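
The stages described above can be summarized in a short sketch. It assumes the usual 8x relation between pixel and latent sizes. The scale factor and the second-pass denoising strength are example values I picked, not settings read from the workflow.

```python
def hires_fix_plan(width, height, scale=2.0, second_denoise=0.5):
    """List the Hi-Res Fix stages with the image size at each stage."""
    lw, lh = width // 8, height // 8               # latent size of the first pass
    uw, uh = round(lw * scale), round(lh * scale)  # latent size after upscaling
    return [
        ("sample", {"latent": (lw, lh), "denoise": 1.0}),
        ("upscale_latent", {"latent": (uw, uh)}),
        ("sample", {"latent": (uw, uh), "denoise": second_denoise}),
        ("vae_decode", {"pixels": (uw * 8, uh * 8)}),
    ]


for stage, info in hires_fix_plan(512, 512):
    print(stage, info)
```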

SD Ultimate upscale – ComfyUI edition

SD Ultimate upscale is a popular upscaling extension for AUTOMATIC1111 WebUI. You can use it on ComfyUI too!

Github Page of SD Ultimate upscale for ComfyUI

This is also a good exercise for installing a custom node.

Installing the SD Ultimate upscale node

To install this custom node, go to the custom nodes folder in the PowerShell (Windows) or Terminal (Mac) App:

cd ComfyUI/custom_nodes

And clone the node to your local storage.

git clone https://github.com/ssitu/ComfyUI_UltimateSDUpscale --recursive

Restart ComfyUI completely.

Using SD Ultimate upscale

A good exercise is to start with the AI upscaler workflow. Add SD Ultimate Upscale and compare the result.

Load the AI upscaler workflow by dragging and dropping the image to ComfyUI or using the Load button to load.

Right-click on an empty space. Select Add Node > image > upscaling > Ultimate SD Upscale.

You should see the new node Ultimate SD Upscale. Wire up its input as follows.

  • image to VAE Decode’s IMAGE.
  • model to Load Checkpoint’s MODEL.
  • positive to CONDITIONING of the positive prompt box.
  • negative to CONDITIONING of the negative prompt box.
  • vae to Load Checkpoint’s VAE.
  • upscale_model to Load Upscale Model’s UPSCALE_MODEL.

For the output:

  • IMAGE to Save Image’s images.
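
Because an output can only connect to an input of the same type, the wiring above can be double-checked with a small table. This is only a sketch: the input names and sources come from the list above, the expected types are assumed to match the output labels, and the function name check_wiring is hypothetical.

```python
# Input on Ultimate SD Upscale -> (source node, output it is wired from).
WIRING = {
    "image": ("VAE Decode", "IMAGE"),
    "model": ("Load Checkpoint", "MODEL"),
    "positive": ("CLIP Text Encode (positive)", "CONDITIONING"),
    "negative": ("CLIP Text Encode (negative)", "CONDITIONING"),
    "vae": ("Load Checkpoint", "VAE"),
    "upscale_model": ("Load Upscale Model", "UPSCALE_MODEL"),
}

# Data type each input is assumed to accept.
INPUT_TYPES = {
    "image": "IMAGE",
    "model": "MODEL",
    "positive": "CONDITIONING",
    "negative": "CONDITIONING",
    "vae": "VAE",
    "upscale_model": "UPSCALE_MODEL",
}


def check_wiring(wiring, input_types):
    """Return a list of problems; an empty list means every input is wired."""
    problems = []
    for name, expected in input_types.items():
        if name not in wiring:
            problems.append(f"{name}: not connected")
        elif wiring[name][1] != expected:
            problems.append(f"{name}: expected {expected}, got {wiring[name][1]}")
    return problems


print(check_wiring(WIRING, INPUT_TYPES))
```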

If they are wired correctly, clicking Queue Prompt should show two large images, one with the AI upscaler and the other with Ultimate Upscale.

You can download this workflow example below. Drag and drop the image to ComfyUI to load.

ComfyUI Inpainting

You can use ComfyUI for inpainting. It is a basic technique to regenerate a part of the image.

I have to admit that inpainting is not the easiest thing to do with ComfyUI. But here you go…

Step 1: Create an inpaint mask

First, pick an image that you want to inpaint.

Andy Lau is ready for inpainting.

You can download the image in PNG format here.

We will use Photopea, a free online Photoshop clone, to create the inpaint mask. The mask needs to be painted in the Alpha channel of a PNG file.

Drag and drop the PNG image to Photopea.

Select the Eraser Tool (Press E).

Draw the mask by erasing part of the image.

Save it as a PNG file. Click File > Export > PNG.

Step 2: Open the inpainting workflow

To use inpainting, first download the inpainting workflow.

Load the inpainting workflow by dragging and dropping it onto ComfyUI.

Step 3: Upload the image

Upload the image with the mask to the Load Image node.
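
Erasing pixels in Photopea works as a mask because the erased pixels become transparent. My understanding, which you should treat as an assumption, is that the Load Image node turns the transparency into a mask where erased areas are marked for regeneration. The sketch below shows that idea on plain lists of RGBA pixels. The function name mask_from_alpha is hypothetical.

```python
def mask_from_alpha(rgba_rows):
    """Turn rows of (r, g, b, a) pixels into a mask in the range 0 to 1.

    Fully opaque pixels (a = 255) give 0.0, meaning keep.
    Fully erased pixels (a = 0) give 1.0, meaning inpaint.
    """
    return [[1.0 - a / 255.0 for (_r, _g, _b, a) in row] for row in rgba_rows]


image = [
    [(200, 180, 160, 255), (200, 180, 160, 0)],   # second pixel was erased
    [(190, 170, 150, 255), (190, 170, 150, 255)],
]
print(mask_from_alpha(image))
```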

Step 4: Adjust parameters

Change the prompt:

x men Cyclops sun glasses, epic style, super hero

The original denoising strength (denoise) is too high. Set it to 0.8.

Step 5: Generate inpainting

Finally, click Queue Prompt to perform inpainting.

This is quite an ordeal for a small task… So, I will stick with AUTOMATIC1111 for inpainting.

SDXL workflow

ComfyUI Stable Diffusion XL workflow.
Simple SDXL workflow.

Because of its extreme configurability, ComfyUI was one of the first GUIs to make the Stable Diffusion XL model work.

Download the Simple SDXL workflow for ComfyUI. Drag and drop the image to ComfyUI to load.

You will need to change

  • Positive Prompt
  • Negative Prompt

That’s it!

There are a few more complex SDXL workflows on this page.

ComfyUI Impact Pack

ComfyUI Impact pack is a pack of free custom nodes that greatly enhance what ComfyUI can do.

There are more custom nodes in the Impact Pack than I can write about in this article. See the official tutorials to learn them one by one. Read through the beginner tutorials if you want to use this set of nodes effectively.

Install

To install the ComfyUI Impact Pack, first open the PowerShell App (Windows) or the Terminal App (Mac or Linux) and go to the custom nodes folder:

cd ComfyUI/custom_nodes

Clone the Impact Pack to your local storage.

git clone https://github.com/ltdrdata/ComfyUI-Impact-Pack.git

Clone the Workflow Component, which the Impact Pack needs.

git clone https://github.com/ltdrdata/ComfyUI-Workflow-Component

Restart ComfyUI completely.

Regenerate faces

You can use this workflow in the Impact Pack to regenerate faces with the Face Detailer custom node and SDXL base and refiner models. Download and drop the JSON file into ComfyUI.

To use this workflow, you will need to set

  • The initial image in the Load Image node.
  • An SDXL base model in the upper Load Checkpoint node.
  • An SDXL refiner model in the lower Load Checkpoint node.
  • The prompt and negative prompt for the new images.

Click Queue Prompt to start the workflow.

Andy Lau’s face doesn’t need any fixing (does it?). So I used a prompt to turn him into a K-pop star.

a closeup photograph of a korean k-pop star man

Only the face changes, while the background and everything else stay the same.

LoRA

LoRA is a small model file modifying a checkpoint model. It is frequently used for modifying styles or injecting a person into the model.

In fact, the modification of LoRA is clear in ComfyUI:

The LoRA model changes the MODEL and CLIP of the checkpoint model but leaves the VAE untouched.
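
For intuition on what "modifying" means here, a LoRA is commonly described as adding a low-rank update to selected weight matrices. The sketch below shows that general technique in plain Python on a tiny matrix. It is not ComfyUI's implementation, and the names apply_lora, up, down, and alpha are assumptions for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, up, down, strength=1.0, alpha=None):
    """Return weight + scale * (up @ down), the usual low-rank update.

    weight: out x in, up: out x rank, down: rank x in.
    """
    rank = len(down)
    scale = strength * (alpha / rank if alpha is not None else 1.0)
    delta = matmul(up, down)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


weight = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]    # rank 1
down = [[0.5, 0.0]]
print(apply_lora(weight, up, down, strength=1.0))
print(apply_lora(weight, up, down, strength=0.0))  # strength 0 leaves it unchanged
```

The strength value plays the same role as the strength settings on the LoRA node: 0 switches the LoRA off, and larger values push the weights further from the checkpoint.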

Simple LoRA workflows

This is the simplest LoRA workflow possible: Text-to-image with a LoRA and a checkpoint model.

Download the simple LoRA workflow

To use the workflow:

  1. Select a checkpoint model.
  2. Select a LoRA.
  3. Revise the prompt and the negative prompt.
  4. Click Queue Prompt.

Multiple LoRAs

You can use two LoRAs in the same text-to-image workflow.

Download the two-LoRA workflow

The usage is similar to one LoRA, but now you must pick two.

The two LoRAs are applied one after the other.
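
"One after the other" means the MODEL and CLIP outputs of the first LoRA node feed the second. Here is a hedged sketch in the API-format style (a connection is a [node_id, output_index] pair). The class name LoraLoader and its input names are what I believe the node uses internally; the node ids and file names are placeholders.

```python
def chain_loras(checkpoint_id, loras, first_id=10):
    """Build LoRA nodes wired in series after a checkpoint loader.

    loras is a list of (file_name, strength) pairs.
    Returns the new nodes plus the final MODEL and CLIP connections.
    """
    nodes = {}
    model_src, clip_src = [checkpoint_id, 0], [checkpoint_id, 1]
    for offset, (name, strength) in enumerate(loras):
        node_id = str(first_id + offset)
        nodes[node_id] = {
            "class_type": "LoraLoader",
            "inputs": {"lora_name": name,
                       "strength_model": strength,
                       "strength_clip": strength,
                       "model": model_src,
                       "clip": clip_src},
        }
        # The next LoRA (or the sampler and prompts) reads from this node.
        model_src, clip_src = [node_id, 0], [node_id, 1]
    return nodes, model_src, clip_src


nodes, model_out, clip_out = chain_loras(
    "4", [("first_lora.safetensors", 1.0), ("second_lora.safetensors", 0.7)])
print(nodes["11"]["inputs"]["model"], model_out, clip_out)
```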

Exercise: Make a workflow to compare with and without LoRA

To be good at ComfyUI, you really need to make your own workflows.

A good exercise is to create a workflow to compare text-to-image with and without a LoRA while keeping everything else the same.

To achieve this, you need to know how to share parameters between two nodes.

Sharing parameters between two nodes

Let’s use the same seed in two KSampler nodes.

Each has its own seed value. To share one seed value between the two, right-click on each node and select convert seed to input.

Each node should now have a new input called seed.

Right-click on an empty space. Select Add node > utils > Primitive. Connect the primitive node to the two seed inputs.

Now, you have a single seed value shared between the two samplers.
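
If you drive ComfyUI from a script instead of the canvas, the same effect can be approximated by writing one seed into every sampler node of an API-format workflow. This is a sketch under that assumption. The function name share_seed and the tiny example graph are made up for illustration.

```python
def share_seed(graph, seed):
    """Set the same seed on every KSampler node and return their ids."""
    changed = []
    for node_id, node in graph.items():
        if node["class_type"] == "KSampler":
            node["inputs"]["seed"] = seed
            changed.append(node_id)
    return changed


graph = {
    "3": {"class_type": "KSampler", "inputs": {"seed": 1}},   # with LoRA
    "12": {"class_type": "KSampler", "inputs": {"seed": 2}},  # without LoRA
    "9": {"class_type": "SaveImage", "inputs": {}},
}
print(share_seed(graph, 12345))
print(graph["3"]["inputs"]["seed"] == graph["12"]["inputs"]["seed"])
```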

Workflow to compare images with and without LoRA

Using this technique alone, you can modify the single LoRA example to make a workflow comparing the effect of LoRA while keeping everything else the same.

Comparing the effect of Epilson offset LoRA. Top: with LoRA. Bottom: without LoRA.

You can download the answer below.

Useful resources

Official ComfyUI tutorial – A graphical tutorial. Very basic.

ComfyUI Examples – A bunch of example workflows you can download.

ComfyUI Community Manual – A reference manual.



By Andrew

Andrew is an experienced engineer with a specialization in Machine Learning and Artificial Intelligence. He is passionate about programming, art, photography, and education. He possesses a Ph.D. in engineering.
