Flux-CogVideo Text-to-Video workflow (ComfyUI)

This ComfyUI workflow uses the Flux AI model to generate a high-quality image from a text prompt, then turns that image into a video with CogVideo image-to-video. The result is a text-prompted video with higher quality than the original CogVideo text-to-video model produces.

You need to be a member of this site to download the ComfyUI workflow.

Software

We will use ComfyUI, an alternative to AUTOMATIC1111.

Read the ComfyUI installation guide and ComfyUI beginner’s guide if you are new to ComfyUI.

Take the ComfyUI course to learn ComfyUI step-by-step.

Step-by-step guide

Step 1: Download the Flux AI model

Download the Flux1 dev FP8 checkpoint.

Put the model file in the folder ComfyUI > models > checkpoints.

Step 2: Download the Flux text encoder

Download the t5xxl_fp8_e4m3fn text encoder model.

Put the model file in the folder ComfyUI > models > clip.
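
As a quick sanity check after Steps 1 and 2, a short script can confirm that both files landed in the right folders. This is only a minimal sketch and not part of the workflow. The folder names come from the steps above, but the file names and the ComfyUI root path are assumptions, so replace them with the names of the files you actually downloaded.

```python
from pathlib import Path

# Folder names come from Steps 1 and 2. The file names are assumptions:
# replace them with the names of the files you actually downloaded.
EXPECTED_FILES = {
    "checkpoints": "flux1-dev-fp8.safetensors",
    "clip": "t5xxl_fp8_e4m3fn.safetensors",
}


def find_missing_models(comfyui_root, expected=EXPECTED_FILES):
    """Return the expected model paths that do not exist under comfyui_root."""
    models_dir = Path(comfyui_root) / "models"
    missing = []
    for folder, filename in expected.items():
        path = models_dir / folder / filename
        if not path.is_file():
            missing.append(path)
    return missing


if __name__ == "__main__":
    # "ComfyUI" is a placeholder for your own install location.
    for path in find_missing_models("ComfyUI"):
        print(f"Missing: {path}")
```

If the script prints nothing, both files were found where the workflow expects them.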

Step 3: Update ComfyUI

ComfyUI started supporting Flux natively in August 2024. If you haven’t updated ComfyUI since then, update it now.

The easiest way to update ComfyUI is through the ComfyUI Manager. Click Manager > Update All.

Make sure to reload the ComfyUI page after the update. Clicking the restart button is not enough.

Step 4: Disable Smart Memory

Add the argument --disable-smart-memory to the launch file run_nvidia_gpu.bat. This option makes ComfyUI unload models from VRAM aggressively instead of keeping them loaded. Without it, you will likely run out of memory.

Restart ComfyUI.
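
For reference, the edit in Step 4 amounts to appending the flag to the line that starts ComfyUI. The sketch below does this on the text of a launch file. The sample launch line and the helper name add_launch_flag are assumptions for illustration, so check your own run_nvidia_gpu.bat before changing it.

```python
def add_launch_flag(bat_text, flag="--disable-smart-memory", entry="main.py"):
    """Append flag to every line of bat_text that launches entry.

    Lines that already contain the flag are left unchanged.
    """
    updated = []
    for line in bat_text.splitlines():
        if entry in line and flag not in line.split():
            line = f"{line.rstrip()} {flag}"
        updated.append(line)
    return "\n".join(updated)


# Assumed launch file content; yours may differ.
sample = (
    ".\\python_embeded\\python.exe -s ComfyUI\\main.py --windows-standalone-build\n"
    "pause"
)
print(add_launch_flag(sample))
```

Editing the file by hand in a text editor works just as well. The helper only shows where the flag goes and that adding it twice is unnecessary.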

Step 5: Load the workflow

Download the Flux-CogVideo workflow JSON file. (You must be a member to download the workflow.)

Drag and drop the JSON file to ComfyUI.

Step 6: Install missing nodes

Click Manager > Install Missing Custom Nodes.

Install the nodes that are missing.

Restart ComfyUI.

Step 7: Run the workflow

Press Queue Prompt to generate a video.

Running the workflow for the first time takes a while because it needs to download the CogVideo Image-to-Video model.

Usage tips

It is best to treat the video generation as a 2-step process.

  1. Refine the prompt to generate a good image.
  2. Change CogVideo’s seed to refine the video.

Select the first node in the CogVideo section, “Resize Image”.

Press Ctrl-M to mute the node. You should see the node gray out.

With the node muted, you can change the prompt and seed and rerun the workflow without triggering the video generation, which is the slow part.

Once you get an image you like, unmute the node by selecting it and pressing Ctrl-M again.

Rerun the workflow.

If you don’t like the video, change the seed in the CogVideo Sampler node of the Image-to-Video section to generate a new video.

By Andrew

Andrew is an experienced software engineer with a specialization in Machine Learning and Artificial Intelligence. He is passionate about programming, art, and education. He has a doctorate in engineering.

6 comments

  1. Hi, thanks for this workflow. I am running it on a Mac and image generation works well, but the CogVideo Sampler node stays red no matter what and won’t process. Any idea?

    Thanks

  2. Hey Andrew, love your work, I learned so much from your tutorials!
    I’m trying to run this tutorial (and Flux in general) on ComfyUI but it keeps crashing with a HeaderTooLarge error. I guess I didn’t do this correctly:
    “Add the argument --disable-smart-memory to the launch file run_nvidia_gpu.bat.”
    This is added by default in the “Extra ComfyUI arguments”, but I can’t find the .bat file.
    Any idea? Thanks!

    1. The argument is already set if you are using the ComfyUI Colab notebook. This workflow uses a lot of memory, as you have noticed. You can change the runtime type to L4, which has more VRAM.
