How to speed up Wan 2.1 Video with Teacache and Sage Attention

Wan 2.1 Video is a state-of-the-art AI model that you can use locally on your PC. However, generating a high-quality 720p video takes time, and refining a video through multiple generations takes even longer.

This fast Wan 2.1 workflow uses TeaCache and Sage Attention to reduce generation time by about 30%, which saves significant time when you iterate through multiple videos.

Software

We will use ComfyUI, an alternative to AUTOMATIC1111. You can use it on Windows, Mac, or Google Colab. If you prefer using a ComfyUI service, Think Diffusion offers our readers an extra 20% credit.

Read the ComfyUI beginner’s guide if you are new to ComfyUI. See the Quick Start Guide if you are new to AI images and videos.

Take the ComfyUI course to learn how to use ComfyUI step by step.

How does the speed-up work?

This workflow uses two speed-up techniques: TeaCache and Sage Attention.

TeaCache

TeaCache takes advantage of the observation that some neural network blocks don’t do much during sampling. Researchers have recognized that diffusion models generate image outlines in the initial sampling steps and fill in details in the late steps.

Diffusion models generate the image outline in the initial steps and the details in the late steps. (Image: Chen et al.)

TeaCache intelligently determines when to use the cache during sampling. It reuses the cached output when the current input is similar to the one that produced the cache, and it only recomputes the cache when the input becomes substantially different. You can control how often the cache is recomputed with a threshold value.

See also: TeaCache: 2x speed up in ComfyUI
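
To make this concrete, here is a minimal sketch of the caching logic in Python. It is illustrative only, not the actual node's implementation: the block list, the relative-change metric, and the threshold name are assumptions for the example.

import torch

# Minimal sketch of the TeaCache idea (illustrative, not the real node).
# `blocks` stands in for the model's expensive transformer blocks and
# `threshold` for the node's cache threshold setting.
class TeaCacheSketch:
    def __init__(self, blocks, threshold=0.2):
        self.blocks = blocks
        self.threshold = threshold          # higher = more cache reuse = faster
        self.prev_input = None              # input that produced the cache
        self.cached_residual = None         # change the blocks applied last time
        self.accumulated_change = 0.0

    @torch.no_grad()
    def __call__(self, x):
        if self.prev_input is not None:
            # Relative change between the current input and the cached one.
            rel_change = ((x - self.prev_input).abs().mean() /
                          self.prev_input.abs().mean()).item()
            self.accumulated_change += rel_change
            if self.accumulated_change < self.threshold:
                # Still similar: skip the blocks and reuse the cached residual.
                return x + self.cached_residual
        # Too different (or first step): recompute and refresh the cache.
        out = x
        for block in self.blocks:
            out = block(out)
        self.prev_input = x
        self.cached_residual = out - x
        self.accumulated_change = 0.0
        return out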

Sage Attention

Sage Attention speeds up transformer attention operations by quantizing the computation. Instead of full precision, it uses lower precision (like 8-bit or 4-bit) in the key parts of the attention operation. It can speed up many AI models with nearly lossless accuracy.
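
Here is a rough, self-contained illustration of the quantized-attention idea in PyTorch. It is not the SageAttention library's actual implementation (that uses fused low-precision GPU kernels); the simple per-tensor INT8 quantization below is an assumption made for the example.

import torch
import torch.nn.functional as F

# Simplified illustration of quantized attention: Q and K are reduced to
# INT8 before the score computation; softmax and the value product stay
# in floating point.
def quantize_int8(t):
    scale = t.abs().amax() / 127.0
    return (t / scale).round().clamp(-127, 127).to(torch.int8), scale

def quantized_attention(q, k, v):
    q_i8, q_scale = quantize_int8(q)
    k_i8, k_scale = quantize_int8(k)
    # INT8 matmul accumulated in INT32, then rescaled back to float.
    scores = (q_i8.to(torch.int32) @ k_i8.to(torch.int32).transpose(-1, -2)).float()
    scores = scores * (q_scale * k_scale) / q.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v

# Example: 2 heads, 16 tokens, 64-dimensional heads.
q, k, v = (torch.randn(2, 16, 64) for _ in range(3))
out = quantized_attention(q, k, v)   # same shape as v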

Google Colab

If you use my ComfyUI Colab notebook, select the following before running the notebook.

  • WAN_2_1 video models
  • WAN_2_1 custom nodes
  • VideoHelperSuite custom nodes

Fast Wan 2.1 Teacache and Sage Attention workflow

This fast Wan 2.1 workflow uses KJNodes' Sage Attention and TeaCache nodes. It is about 30% faster than the standard Wan 2.1 workflow.

The two speed-up nodes are placed between the Load Diffusion Model and the KSampler node.

Step 1: Update ComfyUI

Before loading the workflow, make sure your ComfyUI is up to date. The easiest way to do this is to use ComfyUI Manager.

Click the Manager button on the top toolbar.

Select Update ComfyUI.

Restart ComfyUI.

Step 2: Download model files

Download the diffusion model wan2.1_i2v_720p_14B_fp8_e4m3fn.safetensors and put it in ComfyUI > models > diffusion_models.

Download the text encoder model umt5_xxl_fp8_e4m3fn_scaled.safetensors and put it in ComfyUI > models > text_encoders.

Download the CLIP vision model clip_vision_h.safetensors and put it in ComfyUI > models > clip_vision.

Download the Wan VAE model wan_2.1_vae.safetensors and put it in ComfyUI > models > vae.
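
After this step, the relevant parts of your ComfyUI models folder should look like this:

ComfyUI
  models
    diffusion_models
      wan2.1_i2v_720p_14B_fp8_e4m3fn.safetensors
    text_encoders
      umt5_xxl_fp8_e4m3fn_scaled.safetensors
    clip_vision
      clip_vision_h.safetensors
    vae
      wan_2.1_vae.safetensors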

Step 3: Load the fast Wan 2.1 workflow

Download the workflow JSON file below and drop it onto ComfyUI to load it.

Step 4: Install missing nodes

If you see red blocks, you are missing custom nodes that this workflow needs.

Click Manager > Install missing custom nodes and install the missing nodes.

Restart ComfyUI.

Step 5: Install Triton and Sage Attention

The Sage Attention node requires the Triton and SageAttention Python packages, which do not come with KJNodes.

For Windows users, navigate to the Python folder of your ComfyUI.

For the Windows portable version, it is ComfyUI_windows_portable > python_embeded.

Enter cmd in the address bar and press Enter.

You should see the command prompt.

Enter the following command to install triton.

python -m pip install triton-windows

Enter the following command to install sage attention.

python -m pip install sageattention
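
To confirm both packages installed correctly, you can run a quick import check from the same command prompt. This check is a suggestion, not part of the original workflow; it should print OK with no error if both installs succeeded.

python -c "import triton, sageattention; print('OK')"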

Step 6: Set the initial image

Upload an image you wish to use as the video’s initial frame. You can download my test image to follow along.

Step 7: Revise the prompt

Revise the positive prompt to describe the video you want to generate.

Don’t forget to add motion keywords, e.g., running.

Step 8: Generate the video

Click the Queue button to run the workflow.

You should get this video.
