Wan 2.1 is a state-of-the-art AI video model that you can run locally on your PC. However, generating a high-quality 720p video takes time, and refining a video through multiple generations takes even longer.
This fast Wan 2.1 workflow uses TeaCache and Sage Attention to reduce generation time by about 30%, helping you iterate through multiple videos with significant time savings.
Software
We will use ComfyUI, an alternative to AUTOMATIC1111. You can use it on Windows, Mac, or Google Colab. If you prefer using a ComfyUI service, Think Diffusion offers our readers an extra 20% credit.
Read the ComfyUI beginner’s guide if you are new to ComfyUI. See the Quick Start Guide if you are new to AI images and videos.
Take the ComfyUI course to learn how to use ComfyUI step by step.
How does the speed-up work?
This workflow uses two speed-up techniques: TeaCache and Sage Attention.
TeaCache
TeaCache takes advantage of the observation that the outputs of some neural network blocks change little from one sampling step to the next. Researchers have observed that diffusion models lay down the image's outline in the early sampling steps and fill in details in the later steps.

TeaCache intelligently decides when to use caches during sampling. It reuses the cached output when the current input is similar to the one that produced the cache, and only recomputes when the input becomes substantially different. You control how often the cache is refreshed with a threshold value.
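Here is a minimal PyTorch sketch of the idea. This is a toy illustration with hypothetical names, not the actual TeaCache implementation, which rescales the input change using timestep-embedding statistics and accumulates it across steps.

import torch

class NaiveStepCache:
    """Toy TeaCache-style cache: reuse a block's cached output while
    the relative change of its input stays below a threshold."""

    def __init__(self, threshold: float = 0.1):
        self.threshold = threshold
        self.cached_input = None
        self.cached_output = None

    def __call__(self, block, x: torch.Tensor) -> torch.Tensor:
        if self.cached_input is not None:
            # Relative L1 distance between the current and cached input.
            change = ((x - self.cached_input).abs().mean()
                      / self.cached_input.abs().mean().clamp(min=1e-8))
            if change < self.threshold:
                return self.cached_output      # cheap: reuse the cache
        # Input drifted too much: run the block and refresh the cache.
        self.cached_input = x.detach().clone()
        self.cached_output = block(x)
        return self.cached_output

# Usage: wrap an expensive block and call it once per sampling step.
# cache = NaiveStepCache(threshold=0.1)   # higher threshold = more reuse
# out = cache(transformer_block, latent)

A higher threshold means more cache reuse and faster generation, at the cost of some quality; that is the trade-off the threshold value in the node controls.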
See also: TeaCache: 2x speed up in ComfyUI
Sage Attention
Sage Attention speeds up transformer attention operations by quantizing the computation. Instead of full precision, it uses lower precision (such as 8-bit or 4-bit) for the key parts of the attention operation. It can speed up many AI models with nearly lossless accuracy.
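The core idea looks roughly like this in PyTorch. This is a simplified sketch, not the actual Sage Attention kernels, which run fused low-precision matmuls on the GPU.

import torch

def int8_quantize(t: torch.Tensor):
    """Per-tensor symmetric INT8 quantization; returns values and scale."""
    scale = t.abs().amax().clamp(min=1e-8) / 127.0
    q = torch.clamp((t / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def quantized_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor):
    """Toy INT8-quantized attention in the spirit of Sage Attention.

    The Q.K^T matmul runs in low precision; softmax and the final value
    matmul stay in full precision. Subtracting K's per-channel mean
    ("smoothing") does not change the softmax result but makes K easier
    to quantize."""
    d = q.shape[-1]
    k = k - k.mean(dim=-2, keepdim=True)          # smooth K over tokens
    q8, q_scale = int8_quantize(q)
    k8, k_scale = int8_quantize(k)
    # INT8 matmul (emulated with int32 here), then dequantize.
    scores = (q8.to(torch.int32) @ k8.to(torch.int32).transpose(-1, -2)).float()
    scores = scores * (q_scale * k_scale) / d ** 0.5
    return scores.softmax(dim=-1) @ v             # PV stays full precision

# Example shapes: q, k, v = torch.randn(3, 16, 64).unbind(0)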
Google Colab
If you use my ComfyUI Colab notebook, select the following before running the notebook.
- WAN_2_1 video models
- WAN_2_1 custom nodes
- VideoHelperSuite custom nodes


Fast Wan 2.1 TeaCache and Sage Attention workflow
This fast Wan 2.1 workflow uses KJNodes' Sage Attention and TeaCache nodes. It is ~30% faster than the standard Wan 2.1 workflow.
The two speed-up nodes are placed between the Load Diffusion Model and the KSampler node.
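In other words, the model patching chain is: Load Diffusion Model → Sage Attention patch node → TeaCache node → KSampler (model input). The exact node titles may differ slightly between KJNodes versions.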

Step 1: Update ComfyUI
Before loading the workflow, make sure your ComfyUI is up to date. The easiest way to do this is to use ComfyUI Manager.
Click the Manager button on the top toolbar.

Select Update ComfyUI.

Restart ComfyUI.
Step 2: Download model files
Download the diffusion model wan2.1_i2v_720p_14B_fp8_e4m3fn.safetensors and put it in ComfyUI > models > diffusion_models.
Download the text encoder model umt5_xxl_fp8_e4m3fn_scaled.safetensors and put it in ComfyUI > models > text_encoders.
Download the CLIP vision model clip_vision_h.safetensors and put it in ComfyUI > models > clip_vision.
Download the Wan VAE model wan_2.1_vae.safetensors and put it in ComfyUI > models > vae.
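If you prefer to script the downloads, here is a sketch using huggingface_hub. The repo id and file paths are assumptions based on Comfy-Org's repackaged Wan 2.1 repository; verify them on Hugging Face before running, and adjust COMFY to your install location.

import shutil
from pathlib import Path
from huggingface_hub import hf_hub_download

COMFY = Path("ComfyUI")  # adjust to your ComfyUI root folder
REPO = "Comfy-Org/Wan_2.1_ComfyUI_repackaged"  # assumed repo id; verify on Hugging Face

FILES = {
    "split_files/diffusion_models/wan2.1_i2v_720p_14B_fp8_e4m3fn.safetensors": "diffusion_models",
    "split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors": "text_encoders",
    "split_files/clip_vision/clip_vision_h.safetensors": "clip_vision",
    "split_files/vae/wan_2.1_vae.safetensors": "vae",
}

for remote, subfolder in FILES.items():
    cached = hf_hub_download(repo_id=REPO, filename=remote)  # downloads to the HF cache
    dest = COMFY / "models" / subfolder / Path(remote).name
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(cached, dest)  # copy from the cache into the ComfyUI models folder
    print("Saved", dest)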
Step 3: Load the fast Wan 2.1 workflow
Download the workflow JSON file below and drop it into ComfyUI to load it.

Step 4: Install missing nodes
If you see red blocks, you are missing custom nodes that this workflow needs.
Click Manager > Install missing custom nodes and install the missing nodes.
Restart ComfyUI.
Step 5: Install Triton and Sage Attention
The Sage Attention node requires the triton and sageattention packages, which do not come with KJNodes.
For Windows users, navigate to the Python folder of your ComfyUI.
For the Windows portable version, it is ComfyUI_windows_portable > python_embeded.

Enter cmd in the address bar and press Enter.

You should see the command prompt.
Enter the following command to install triton.
python -m pip install triton-windows
Enter the following command to install sage attention.
python -m pip install sageattention
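To verify that both packages are installed, run the following command. It should print OK without errors.
python -c "import triton, sageattention; print('OK')"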
Step 6: Set the initial image
Upload an image you wish to use as the video's initial frame. You can use my test image below.

Step 7: Revise the prompt
Revise the positive prompt to describe the video you want to generate.

Don't forget to add motion keywords, e.g., running.
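For example, something like this (an illustrative prompt; adapt it to your own image): A woman running along the beach, hair blowing in the wind, waves rolling in the background, smooth tracking shot.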
Step 8: Generate the video
Click the Queue button to run the workflow.

You should get this video.