This tutorial explains how to use Tencent’s Hunyuan Video model in ComfyUI for text-to-video generation, walking through the entire process step by step.
1. Install or Update ComfyUI to the Latest Version
If you haven’t installed ComfyUI yet, please refer to these sections:
HunyuanVideo supports the following resolution settings:
| Resolution | 9:16 Ratio | 16:9 Ratio | 4:3 Ratio | 3:4 Ratio | 1:1 Ratio |
| --- | --- | --- | --- | --- | --- |
| 540p | 544×960×129f | 960×544×129f | 624×832×129f | 832×624×129f | 720×720×129f |
| 720p (Recommended) | 720×1280×129f | 1280×720×129f | 1104×832×129f | 832×1104×129f | 960×960×129f |
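As a quick way to check a planned size against the resolution list above, here is a minimal Python sketch. The names `SUPPORTED`, `lookup`, and `is_supported` are hypothetical helpers, not part of ComfyUI. The list does not say which of the two dimensions is width and which is height, so the sketch stores each entry exactly as written and does not assume an orientation.

```python
# Sizes copied from the resolution list, stored as (first, second, frames).
# The source does not label which dimension is width, so none is assumed here.
SUPPORTED = {
    "540p": {
        "9:16": (544, 960, 129),
        "16:9": (960, 544, 129),
        "4:3": (624, 832, 129),
        "3:4": (832, 624, 129),
        "1:1": (720, 720, 129),
    },
    "720p": {
        "9:16": (720, 1280, 129),
        "16:9": (1280, 720, 129),
        "4:3": (1104, 832, 129),
        "3:4": (832, 1104, 129),
        "1:1": (960, 960, 129),
    },
}


def lookup(resolution, ratio):
    """Return the listed (first, second, frames) entry for a tier and ratio."""
    return SUPPORTED[resolution][ratio]


def is_supported(first, second, frames=129):
    """Return True if the exact triple appears anywhere in the list."""
    return any(
        (first, second, frames) == size
        for row in SUPPORTED.values()
        for size in row.values()
    )


print(lookup("720p", "16:9"))
print(is_supported(960, 960))
```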
4. Workflow Node Explanation
4.1 Model Loading Nodes
UNETLoader
Purpose: Load the main model file
Parameters:
Model: hunyuan_video_t2v_720p_bf16.safetensors
Weight Type: default (choose an fp8 option such as fp8_e4m3fn if memory is insufficient)
DualCLIPLoader
Purpose: Load text encoder models
Parameters:
CLIP 1: clip_l.safetensors
CLIP 2: llava_llama3_fp8_scaled.safetensors
Type: hunyuan_video
VAELoader
Purpose: Load the VAE model
Parameters:
VAE Model: hunyuan_video_vae_bf16.safetensors
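To make the three loader nodes concrete, the sketch below writes them as a Python dictionary in the style of a ComfyUI API-format workflow export. The node ids ("1" to "3") are hypothetical, and the input names (`unet_name`, `weight_dtype`, `clip_name1`, `clip_name2`, `type`, `vae_name`) are assumptions about how the export names these fields, so compare them with a workflow exported from your own installation before relying on them.

```python
import json

# Hypothetical node ids; input names are assumed from ComfyUI's API-format
# export and should be verified against your own exported workflow.
loader_nodes = {
    "1": {
        "class_type": "UNETLoader",
        "inputs": {
            "unet_name": "hunyuan_video_t2v_720p_bf16.safetensors",
            "weight_dtype": "default",
        },
    },
    "2": {
        "class_type": "DualCLIPLoader",
        "inputs": {
            "clip_name1": "clip_l.safetensors",
            "clip_name2": "llava_llama3_fp8_scaled.safetensors",
            "type": "hunyuan_video",
        },
    },
    "3": {
        "class_type": "VAELoader",
        "inputs": {"vae_name": "hunyuan_video_vae_bf16.safetensors"},
    },
}

# Print the fragment as JSON so it can be compared with an exported workflow.
print(json.dumps(loader_nodes, indent=2))
```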
4.2 Key Video Generation Nodes
EmptyHunyuanLatentVideo
Purpose: Create an empty video latent to generate into
Parameters:
Width: Video width (e.g., 848)
Height: Video height (e.g., 480)
Frame Count: Number of frames (e.g., 73)
Batch Size: Number of videos generated per run (default 1)
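The width, height, and frame count above determine the size of the latent tensor the sampler works on. The following sketch estimates that shape under assumptions the tutorial does not state: 16 latent channels, 8× spatial compression, and 4× temporal compression with the first frame kept separately. `hunyuan_latent_shape` is a hypothetical helper, not a ComfyUI function.

```python
def hunyuan_latent_shape(width, height, frames, batch_size=1, channels=16):
    """Estimate the latent shape as (batch, channels, latent_frames, h, w).

    Assumes 8x spatial and 4x temporal compression (assumption, not taken
    from the tutorial text).
    """
    if frames < 1:
        raise ValueError("frames must be at least 1")
    latent_frames = (frames - 1) // 4 + 1
    return (batch_size, channels, latent_frames, height // 8, width // 8)


# Example values from the parameter list: 848 x 480, 73 frames.
print(hunyuan_latent_shape(848, 480, 73))  # (1, 16, 19, 60, 106)
```

Under these assumptions, frame counts of the form 4k + 1 (such as 73 or 129) map cleanly onto whole latent frames.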
CLIPTextEncode
Purpose: Encode the text prompt
Parameters:
Text: Positive prompt (describe what you want to generate)
Detailed English descriptions are recommended
FluxGuidance
Purpose: Control generation guidance strength
Parameters:
Guidance Scale: Guidance strength (default 6.0)
Higher values make results follow the prompt more closely but may reduce video quality
KSamplerSelect
Purpose: Select the sampler
Parameters:
Sampler: Sampling method (default euler)
Other options: euler_ancestral, dpmpp_2m, etc.
BasicScheduler
Purpose: Set the sampling scheduler
Parameters:
Scheduler: Scheduling method (default simple)
Steps: Sampling steps (recommended 20-30)
Denoise: Denoising strength (default 1.0)
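The guidance, sampler, and scheduler settings from this section can be kept together and sanity-checked before a run. The sketch below is one way to do that in Python; `SamplingSettings` and its `warnings` method are hypothetical names, and the default of 20 steps is simply the low end of the recommended range.

```python
from dataclasses import dataclass


@dataclass
class SamplingSettings:
    """Hypothetical container for the settings described in this section."""

    guidance: float = 6.0        # FluxGuidance default from the text
    sampler_name: str = "euler"  # KSamplerSelect default
    scheduler: str = "simple"    # BasicScheduler default
    steps: int = 20              # low end of the recommended 20-30 range
    denoise: float = 1.0         # BasicScheduler default

    def warnings(self):
        """Return human-readable notes for values outside the suggested ranges."""
        notes = []
        if not 20 <= self.steps <= 30:
            notes.append(f"steps={self.steps} is outside the recommended 20-30")
        if not 0.0 < self.denoise <= 1.0:
            notes.append(f"denoise={self.denoise} should be in (0, 1]")
        return notes


print(SamplingSettings().warnings())
print(SamplingSettings(steps=10).warnings())
```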
4.3 Video Decoding and Saving Nodes
VAEDecodeTiled
Purpose: Decode the latent video into image frames
Parameters:
Tile Size: 256 (can be reduced if memory is insufficient)
Overlap: 64 (can be reduced if memory is insufficient)
Note: Prefer VAEDecodeTiled over VAEDecode as it’s more memory efficient
SaveAnimatedWEBP
Purpose: Save the generated video as an animated WebP file
Parameters:
Filename Prefix: File name prefix
FPS: Frame rate (default 24)
Lossless: Whether to use lossless compression (default false)
Quality: Compression quality (0-100, default 80)
Method: Compression method (default: default)
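Because the saved clip plays at the FPS set here, its length in seconds is just the frame count divided by the frame rate. A small sketch, with `clip_duration_seconds` as a hypothetical helper:

```python
def clip_duration_seconds(frame_count, fps=24.0):
    """Return the playback length in seconds for a given frame count and FPS."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_count / fps


# 73 frames (the example frame count) and 129 frames (the resolution list value).
print(round(clip_duration_seconds(73), 2))   # 3.04
print(round(clip_duration_seconds(129), 3))  # 5.375
```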
5. Parameter Optimization Tips
5.1 Memory Optimization
If you run into memory issues:
Choose fp8 weight type in UNETLoader
Reduce tile_size and overlap parameters in VAEDecodeTiled
Use lower video resolution and frame count
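The first two memory tips can be sketched as a function that takes a settings dictionary and returns a lighter variant. This is an illustration only: `reduce_memory` is a hypothetical helper, `"fp8_e4m3fn"` is assumed to be one of the fp8 options offered by UNETLoader, and the floors of 64 and 0 are arbitrary safeguards chosen for the example. It does not change resolution or frame count.

```python
def reduce_memory(settings):
    """Return a copy of settings with an fp8 weight type and halved tiling values."""
    lighter = dict(settings)
    # Assumed fp8 option name; check the choices your UNETLoader actually lists.
    lighter["weight_dtype"] = "fp8_e4m3fn"
    # Halve tile size and overlap, with illustrative lower bounds.
    lighter["tile_size"] = max(64, settings["tile_size"] // 2)
    lighter["overlap"] = max(0, settings["overlap"] // 2)
    return lighter


defaults = {"weight_dtype": "default", "tile_size": 256, "overlap": 64}
print(reduce_memory(defaults))
```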
5.2 Generation Quality Optimization
Prompt Optimization
[Subject Description], [Action Description], [Scene Description], [Style Description], [Quality Requirements]
Example: anime style anime girl with massive fennec ears and one big fluffy tail, she has blonde hair long hair blue eyes wearing a pink sweater and a long blue skirt walking in a beautiful outdoor scenery with snow mountains in the background
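The five-part prompt structure can be assembled programmatically. Below is a minimal sketch; `build_prompt` is a hypothetical helper, and the sample strings are illustrative rather than taken from a tested workflow.

```python
def build_prompt(subject, action, scene, style, quality=""):
    """Join the non-empty prompt parts with commas, in the order given."""
    parts = [subject, action, scene, style, quality]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    subject="anime girl with massive fennec ears and a fluffy tail",
    action="walking",
    scene="outdoor scenery with snow mountains in the background",
    style="anime style",
)
print(prompt)
```

The resulting string is what would go into the Text field of CLIPTextEncode.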
Parameter Adjustments
Increase sampling steps (within the recommended 20-30 range) for better quality
Moderately increase Guidance Scale for better prompt adherence
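One way to explore these two adjustments is a small parameter sweep. The sketch below only builds the list of combinations to try and does not call ComfyUI. The specific values, and the idea of holding a fixed `seed` so runs are comparable, are assumptions for illustration.

```python
from itertools import product

steps_options = [20, 25, 30]        # within the recommended 20-30 range
guidance_options = [6.0, 7.0, 8.0]  # starting at the default of 6.0

# Hypothetical run descriptions; the fixed seed is an assumption so that
# differences between runs come from steps and guidance only.
runs = [
    {"steps": steps, "guidance": guidance, "seed": 42}
    for steps, guidance in product(steps_options, guidance_options)
]

for run in runs:
    print(run)
```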