KevinNg99 committed on
Commit 21d6fbc · 1 Parent(s): 0e916b7

update README

Files changed (2)
  1. README.md +29 -1
  2. README_CN.md +29 -1
README.md CHANGED
@@ -57,7 +57,8 @@ HunyuanVideo-1.5 is a video generation model that delivers top-tier quality with
  </p>

  ## 🔥🔥🔥 News
- * 🚀 Dec 05, 2025: **New Release**: We now release the [480p I2V step-distilled model](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v_step_distilled), which generates videos in 8 or 12 steps (recommended)! On RTX 4090, end-to-end generation time is reduced by 75%, and a single RTX 4090 can generate videos within 75 seconds. The step-distilled model maintains comparable quality to the original model while achieving significant speedup. See [Step Distillation Comparison](./assets/step_distillation_comparison.md) for detailed quality comparisons. For even faster generation, you can also try 4 steps (faster speed with slightly reduced quality). **To enable the step-distilled model, run `generate.py` with the `--enable_step_distill` parameter.** See [Usage](#-usage) for detailed usage instructions. 🔥🔥🔥🆕
 
  * 📚 Dec 05, 2025: **Training Code Released**: We now open-source the training code for HunyuanVideo-1.5! The training script (`train.py`) provides a full training pipeline with support for distributed training, FSDP, context parallel, gradient checkpointing, and more. HunyuanVideo-1.5 is trained using the Muon optimizer, which we have open-sourced in the [Training](#-training) section. **If you would like to continue training our model or fine-tune it with LoRA, please use the Muon optimizer.** See [Training](#-training) section for detailed usage instructions. 🔥🔥🔥🆕
  * 🎉 **Diffusers Support**: HunyuanVideo-1.5 is now available on Hugging Face Diffusers! Check out [Diffusers collection](https://huggingface.co/collections/hunyuanvideo-community/hunyuanvideo-15) for easy integration. 🔥🔥🔥🆕
  * 🚀 Nov 27, 2025: We now support cache inference (deepcache, teacache, taylorcache), achieving significant speedup! Pull the latest code to try it. 🔥🔥🔥🆕
@@ -498,6 +499,11 @@ torchrun --nproc_per_node=8 train.py \
  | `--i2v_prob` | Probability of i2v task for video data | 0.3 |
  | `--use_muon` | Use Muon optimizer | true |
  | `--resume_from_checkpoint` | Resume from checkpoint directory | None |

  #### 4. Monitor Training

@@ -514,6 +520,28 @@ python train.py \
  --resume_from_checkpoint ./outputs/checkpoint-1000
  ```


  ## 📊 Evaluation

  </p>

  ## 🔥🔥🔥 News
+ * 🚀 Dec 09, 2025: LoRA tuning script is released, enjoy it! 🔥🔥🔥🆕
+ * 🚀 Dec 05, 2025: **New Release**: We now release the [480p I2V step-distilled model](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v_step_distilled), which generates videos in 8 or 12 steps (recommended)! On RTX 4090, end-to-end generation time is reduced by 75%, and a single RTX 4090 can generate videos within **75 seconds**. The step-distilled model maintains comparable quality to the original model while achieving significant speedup. See [Step Distillation Comparison](./assets/step_distillation_comparison.md) for detailed quality comparisons. For even faster generation, you can also try 4 steps (faster speed with slightly reduced quality). **To enable the step-distilled model, run `generate.py` with the `--enable_step_distill` parameter.** See [Usage](#-usage) for detailed usage instructions. 🔥🔥🔥🆕
  * 📚 Dec 05, 2025: **Training Code Released**: We now open-source the training code for HunyuanVideo-1.5! The training script (`train.py`) provides a full training pipeline with support for distributed training, FSDP, context parallel, gradient checkpointing, and more. HunyuanVideo-1.5 is trained using the Muon optimizer, which we have open-sourced in the [Training](#-training) section. **If you would like to continue training our model or fine-tune it with LoRA, please use the Muon optimizer.** See [Training](#-training) section for detailed usage instructions. 🔥🔥🔥🆕
  * 🎉 **Diffusers Support**: HunyuanVideo-1.5 is now available on Hugging Face Diffusers! Check out [Diffusers collection](https://huggingface.co/collections/hunyuanvideo-community/hunyuanvideo-15) for easy integration. 🔥🔥🔥🆕
  * 🚀 Nov 27, 2025: We now support cache inference (deepcache, teacache, taylorcache), achieving significant speedup! Pull the latest code to try it. 🔥🔥🔥🆕
 
  | `--i2v_prob` | Probability of i2v task for video data | 0.3 |
  | `--use_muon` | Use Muon optimizer | true |
  | `--resume_from_checkpoint` | Resume from checkpoint directory | None |
+ | `--use_lora` | Enable LoRA fine-tuning | false |
+ | `--lora_r` | LoRA rank | 8 |
+ | `--lora_alpha` | LoRA alpha scaling parameter | 16 |
+ | `--lora_dropout` | LoRA dropout rate | 0.0 |
+ | `--pretrained_lora_path` | Path to pretrained LoRA adapter | None |

  #### 4. Monitor Training

 
  --resume_from_checkpoint ./outputs/checkpoint-1000
  ```

+ #### 6. LoRA Fine-tuning
+
+ To enable LoRA fine-tuning, add `--use_lora` to your training command. LoRA adapters will be saved in the checkpoint directory under `lora/`:
+
+ ```bash
+ torchrun --nproc_per_node=8 train.py \
+ --pretrained_model_root ./ckpts \
+ --use_lora \
+ --lora_r 8 \
+ --lora_alpha 16 \
+ --learning_rate 1e-4 \
+ --output_dir ./outputs
+ ```
+
+ To load a pretrained LoRA adapter, use `--pretrained_lora_path`:
+ ```bash
+ torchrun --nproc_per_node=8 train.py \
+ --pretrained_model_root ./ckpts \
+ --use_lora \
+ --pretrained_lora_path ./outputs/checkpoint-1000/lora/default
+ ```
+

  ## 📊 Evaluation
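
The load example added in this diff hard-codes `./outputs/checkpoint-1000/lora/default`. The sketch below shows one way to pick the newest adapter instead. It assumes that every checkpoint follows the `checkpoint-<step>/lora/default` layout seen in that example path, which the diff does not guarantee; the function name `latest_lora_adapter` is hypothetical and is not part of the repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the LoRA adapter directory of the newest checkpoint.
# Assumes the layout <output_dir>/checkpoint-<step>/lora/default (taken from the
# example path in the diff; not confirmed for every run).
latest_lora_adapter() {
  local output_dir="$1" best_step=-1 best_dir="" dir step
  for dir in "$output_dir"/checkpoint-*/lora/default; do
    [ -d "$dir" ] || continue                 # skip when the glob matches nothing
    step="${dir#"$output_dir"/checkpoint-}"   # strip everything up to the step number
    step="${step%%/*}"                        # drop the trailing /lora/default
    case "$step" in ''|*[!0-9]*) continue ;; esac   # ignore non-numeric suffixes
    if [ "$step" -gt "$best_step" ]; then
      best_step="$step"
      best_dir="$dir"
    fi
  done
  # Non-zero exit status when no adapter directory was found.
  [ -n "$best_dir" ] && printf '%s\n' "$best_dir"
}

# Possible usage with the flags shown in the diff:
#   adapter="$(latest_lora_adapter ./outputs)" && \
#     torchrun --nproc_per_node=8 train.py \
#       --pretrained_model_root ./ckpts \
#       --use_lora \
#       --pretrained_lora_path "$adapter"
```

Comparing step numbers numerically, rather than sorting directory names, keeps `checkpoint-1000` ahead of `checkpoint-900`.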
README_CN.md CHANGED
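@@ -40,7 +40,8 @@ As a lightweight video generation model, HunyuanVideo-1.5 needs only 8.3 billion parameters to
  </p>

  ## 🔥🔥🔥 News
- * 🚀 Dec 05, 2025: **New Model Release**: We have released the [480p I2V step-distilled model](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v_step_distilled); we recommend generating videos with 8 or 12 steps! On RTX 4090, end-to-end generation time is reduced by 75%, and a single RTX 4090 can generate a video within 75 seconds. The step-distilled model achieves a significant speedup while maintaining quality comparable to the original model. See the [step distillation comparison document](./assets/step_distillation_comparison.md) for a detailed quality comparison. For even faster generation, you can also try 4-step inference (faster, with slightly reduced quality). **To enable the step-distilled model, run `generate.py` with the `--enable_step_distill` parameter.** See [Usage](#-使用方法) for detailed usage instructions. 🔥🔥🔥🆕
  * 📚 Dec 05, 2025: **Training Code Released**: We have open-sourced the complete training code for HunyuanVideo-1.5! The training script (`train.py`) provides a complete training pipeline with support for distributed training, FSDP, context parallel, gradient checkpointing, and more. HunyuanVideo-1.5 is trained with the Muon optimizer, which we have open-sourced in the [Training](#-训练) section. **If you want to continue training our model or fine-tune it with LoRA, please use the Muon optimizer.** See the [Training](#-训练) section for detailed usage instructions. 🔥🔥🔥🆕
  * 🎉 **Diffusers Support**: HunyuanVideo-1.5 now supports Hugging Face Diffusers! Check out our [Diffusers collection](https://huggingface.co/collections/hunyuanvideo-community/hunyuanvideo-15) for easy integration. 🔥🔥🔥🆕
  * 🚀 Nov 27, 2025: We now support cache inference (deepcache, teacache, taylorcache), which greatly accelerates inference! Pull the latest code to try it. 🔥🔥🔥🆕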
@@ -481,6 +482,11 @@ torchrun --nproc_per_node=8 train.py \
  | `--i2v_prob` | Probability of using the i2v task for video data | 0.3 |
  | `--use_muon` | Use the Muon optimizer | true |
  | `--resume_from_checkpoint` | Resume training from a checkpoint directory | None |

  #### 4. Monitor Training

@@ -497,6 +503,28 @@ python train.py \
  --resume_from_checkpoint ./outputs/checkpoint-1000
  ```


  ## 📊 Performance Evaluation
  ### Scores
 
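  </p>

  ## 🔥🔥🔥 News
+ * 🚀 Dec 09, 2025: The LoRA fine-tuning script has been released; feel free to try it! 🔥🔥🔥🆕
+ * 🚀 Dec 05, 2025: **New Model Release**: We have released the [480p I2V step-distilled model](https://huggingface.co/tencent/HunyuanVideo-1.5/tree/main/transformer/480p_i2v_step_distilled); we recommend generating videos with 8 or 12 steps! On RTX 4090, end-to-end generation time is reduced by 75%, and a single RTX 4090 can generate a video within **75 seconds**. The step-distilled model achieves a significant speedup while maintaining quality comparable to the original model. See the [step distillation comparison document](./assets/step_distillation_comparison.md) for a detailed quality comparison. For even faster generation, you can also try 4-step inference (faster, with slightly reduced quality). **To enable the step-distilled model, run `generate.py` with the `--enable_step_distill` parameter.** See [Usage](#-使用方法) for detailed usage instructions. 🔥🔥🔥🆕
  * 📚 Dec 05, 2025: **Training Code Released**: We have open-sourced the complete training code for HunyuanVideo-1.5! The training script (`train.py`) provides a complete training pipeline with support for distributed training, FSDP, context parallel, gradient checkpointing, and more. HunyuanVideo-1.5 is trained with the Muon optimizer, which we have open-sourced in the [Training](#-训练) section. **If you want to continue training our model or fine-tune it with LoRA, please use the Muon optimizer.** See the [Training](#-训练) section for detailed usage instructions. 🔥🔥🔥🆕
  * 🎉 **Diffusers Support**: HunyuanVideo-1.5 now supports Hugging Face Diffusers! Check out our [Diffusers collection](https://huggingface.co/collections/hunyuanvideo-community/hunyuanvideo-15) for easy integration. 🔥🔥🔥🆕
  * 🚀 Nov 27, 2025: We now support cache inference (deepcache, teacache, taylorcache), which greatly accelerates inference! Pull the latest code to try it. 🔥🔥🔥🆕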
 
  | `--i2v_prob` | Probability of using the i2v task for video data | 0.3 |
  | `--use_muon` | Use the Muon optimizer | true |
  | `--resume_from_checkpoint` | Resume training from a checkpoint directory | None |
+ | `--use_lora` | Enable LoRA fine-tuning | false |
+ | `--lora_r` | LoRA rank | 8 |
+ | `--lora_alpha` | LoRA alpha scaling parameter | 16 |
+ | `--lora_dropout` | LoRA dropout rate | 0.0 |
+ | `--pretrained_lora_path` | Path to the pretrained LoRA adapter | None |

  #### 4. Monitor Training

 
  --resume_from_checkpoint ./outputs/checkpoint-1000
  ```

+ #### 6. LoRA Fine-tuning
+
+ To enable LoRA fine-tuning, add `--use_lora` to your training command. LoRA adapters are saved in the `lora/` subdirectory of the checkpoint directory:
+
+ ```bash
+ torchrun --nproc_per_node=8 train.py \
+ --pretrained_model_root ./ckpts \
+ --use_lora \
+ --lora_r 8 \
+ --lora_alpha 16 \
+ --learning_rate 1e-4 \
+ --output_dir ./outputs
+ ```
+
+ To load a pretrained LoRA adapter, use `--pretrained_lora_path`:
+ ```bash
+ torchrun --nproc_per_node=8 train.py \
+ --pretrained_model_root ./ckpts \
+ --use_lora \
+ --pretrained_lora_path ./outputs/checkpoint-1000/lora/default
+ ```
+

  ## 📊 Performance Evaluation
  ### Scores
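
The `--lora_r` and `--lora_alpha` defaults in the argument table can be related through the standard LoRA formulation, in which the low-rank update is scaled by `alpha / r` and each adapted linear layer adds `r * (d_in + d_out)` trainable parameters. Whether `train.py` uses exactly this convention is an assumption, since the diff does not show the implementation. The helper names and the 2048 x 2048 layer size below are hypothetical and chosen only for illustration.

```bash
#!/usr/bin/env bash
# Sketch under the standard LoRA formulation (assumed, not confirmed by the diff):
#   W' = W + (alpha / r) * B A, with B of shape d_out x r and A of shape r x d_in.

lora_scale() {              # usage: lora_scale <lora_alpha> <lora_r>
  awk -v a="$1" -v r="$2" 'BEGIN { printf "%g\n", a / r }'
}

lora_trainable_params() {   # usage: lora_trainable_params <d_in> <d_out> <lora_r>
  echo $(( $3 * ($1 + $2) ))
}

# Defaults from the argument table: --lora_alpha 16, --lora_r 8.
lora_scale 16 8                      # prints 2
# Hypothetical 2048 x 2048 linear layer, for illustration only.
lora_trainable_params 2048 2048 8    # prints 32768
echo $(( 2048 * 2048 ))              # full weight matrix for comparison: 4194304
```

Under these assumptions, doubling `--lora_r` while leaving `--lora_alpha` unchanged halves the scale applied to the update, which is one reason the two values are often adjusted together.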