---
base_model: THUDM/CogVideoX-2b
library_name: peft
tags:
- lora
- dora
- cogvideox
- physics
- video-generation
- warp
---

# PDW: Physics-Corrected CogVideoX-2b World Model (DoRA Adapter)

A **DoRA (Weight-Decomposed Low-Rank Adaptation)** adapter for [CogVideoX-2b](https://huggingface.co/THUDM/CogVideoX-2b), fine-tuned to generate physically accurate videos. Training used **NVIDIA Warp** physics simulation data and **TRD (Temporal Representation Distillation)** with DINOv2-large as the teacher.

## Model Details

- **Base model:** THUDM/CogVideoX-2b (1.7B params)
- **Adapter:** DoRA (`r=16`, `lora_alpha=32`, `use_dora=True`)
- **Target modules:** `to_q`, `to_k`, `to_v`, `to_out.0`
- **Trainable params:** 7.6M / 1.7B (0.45%)
- **Physics engine:** NVIDIA Warp (28 scenarios, 7×4 grid)
- **TRD teacher:** DINOv2-large
- **Hardware:** NVIDIA H100 NVL
- **Training steps:** 400

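As a rough sketch, the adapter settings listed above map onto a PEFT `LoraConfig` as shown here. Only the four listed values come from this card. Every other option (dropout, bias handling, initialisation) is assumed to be the PEFT default, since the card does not document them. The `peft` import is wrapped in a `try` so the arithmetic still runs without the library installed.

```python
# Adapter hyperparameters as listed in Model Details.
adapter_kwargs = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": ["to_q", "to_k", "to_v", "to_out.0"],
    "use_dora": True,
}

# Standard LoRA scaling factor applied to the low-rank update (alpha / r).
scaling = adapter_kwargs["lora_alpha"] / adapter_kwargs["r"]

# Trainable fraction from the reported counts (7.6M of 1.7B parameters).
trainable_pct = 100 * 7.6e6 / 1.7e9

try:
    from peft import LoraConfig

    # Assumption: every option not listed on this card uses the PEFT default.
    dora_config = LoraConfig(**adapter_kwargs)
except ImportError:
    dora_config = None  # peft is not installed; the numbers above still hold

print(f"scaling = {scaling}, trainable = {trainable_pct:.2f}%")
```

The computed fraction rounds to the 0.45% reported above.
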
## Evaluation Results

| Metric | Delta |
|---|---|
| Diffusion MSE | +94.1% |
| Motion score | +1.7% |
| Overall | +47.9% |

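The card does not state how the Overall row is computed, nor which baseline the deltas are measured against. A quick check, assuming Overall is simply the unweighted mean of the two per-metric deltas, reproduces the reported value:

```python
# Reported deltas from the table, in percent.
deltas = {"Diffusion MSE": 94.1, "Motion score": 1.7}

# Assumption: "Overall" is the unweighted mean of the per-metric deltas.
overall = sum(deltas.values()) / len(deltas)

print(f"Overall: +{overall:.1f}%")
```
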
## How to Use

```python
from peft import PeftModel
from diffusers import CogVideoXTransformer3DModel

# Load base transformer
base_transformer = CogVideoXTransformer3DModel.from_pretrained(
    "THUDM/CogVideoX-2b", subfolder="transformer"
)

# Load DoRA adapter
model = PeftModel.from_pretrained(base_transformer, "athul020/pdw_final_dora")
```

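To sample a video, the adapted transformer has to sit inside a full pipeline. The following is a sketch, not a tested recipe. It assumes a CUDA GPU and merges the DoRA weights into the base transformer with `merge_and_unload()`. The sampling settings (50 steps, 49 frames, guidance scale 6.0, 8 fps) follow common CogVideoX-2b usage and do not come from this card. The function name `generate_physics_video` and the output path are hypothetical.

```python
def generate_physics_video(prompt, output_path="pdw_sample.mp4", device="cuda"):
    """Generate one clip with the DoRA adapter merged into CogVideoX-2b."""
    # Heavy imports stay inside the function so defining it is cheap.
    import torch
    from diffusers import CogVideoXPipeline, CogVideoXTransformer3DModel
    from diffusers.utils import export_to_video
    from peft import PeftModel

    transformer = CogVideoXTransformer3DModel.from_pretrained(
        "THUDM/CogVideoX-2b", subfolder="transformer", torch_dtype=torch.float16
    )
    # Attach the adapter, then fold it into the base weights for inference.
    transformer = PeftModel.from_pretrained(transformer, "athul020/pdw_final_dora")
    transformer = transformer.merge_and_unload()

    pipe = CogVideoXPipeline.from_pretrained(
        "THUDM/CogVideoX-2b", transformer=transformer, torch_dtype=torch.float16
    ).to(device)

    # Assumed sampling settings; tune them for your use case.
    frames = pipe(
        prompt=prompt,
        num_inference_steps=50,
        num_frames=49,
        guidance_scale=6.0,
    ).frames[0]
    export_to_video(frames, output_path, fps=8)
    return output_path
```

Calling `generate_physics_video("A steel ball rolls down a wooden ramp and bounces off a wall")` would write `pdw_sample.mp4`; the prompt is only an illustration.
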
### Framework versions

- PEFT 0.18.1