Text-to-Video
Wan2.2
English
Chinese
custom
ti2v
text-to-audio-video
audio-video-generation
mmdit
flow-matching
Model configuration for baidu/NAVA (library: Wan2.2):
{
  "patch_size": [1, 2, 2],
  "model_type": "ti2v",
  "dim": 3072,
  "ffn_dim": 14336,
  "freq_dim": 256,
  "num_heads": 24,
  "num_layers": 30,
  "num_double_layers": 10,
  "num_single_layers": 20,
  "vid_in_dim": 48,
  "vid_out_dim": 48,
  "audio_in_dim": 128,
  "audio_out_dim": 128,
  "text_len": 512,
  "window_size": [-1, -1],
  "qk_norm": true,
  "cross_attn_norm": true,
  "eps": 1e-6,
  "temporal_rope_scaling_factor": 0.24
}
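The configuration above can be sanity-checked with a few lines of Python. The sketch below is not part of the baidu/NAVA repository; the helper names `summarize` and `tokens_for_latent` are hypothetical. It assumes that `num_double_layers` and `num_single_layers` together make up `num_layers`, that `dim` is split evenly across `num_heads`, and that `patch_size` is ordered (time, height, width) and tiles the video latent without overlap, as is common in MMDiT-style models. None of these assumptions are confirmed by the config itself.

```python
import json

# The config values are copied verbatim from the JSON above.
CONFIG_JSON = """
{
  "patch_size": [1, 2, 2],
  "model_type": "ti2v",
  "dim": 3072,
  "ffn_dim": 14336,
  "freq_dim": 256,
  "num_heads": 24,
  "num_layers": 30,
  "num_double_layers": 10,
  "num_single_layers": 20,
  "vid_in_dim": 48,
  "vid_out_dim": 48,
  "audio_in_dim": 128,
  "audio_out_dim": 128,
  "text_len": 512,
  "window_size": [-1, -1],
  "qk_norm": true,
  "cross_attn_norm": true,
  "eps": 1e-6,
  "temporal_rope_scaling_factor": 0.24
}
"""


def summarize(cfg: dict) -> dict:
    """Derive a few quantities implied by the config (hypothetical helper)."""
    # Assumption: attention heads split `dim` evenly.
    head_dim, remainder = divmod(cfg["dim"], cfg["num_heads"])
    if remainder:
        raise ValueError("dim is not divisible by num_heads")
    # Assumption: double-stream plus single-stream blocks equal num_layers.
    layers_consistent = (
        cfg["num_double_layers"] + cfg["num_single_layers"] == cfg["num_layers"]
    )
    return {"head_dim": head_dim, "layers_consistent": layers_consistent}


def tokens_for_latent(cfg: dict, frames: int, height: int, width: int) -> int:
    """Count video tokens for a latent of shape (frames, height, width).

    Assumption: patch_size is (time, height, width) and patches do not overlap.
    """
    pt, ph, pw = cfg["patch_size"]
    if frames % pt or height % ph or width % pw:
        raise ValueError("latent shape is not divisible by patch_size")
    return (frames // pt) * (height // ph) * (width // pw)


cfg = json.loads(CONFIG_JSON)
summary = summarize(cfg)
print(summary["head_dim"])           # 3072 / 24 = 128
print(summary["layers_consistent"])  # 10 + 20 == 30
```

Under these assumptions each attention head has 128 channels, and calling `tokens_for_latent(cfg, frames, height, width)` gives a rough sequence-length estimate for a given latent shape before allocating memory for inference.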