# Ovi FusionModel - FP8 Quantized

This is the Ovi FusionModel quantized to FP8 (e4m3_e4m3_dynamic_per_tensor) for faster inference.
## Quantization Details

- **Video Model Blocks**: 30 blocks quantized
- **Audio Model Blocks**: 30 blocks quantized
- **Attention/FFN layers**: e4m3_e4m3_dynamic_per_tensor
- **Other layers**: e4m3_weightonly
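To make the scheme names above concrete: "e4m3" is an 8-bit float format with 4 exponent and 3 mantissa bits (max finite value 448), "dynamic" means the scale is recomputed from each tensor at runtime rather than calibrated offline, and "per-tensor" means a single scalar scale for the whole tensor. The sketch below is a conceptual illustration of that round-trip, not the Ovi quantizer itself; real kernels cast to a hardware FP8 dtype instead of rounding in Python.

```python
import math

E4M3_MAX = 448.0  # largest finite float8-e4m3 value

def round_to_e4m3(v: float) -> float:
    """Round a value already inside [-448, 448] to the nearest e4m3 grid
    point (subnormals ignored for simplicity -- a conceptual sketch)."""
    if v == 0.0:
        return 0.0
    sign = -1.0 if v < 0 else 1.0
    m = abs(v)
    e = math.floor(math.log2(m))  # binade containing the value
    step = 2.0 ** (e - 3)         # 3 mantissa bits -> 8 steps per binade
    return sign * min(round(m / step) * step, E4M3_MAX)

def quantize_dynamic_per_tensor(xs):
    """One scale for the whole tensor, recomputed on every call ("dynamic")."""
    amax = max(abs(x) for x in xs)
    scale = amax / E4M3_MAX if amax else 1.0
    q = [round_to_e4m3(x / scale) for x in xs]  # values stored in FP8 range
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

q, scale = quantize_dynamic_per_tensor([0.5, -3.2, 120.0])
print(dequantize(q, scale))
```

With 3 mantissa bits the round-trip error is bounded by about 1/16 of each value's magnitude, which is why FP8 is typically restricted to the attention/FFN matmuls while more sensitive layers stay weight-only quantized.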
## Usage

```python
import sys
import os

import torch
from omegaconf import OmegaConf
from huggingface_hub import hf_hub_download

# Make the Ovi repo importable (adjust OVI_PATH to your checkout)
OVI_PATH = "./workspace/Ovi"
sys.path.insert(0, OVI_PATH)
os.chdir(OVI_PATH)

from ovi.ovi_fusion_engine import OviFusionEngine

# Download the quantized weights from the Hub
model_path = hf_hub_download(
    repo_id="wavespeed/Ovi-e4m3_e4m3_dynamic_per_tensor",
    filename="model.pth",
)

config = OmegaConf.load("config.yaml")
engine = OviFusionEngine(config=config, device="cuda", target_dtype=torch.bfloat16)

# Load the quantized weights; they are already in FP8, so no further
# quantization step is needed before inference.
engine.model.load_state_dict(torch.load(model_path, map_location="cpu"))
```
## Model Card

- **Developed by**: Alibaba/Character.AI
- **Model type**: Video + Audio generation (FusionModel)
- **Quantization**: FP8 (e4m3_e4m3_dynamic_per_tensor)
- **License**: Check the original Ovi repository

## Original Model

Based on [Ovi](https://github.com/character-ai/Ovi)