# Ovi FusionModel - FP8 Quantized
This is the Ovi FusionModel quantized to FP8 (`e4m3_e4m3_dynamic_per_tensor`) for faster inference and a smaller memory footprint.
## Quantization Details
- **Video Model Blocks**: 30 blocks quantized
- **Audio Model Blocks**: 30 blocks quantized
- **Attention/FFN layers**: `e4m3_e4m3_dynamic_per_tensor` (FP8 weights and activations; activation scales computed dynamically, one per tensor)
- **Other layers**: `e4m3_weightonly` (FP8 weights only; activations stay in the compute dtype)
## Usage
```python
import os
import sys

import torch
from huggingface_hub import hf_hub_download
from omegaconf import OmegaConf

# Point at a local checkout of the Ovi repository
OVI_PATH = "./workspace/Ovi"
sys.path.insert(0, OVI_PATH)
os.chdir(OVI_PATH)

from ovi.ovi_fusion_engine import OviFusionEngine

# Download the quantized weights
model_path = hf_hub_download(
    repo_id="wavespeed/Ovi-e4m3_e4m3_dynamic_per_tensor",
    filename="model.pth",
)

config = OmegaConf.load("config.yaml")
engine = OviFusionEngine(config=config, device="cuda", target_dtype=torch.bfloat16)

# Load quantized weights; the model is already quantized and ready for inference
engine.model.load_state_dict(torch.load(model_path))
```
## Model Card
- **Developed by**: Alibaba/Character.AI
- **Model type**: Video + Audio generation (FusionModel)
- **Quantization**: FP8 (e4m3_e4m3_dynamic_per_tensor)
- **License**: See the [original Ovi repository](https://github.com/character-ai/Ovi)
## Original Model
Based on [Ovi](https://github.com/character-ai/Ovi)