Text-to-Speech
Transformers
Safetensors
fish_qwen3_omni
instruction-following
multilingual
quantized
fp8
comfyui
comfy
multi-turn
multi-speaker
sglang
How to use fwwrsd/drbaph-s2-pro-fp8 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-to-speech", model="fwwrsd/drbaph-s2-pro-fp8")

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("fwwrsd/drbaph-s2-pro-fp8", dtype="auto")
```
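The quantization metadata for this repo lists a compute capability of at least 8.9 for native FP8 matmuls, so it may be worth checking the GPU before loading. The following is a minimal sketch, not part of the repo: `supports_native_fp8` and `MIN_FP8_CAPABILITY` are hypothetical names introduced here, and the threshold is simply the value quoted in the metadata. `torch` is imported only if available.

```python
from typing import Tuple

# Threshold quoted in this repo's metadata ("compute_capability": ">= 8.9").
# The constant and helper names are hypothetical, chosen for illustration.
MIN_FP8_CAPABILITY: Tuple[int, int] = (8, 9)


def supports_native_fp8(capability: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) CUDA compute capability meets the threshold."""
    return tuple(capability) >= MIN_FP8_CAPABILITY


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        # get_device_capability() returns a (major, minor) tuple for the current device.
        cap = torch.cuda.get_device_capability()
        print(cap, supports_native_fp8(cap))
    else:
        print("No CUDA device visible; cannot check FP8 support.")
```

Comparing tuples keeps the check correct across major versions, for example `(9, 0)` passes while `(8, 6)` does not. A GPU below the threshold may still be able to run the model through a dequantized path, but that depends on the loader and is not stated in the source.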
```json
{
  "quantization_method": "torchao_Float8WeightOnly",
  "weight_dtype": "float8_e4m3fn",
  "scale_dtype": "float32",
  "scale_granularity": "per_row",
  "activation_dtype": "bfloat16",
  "torchao_version": "0.16.0",
  "torch_version": "2.9.0+cu128",
  "source_model": "fishaudio/s2-pro",
  "total_params_B": 4.562,
  "fp8_linear_params_B": 4.048,
  "bf16_other_params_B": 0.514,
  "output_size_GB": 6.16,
  "key_format": {
    "<layer_name>": "float8_e4m3fn quantized weight",
    "<layer_name>.scale": "float32 per-row dequantization scale",
    "_buf.<name>": "bf16/fp32 buffer (freqs_cis, causal_mask, etc.)",
    "other": "bfloat16 (embeddings, norms, non-linear layers)"
  },
  "inference_requirements": {
    "torchao": ">= 0.8.0",
    "compute_capability": ">= 8.9 (RTX 4090 / 5090) for native FP8 matmuls"
  },
  "notes": "All nn.Linear weights are float8_e4m3fn. Activations are bfloat16 (weight-only quantization). codec.pth is unchanged bfloat16."
}
```
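The `key_format` entries describe each linear layer as a `float8_e4m3fn` weight plus a float32 per-row dequantization scale. The sketch below illustrates what that pairing could mean numerically, using only the Python standard library. It assumes the convention `weight ≈ fp8_value * scale[row]`, which the source implies by calling the scale a "dequantization scale" but does not spell out; the actual loader may differ. The function names and sample bytes are hypothetical.

```python
import math
from typing import List


def decode_e4m3fn(byte: int) -> float:
    """Decode one float8_e4m3fn bit pattern (0..255) into a Python float.

    Layout: 1 sign bit, 4 exponent bits (bias 7), 3 mantissa bits.
    The "fn" variant has no infinities; exponent 1111 with mantissa 111 is NaN.
    """
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0x0F
    mantissa = byte & 0x07
    if exponent == 0x0F and mantissa == 0x07:
        return math.nan
    if exponent == 0:
        # Subnormal: no implicit leading 1, fixed exponent of 2**-6.
        return sign * (mantissa / 8.0) * 2.0 ** -6
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


def dequantize_per_row(q_rows: List[List[int]], scales: List[float]) -> List[List[float]]:
    """Assumed convention: every element of row i is multiplied by scales[i]."""
    if len(q_rows) != len(scales):
        raise ValueError("expected exactly one scale per weight row")
    return [[decode_e4m3fn(b) * s for b in row] for row, s in zip(q_rows, scales)]


# Hypothetical 2x2 weight stored as raw FP8 bytes, with one scale per row.
q = [[0x38, 0xB8],   # +1.0 and -1.0 in e4m3fn
     [0x7E, 0x00]]   # 448.0 (largest finite value) and 0.0
scales = [0.5, 0.01]
weights = dequantize_per_row(q, scales)
print(weights)
```

Per-row scaling lets each output row use the full FP8 range (up to 448 in magnitude) regardless of how large or small that row's original weights were, at the cost of storing one extra float32 per row.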