metadata
license: mit
base_model: vibevoice/VibeVoice-7B
tags:
- tts
- text-to-speech
- speech-synthesis
- norwegian
- bokmal
language:
- 'no'
- nb
Prat-9B (preview)
A Norwegian (Bokmål) text-to-speech model fine-tuned for the Østnorsk/Oslo dialect. This model is currently in preview, so you can expect occasional audio artefacts. In our informal, qualitative testing it generally outperforms VibeVoice 7B on Norwegian speech; we have not run a systematic evaluation.
Usage
import torch
from transformers import AutoProcessor, AutoModel
# Load the processor and the model weights in bfloat16
processor = AutoProcessor.from_pretrained("heiertech/Prat-9B")
model = AutoModel.from_pretrained("heiertech/Prat-9B", torch_dtype=torch.bfloat16)
# Generate speech
text = "Hei, dette er en test av den norske stemmen."
inputs = processor(text=text, return_tensors="pt")
outputs = model.generate(**inputs)
Base Model
This model is based on VibeVoice-7B. Note that despite the name, VibeVoice-7B is actually a 9B-parameter model. The "7B" refers only to the size of the LLM backbone, which is based on Qwen2.5 7B.
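The difference between 7B and 9B matters mostly when sizing hardware. As a rough sketch, the memory needed for the weights alone can be estimated from the parameter count and the dtype. The helper name `weight_memory_gib` is hypothetical, and the round figures of 7e9 and 9e9 parameters are approximations taken from the description above, not measured counts. Activations and other runtime overhead are not included.

```python
# Bytes per parameter for common floating-point dtypes (standard sizes).
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gib(num_params: float, dtype: str = "bfloat16") -> float:
    """Approximate memory for the weights alone, in GiB (hypothetical helper)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    # Assumed round parameter counts: ~7B for the LLM backbone, ~9B in total.
    for label, n in [("LLM backbone (~7B)", 7e9), ("full model (~9B)", 9e9)]:
        print(f"{label}: ~{weight_memory_gib(n):.1f} GiB in bfloat16")
```

Under these assumptions, planning for the 7B figure would underestimate the bfloat16 weight footprint by several GiB, so the 9B total is the safer number to budget against.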
Acknowledgments
- Base model: vibevoice/VibeVoice-7B
- Training data: Mozilla Common Voice Norwegian