---
license: mit
base_model: vibevoice/VibeVoice-7B
tags:
- tts
- text-to-speech
- speech-synthesis
- norwegian
- bokmal
language:
- "no"
- nb
---

# Prat-9B (preview)

A Norwegian (Bokmål) text-to-speech model fine-tuned for the Østnorsk/Oslo dialect.

This model is currently in preview, so expect occasional artefacts. In our informal, qualitative testing it generally outperforms VibeVoice-7B; we have not run a rigorous evaluation.

## Usage

```python
from transformers import AutoProcessor, AutoModel
import torch

# Load the processor and the model weights (bfloat16 to reduce memory use)
processor = AutoProcessor.from_pretrained("heiertech/Prat-9B")
model = AutoModel.from_pretrained("heiertech/Prat-9B", torch_dtype=torch.bfloat16)

# Generate speech
text = "Hei, dette er en test av den norske stemmen."
inputs = processor(text=text, return_tensors="pt")
outputs = model.generate(**inputs)
```

## Base Model

This model is based on [VibeVoice-7B](https://huggingface.co/vibevoice/VibeVoice-7B). Note that, despite the name, VibeVoice-7B is actually a 9B-parameter model. The "7B" refers only to the size of the LLM backbone, which is based on Qwen2.5 7B.

## Acknowledgments

- Base model: [vibevoice/VibeVoice-7B](https://huggingface.co/vibevoice/VibeVoice-7B)
- Training data: Mozilla Common Voice Norwegian
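
## Checking the parameter count

The Base Model section notes that the full model has roughly 9B parameters even though the backbone is 7B. A minimal sketch of how one might verify this locally is shown below. The helper names `count_parameters` and `format_billions` are hypothetical and not part of this repository; the sketch only assumes that the loaded object behaves like a standard `torch.nn.Module`, exposing `parameters()` whose items have `numel()`.

```python
def count_parameters(model) -> int:
    """Sum the element counts of every parameter tensor.

    Assumes `model` behaves like a torch.nn.Module, i.e. it exposes
    `parameters()` and each parameter exposes `numel()`.
    """
    return sum(p.numel() for p in model.parameters())


def format_billions(n: int) -> str:
    """Render a raw parameter count in billions, e.g. for a quick printout."""
    return f"{n / 1e9:.2f}B"
```

With the `model` object from the Usage section already loaded, `print(format_billions(count_parameters(model)))` should report the total size. We do not state the exact figure here, since it depends on which components the checkpoint loads.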