OmniVoice BR-PT v1.5
OmniVoice BR-PT v1.5 is the selected Brazilian Portuguese fine-tune of k2-fsa/OmniVoice. This release ships checkpoint-9000 from the v1.5 gold-subset refinement run.
It is optimized for Brazilian Portuguese TTS experiments, voice-design, and voice-cloning workflows using OmniVoice.
Why This Checkpoint
We evaluated base OmniVoice, the first BR-PT run (v1), and the v1.5 refinement checkpoints on a 20-prompt Brazilian Portuguese ASR intelligibility set. For both WER (word error rate) and CER (character error rate), lower is better.
| Model | WER | CER |
|---|---|---|
| base OmniVoice | 0.0456 | 0.0400 |
| v1 checkpoint-19000 | 0.0456 | 0.0400 |
| v1.5 checkpoint-5000 | 0.0400 | 0.0379 |
| v1.5 checkpoint-8000 | 0.0463 | 0.0404 |
| v1.5 checkpoint-9000 | 0.0400 | 0.0379 |
| v1.5 checkpoint-10000 | 0.0471 | 0.0390 |
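For readers unfamiliar with the metrics, the following is a minimal sketch of how WER and CER are conventionally computed from a reference prompt and an ASR transcript, using Levenshtein edit distance. It is not the evaluation script behind the table: the ASR model, the text normalization, and whether scores were averaged per prompt or pooled over the whole set are not stated here, and the function names and the example transcript are illustrative assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: the ASR transcript drops one word of the prompt.
reference = "bom dia a todos"
hypothesis = "bom dia todos"
print(wer(reference, hypothesis))  # 0.25 (1 deleted word out of 4)
print(cer(reference, hypothesis))  # 2 deleted characters out of 15
```

In practice both strings are usually lowercased and stripped of punctuation before scoring; that step is omitted here because the normalization used for the table is unknown.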
checkpoint-9000 was selected because it tied with checkpoint-5000 for the lowest WER and CER while coming later in the refinement run.
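The selection rule described above can be sketched as a sort key: lowest WER, then lowest CER, then the latest step among ties. This is one reading of that sentence, not a confirmed selection script; the tuple layout and the `select_checkpoint` name are assumptions, and the numbers are copied from the table.

```python
# (label, refinement step, WER, CER) for the v1.5 refinement checkpoints.
results = [
    ("v1.5 checkpoint-5000", 5000, 0.0400, 0.0379),
    ("v1.5 checkpoint-8000", 8000, 0.0463, 0.0404),
    ("v1.5 checkpoint-9000", 9000, 0.0400, 0.0379),
    ("v1.5 checkpoint-10000", 10000, 0.0471, 0.0390),
]


def select_checkpoint(rows):
    """Pick the row with the lowest WER, then lowest CER, then highest step."""
    return min(rows, key=lambda row: (row[2], row[3], -row[1]))


best = select_checkpoint(results)
print(best[0])  # v1.5 checkpoint-9000
```

Negating the step in the key is what breaks the tie in favor of the later checkpoint; without it, `min` would return checkpoint-5000, the first of the two tied rows.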
Samples
Reference audio was provided for sample generation and is included for reproducibility:
01 Welcome
Prompt: Bom dia. Esta é uma demonstração em português brasileiro com voz clara e natural.
02 Creator
Prompt: Hoje eu vou mostrar uma novidade rápida, simples e muito fácil de acompanhar.
03 Brazil
Prompt: São Paulo amanheceu com chuva, mas a cidade continuou cheia de energia.
04 Food
Prompt: Eu gostaria de um café sem açúcar e um pão de queijo bem quentinho, por favor.
06 Product
Prompt: Este produto foi feito para criadores que precisam gravar conteúdos todos os dias.
07 Numbers
Prompt: O preço final ficou em cinquenta e três reais, com entrega para amanhã de manhã.
08 Story
Prompt: No fim da tarde, a música começou baixinho e todo mundo ficou em silêncio para ouvir.
Training Details
- Base model: k2-fsa/OmniVoice
- Initial v1 checkpoint used for refinement: checkpoint-19000
- Selected release checkpoint: v1.5 checkpoint-9000
- Dataset: edwixx/brazilian-portuguese-TTS
- Gold refine set: 12,014 train rows, 260 dev rows after filtering
- Speakers in selected gold subset: 40
- Language ID used for OmniVoice compatibility: pt
- Locale metadata: pt-BR
- Instruction used in refine data/samples: portuguese accent
- Refine LR: 5e-6
- Refine steps: 10,000
- Selected checkpoint: 9,000
- Final refine eval loss at step 10,000: 3.93625
Usage
omnivoice-infer --model edwixx/omnivoice-brpt-v15 --text "Bom dia. Esta é uma demonstração em português brasileiro." --language pt --instruct "portuguese accent" --output brpt_v15.wav
Voice cloning example:
omnivoice-infer --model edwixx/omnivoice-brpt-v15 --text "Hoje eu vou mostrar uma novidade rápida e fácil de acompanhar." --language pt --instruct "portuguese accent" --ref_audio reference.wav --output brpt_clone.wav
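To run several prompts in a row, the two commands above can be wrapped in a small script. The sketch below only assembles the same `omnivoice-infer` flags shown above (`--model`, `--text`, `--language`, `--instruct`, `--ref_audio`, `--output`). The helper names, the output file names, and the dry-run default are assumptions made for illustration, and the script assumes `omnivoice-infer` is on your `PATH`.

```python
import shlex
import subprocess

MODEL_ID = "edwixx/omnivoice-brpt-v15"


def build_command(text, output, ref_audio=None, model=MODEL_ID,
                  language="pt", instruct="portuguese accent"):
    """Return the omnivoice-infer argument list for one prompt."""
    cmd = [
        "omnivoice-infer",
        "--model", model,
        "--text", text,
        "--language", language,
        "--instruct", instruct,
    ]
    if ref_audio is not None:
        # Voice cloning: pass the reference recording, as in the example above.
        cmd += ["--ref_audio", ref_audio]
    cmd += ["--output", output]
    return cmd


def synthesize(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


# Two of the sample prompts from this card; output names are hypothetical.
prompts = {
    "01_welcome": "Bom dia. Esta é uma demonstração em português brasileiro "
                  "com voz clara e natural.",
    "04_food": "Eu gostaria de um café sem açúcar e um pão de queijo bem "
               "quentinho, por favor.",
}

if __name__ == "__main__":
    for name, text in prompts.items():
        synthesize(build_command(text, f"{name}.wav"))
```

Passing the arguments as a list avoids shell quoting problems with accented characters and punctuation in the prompts. Call `synthesize(cmd, dry_run=False)` to actually generate audio, and pass `ref_audio="reference.wav"` to `build_command` for the cloning variant.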
Limitations
This is an experimental OmniVoice fine-tune. The model uses OmniVoice's generic Portuguese language code (pt) internally; Brazilian Portuguese behavior comes from the fine-tuning data, the gold subset, and consistent prompting. Human listening is still required to judge accent, naturalness, and voice-cloning quality.
Do not use this model for impersonation, deception, fraud, harassment, or cloning voices without consent.
Files
This repo intentionally excludes optimizer and random-state files to keep the release cleaner. Reports are in reports/, and audio examples are in samples/.
Attribution
- Base model/code: k2-fsa/OmniVoice
- Dataset: edwixx/brazilian-portuguese-TTS
Evaluation results
- Post-refine ASR WER, 20-prompt set on edwixx/brazilian-portuguese-TTS (self-reported): 0.040
- Post-refine ASR CER, 20-prompt set on edwixx/brazilian-portuguese-TTS (self-reported): 0.038