---
language:
- en
tags:
- whisper
- speech-recognition
- trinidadian-creole
- accent
- asr
license: mit
model-index:
- name: Accento V2.0
  results:
  - task:
      type: automatic-speech-recognition
    metrics:
    - name: WER
      type: wer
      value: 19.94
    - name: CER
      type: cer
      value: 10.00
---

# Accento V2.0 - Trinidadian Creole English ASR

Accento V2.0 is a fine-tuned Whisper Large V3 Turbo model optimized for Trinidadian Creole English.

## Performance

- **WER**: 19.94% (with `beam_size=3`)
- **CER**: 10.00%
- **54% better** than base Whisper
- **27% better** than Accento V1.0
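
For readers unfamiliar with the two metrics, the sketch below shows how word error rate (WER) and character error rate (CER) are conventionally defined: the edit distance between reference and hypothesis, divided by the reference length. It is an illustration only. The function names are hypothetical, and the text normalization and scoring tool behind the reported numbers are not stated here, so this will not necessarily reproduce them.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One dropped word out of five reference words
print(wer("we going down the road", "we going down road"))  # 0.2
```

Multiplying the result by 100 gives the percentage form used in the list above.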

## Usage

```python
from accento import AccentoTranscriber

# Load the model; auto-downloads from Hugging Face if not found locally
transcriber = AccentoTranscriber(model_path="models/accento-v2.0")

# Transcribe a single audio file and print the recognized text
result = transcriber.transcribe("audio.wav")
print(result.text)
```
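
To transcribe several recordings, the single-file call can be wrapped in a small loop. The helper below is a sketch, not part of the `accento` package: `transcribe_folder` is a hypothetical name, and it relies only on the behavior shown in the usage snippet, namely a `transcribe(path)` method whose result has a `.text` attribute.

```python
from pathlib import Path


def transcribe_folder(transcriber, folder, pattern="*.wav"):
    """Transcribe every file in `folder` matching `pattern`, in sorted order.

    `transcriber` is any object with a `transcribe(path)` method whose
    result exposes a `.text` attribute (an assumption based on the
    single-file usage example).
    """
    transcripts = {}
    for audio_path in sorted(Path(folder).glob(pattern)):
        result = transcriber.transcribe(str(audio_path))
        transcripts[audio_path.name] = result.text
    return transcripts
```

With a transcriber created as in the snippet above, `transcribe_folder(transcriber, "recordings")` would return a dictionary mapping each file name to its transcript; the `recordings` folder name is a placeholder.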

## Technical Details

- **Base**: Whisper Large V3 Turbo (809M params)
- **Method**: LoRA (rank=32, alpha=64)
- **Adapters**: ~106M parameters
- **Training**: 179 labeled samples, iterative training, and model soups
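
The LoRA and model-soup entries above can be made concrete with a little arithmetic. The sketch below assumes the standard LoRA formulation, where the weight update is `(alpha / rank) * B @ A`, and a uniform soup that averages checkpoints parameter by parameter. Which layers were adapted and which soup variant was used are not stated here, so the function names, the 1280-wide example layer, and the toy checkpoints are all illustrative assumptions.

```python
def lora_scale(rank, alpha):
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one (d_out x d_in) weight matrix.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


def uniform_soup(checkpoints):
    """Average checkpoints parameter by parameter (a uniform model soup).

    Each checkpoint is a dict mapping a parameter name to a flat list of floats.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    names = checkpoints[0].keys()
    if any(ckpt.keys() != names for ckpt in checkpoints):
        raise ValueError("all checkpoints must share the same parameter names")
    n = len(checkpoints)
    return {
        name: [sum(vals) / n for vals in zip(*(ckpt[name] for ckpt in checkpoints))]
        for name in names
    }


print(lora_scale(rank=32, alpha=64))           # 2.0
print(lora_param_count(1280, 1280, rank=32))   # 81920 for one assumed 1280x1280 layer
print(uniform_soup([{"w": [1.0, 3.0]}, {"w": [3.0, 5.0]}]))  # {'w': [2.0, 4.0]}
```

With rank=32 and alpha=64 the update is scaled by 2. The total adapter size depends on how many matrices carry adapters, which this sketch does not attempt to infer.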

## License

MIT License