KEA Omnilingual-ASR β€” Kabuverdianu Fine-Tune

Fine-tuned version of omniASR_CTC_300M for Kabuverdianu (kea) β€” Cape Verdean Creole, a low-resource language spoken by ~1 million people.

Model Details

Property Value
Base model omniASR_CTC_300M (Meta, ~300M parameters)
Language Kabuverdianu (kea_Latn)
Task Automatic Speech Recognition (CTC)
Framework omnilingual-asr (fairseq2)
Dataset shunyalabs/kabuverdianu-speech-dataset

Training Data

Split Samples
Train 2715
Validation 366
Test 864

Training Hyperparameters

Parameter Value
Training steps N/A
Learning rate N/A
Mixed precision N/A
Gradient accumulation N/A batches
Encoder frozen for N/A steps
Min audio length N/A samples (2 s)
Max audio length N/A samples (60 s)
Max elements per batch N/A

Evaluation Results

Metric Value
WER (validation) N/A
Train CTC Loss N/A
Validation CTC Loss N/A

Checkpoint Structure

checkpoint/sdp_00.pt   ← model weights (fairseq2 sharded format)
model.yaml             ← architecture definition
config.yaml            ← full training configuration
metrics/train.jsonl    ← per-step training metrics
metrics/valid.jsonl    ← per-step validation metrics (WER)
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Dataset used to train Iscte-Sintra/KeaASR-CTC-300M_v2