# Whisper ATC-CO Medium

A PEFT/LoRA fine-tune of `openai/whisper-medium` for transcribing Colombian air traffic control (ATC) radio communications.

## Model Details

### Training Parameters

| Parameter | Value |
|---|---|
| Epochs | 15 |
| Learning rate | 1e-3 |
| Batch size | 4 |
| Gradient accumulation | 4 |
| Warmup steps | 25 |
| LoRA rank (r) | 32 |
| LoRA alpha | 64 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, v_proj |
| Train samples | 122 |
| Validation samples | 15 |
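The hyperparameters above can be expressed as a PEFT `LoraConfig`. This is a hypothetical reconstruction from the table (the actual training script is not published); note the effective batch size is 4 × 4 = 16 with gradient accumulation.

```python
from peft import LoraConfig, get_peft_model
from transformers import WhisperForConditionalGeneration

# Hypothetical reconstruction of the adapter config from the table above
lora_config = LoraConfig(
    r=32,                                  # LoRA rank
    lora_alpha=64,                         # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention query/value projections
)

base = WhisperForConditionalGeneration.from_pretrained("openai/whisper-medium")
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```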

## Evaluation (test split, 16 samples)

| Metric | Baseline (whisper-medium) | Fine-tuned | Delta |
|---|---|---|---|
| WER (raw) | 0.96 | 0.73 | -0.23 |
| CER (raw) | 0.83 | 0.57 | -0.26 |
| WER (normalized) | 0.92 | 0.65 | -0.27 |
| CER (normalized) | 0.81 | 0.53 | -0.28 |
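WER is word-level edit distance divided by reference word count; the "normalized" rows apply text normalization before scoring. The exact normalizer used here is unspecified, so the sketch below shows a typical choice (lowercasing, accent and punctuation stripping) plus a minimal pure-Python WER, not the evaluation code for this model:

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Typical ASR normalization: lowercase, strip accents and punctuation."""
    text = unicodedata.normalize("NFKD", text.lower())
    text = text.encode("ascii", "ignore").decode("ascii")  # drop accent marks
    text = re.sub(r"[^\w\s]", "", text)                    # drop punctuation
    return re.sub(r"\s+", " ", text).strip()


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    d = list(range(len(hyp) + 1))  # single-row DP table
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            cur = d[j]
            d[j] = min(d[j] + 1,          # deletion
                       d[j - 1] + 1,      # insertion
                       prev + (r != h))   # substitution / match
            prev = cur
    return d[-1] / len(ref)
```

CER is the same computation applied to characters instead of words.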

## Usage

```python
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor
from peft import PeftModel

# Load the base model and attach the LoRA adapter
base = WhisperForConditionalGeneration.from_pretrained("openai/whisper-medium")
model = PeftModel.from_pretrained(base, "cjamcu/whisper-atc-co-medium", subfolder="adapter_model")
processor = WhisperProcessor.from_pretrained("cjamcu/whisper-atc-co-medium")

# Transcribe
audio, sr = ...  # load 16 kHz mono audio
inputs = processor(audio, sampling_rate=sr, return_tensors="pt")
with torch.no_grad():
    generated = model.generate(inputs.input_features)
transcription = processor.batch_decode(generated, skip_special_tokens=True)[0]
```
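The `audio, sr = ...` line is left to the caller. As one option, here is a minimal standard-library loader for 16-bit PCM mono WAV files (a hypothetical helper, not part of this repo; libraries like `librosa` or `soundfile` also handle resampling and other formats):

```python
import array
import wave


def load_wav_mono(path: str) -> tuple[list[float], int]:
    """Read a 16-bit PCM mono WAV file into floats in [-1, 1] plus its rate."""
    with wave.open(path, "rb") as f:
        assert f.getnchannels() == 1, "expected mono audio"
        assert f.getsampwidth() == 2, "expected 16-bit PCM"
        sr = f.getframerate()
        raw = f.readframes(f.getnframes())
    samples = array.array("h", raw)  # signed 16-bit integers
    return [s / 32768.0 for s in samples], sr
```

Whisper expects 16 kHz input, so audio at other rates must be resampled before calling the processor.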

## Limitations

- Trained on only 122 samples; performance may degrade on out-of-distribution ATC audio.
- The test set has only 16 samples, so the reported metrics have high variance.
- Training audio mixes Spanish and English ATC phraseology; accuracy on monolingual or differently code-switched audio is untested.

## Citation

```bibtex
@misc{whisper-atc-co-medium,
  author = {ATColombia},
  title = {Whisper ATC-CO Medium: Colombian ATC transcription},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/cjamcu/whisper-atc-co-medium}
}
```