---
language:
- en
license: other
tags:
- whisper
- qwen3
- ctranslate2
- automatic-speech-recognition
- text-generation
- air-traffic-control
- atc
- singapore
- military
pipeline_tag: automatic-speech-recognition
---

# ASTRA ATC Models

Fine-tuned models for Singapore military air traffic control, built for the [ASTRA](https://github.com/aether-raid) training simulator.

## Pipeline

```
Audio --> VAD (Silero) --> ASR (Whisper) --> Rule Formatter --> Display Text

ASR output:   "camel climb flight level zero nine zero"
Display text: "CAMEL climb FL090"
```

The production pipeline uses a **rule-based formatter** (23 deterministic rules, <1ms, 0 VRAM) instead of the LLM. The LLM is retained for reference.
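
To illustrate what a deterministic formatting rule can look like, the sketch below applies two hypothetical rules (flight-level compression and callsign capitalisation) to the pipeline example. The function `format_display`, the digit table, and the short callsign set are assumptions made for illustration; the 23 production rules are not reproduced here and may work differently.

```python
import re

# Hypothetical subset of callsigns, taken from the Domain section; the real list has 100+.
CALLSIGNS = {"camel", "ninja", "beetle", "taipan", "maverick", "jaguar", "lancer"}

# Spoken digit -> numeral ("niner" is the radiotelephony form of nine).
DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9", "niner": "9",
}

_DIGIT_WORD = "|".join(DIGITS)
# Assumed rule 1: "flight level" followed by exactly three spoken digits.
FLIGHT_LEVEL = re.compile(rf"\bflight level((?: (?:{_DIGIT_WORD})){{3}})\b")
# Assumed rule 2: any known callsign as a whole word.
CALLSIGN = re.compile(r"\b(" + "|".join(sorted(CALLSIGNS)) + r")\b")


def _flight_level(match: re.Match) -> str:
    return "FL" + "".join(DIGITS[word] for word in match.group(1).split())


def _callsign(match: re.Match) -> str:
    return match.group(0).upper()


# Rules run in a fixed order, so the output is fully deterministic.
RULES = [(FLIGHT_LEVEL, _flight_level), (CALLSIGN, _callsign)]


def format_display(text: str) -> str:
    """Lowercase and collapse whitespace, then apply each rule in order."""
    out = " ".join(text.lower().split())
    for pattern, replacement in RULES:
        out = pattern.sub(replacement, out)
    return out


print(format_display("camel climb flight level zero nine zero"))  # CAMEL climb FL090
```

Because each rule is a compiled regular expression applied once per utterance, a formatter of this shape needs no GPU memory and runs quickly on short transcripts.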

## Models

### [ASR/](./ASR) - Whisper Large v3 (CTranslate2 float16)

Fine-tuned for Singapore military ATC speech. Uses CTranslate2 float16 format for fast inference with [faster-whisper](https://github.com/SYSTRAN/faster-whisper).

| Metric | Value |
|--------|-------|
| WER | **0.66%** |
| Base model | `openai/whisper-large-v3` |
| Size | 2.9 GB |
| Training | Full fine-tune with enhanced VHF radio augmentation |
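
For readers unfamiliar with the metric, word error rate is the word-level edit distance between reference and hypothesis (substitutions + deletions + insertions) divided by the number of reference words. The sketch below is a generic implementation under that definition; the function name `word_error_rate` is an assumption, and the model card does not state which tool or text normalisation produced the 0.66% figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to reach an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of seven reference words.
wer = word_error_rate(
    "camel climb flight level zero nine zero",
    "camel climb flight level zero nine",
)
print(f"{wer:.2%}")
```

A corpus-level score is usually obtained by summing the edit distances over all utterances and dividing by the total number of reference words, rather than averaging per-utterance rates.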

### [LLM/](./LLM) - Qwen3-1.7B Display Formatter (Legacy)

> **Legacy.** Superseded by a deterministic rule-based formatter. Retained for reference.

Converts normalized ASR output into structured ATC display text.

| Metric | Value |
|--------|-------|
| Exact match | **100%** (161/161) |
| Base model | `unsloth/Qwen3-1.7B` |
| Size | 3.3 GB |

## Architecture

```
Audio --> VAD (Silero) --> ASR (Whisper ct2) --> Post-processing --> Rule Formatter --> Display Text
```

| Component | Technology | Latency | VRAM |
|-----------|-----------|---------|------|
| VAD | Silero VAD (ONNX) | ~50ms | <100 MB |
| ASR | Whisper Large v3 (CTranslate2) | ~500ms-2s | ~2 GB |
| Formatter | 23 deterministic rules | <1ms | 0 MB |

Total VRAM: ~2 GB (ASR only).
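
One way to check per-stage latency figures like those in the table is to chain the stages as plain callables and time each one. The sketch below is only a sketch: `run_pipeline` and the stand-in stages are hypothetical and do not call Silero VAD, faster-whisper, or the production formatter.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def run_pipeline(audio: Any, stages: List[Stage]) -> Tuple[Any, Dict[str, float]]:
    """Feed `audio` through each stage in order and record per-stage latency in ms."""
    timings: Dict[str, float] = {}
    value = audio
    for name, stage in stages:
        start = time.perf_counter()
        value = stage(value)
        timings[name] = (time.perf_counter() - start) * 1000.0
    return value, timings


# Stand-in stages (hypothetical); real ones would wrap the VAD, ASR, and formatter.
demo_stages: List[Stage] = [
    ("vad", lambda audio: audio),
    ("asr", lambda speech: "Camel  climb flight level zero nine zero"),
    ("post", lambda text: " ".join(text.lower().split())),
    ("format", lambda text: text.replace("camel", "CAMEL")),
]

text, timings = run_pipeline(b"\x00\x00", demo_stages)
print(text)
for name, ms in timings.items():
    print(f"{name}: {ms:.3f} ms")
```

Swapping a stand-in for a real component leaves the timing loop unchanged, which makes it easy to see which stage dominates end-to-end latency.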

## Domain

Singapore military ATC covering:

- **Airbases**: Tengah (WSAT, runway 18/36), Paya Lebar (WSAP, runway 02/20)
- **Aircraft**: F-16C/D, F-15SG, C-130 Hercules
- **Approaches**: ILS, GCA, PAR, TACAN, DVOR/DME, VOR/DME, Visual Straight-in
- **100+ callsigns**: CAMEL, NINJA, BEETLE, TAIPAN, MAVERICK, JAGUAR, LANCER, etc.
- **Categories**: departure, approach, handoff, maneuver, landing, emergency, ground, recovery, pilot reports, military-specific ops

## Training History

### ASR

| Run | WER | Base | Key Change |
|-----|-----|------|------------|
| ct2_run5 | 0.48% | jacktol/whisper-large-v3-finetuned-for-ATC | Initial fine-tune |
| ct2_run6 | 0.40% | jacktol/whisper-large-v3-finetuned-for-ATC | +augmentation, weight decay |
| ct2_run7 | 0.24% | jacktol/whisper-large-v3-finetuned-for-ATC | Frozen encoder, +50 real recordings |
| **ct2_run8** | **0.66%** | openai/whisper-large-v3 | Full retrain from base, enhanced augmentation |

> ct2_run8 is the released ASR model. Its WER is higher than that of the earlier runs, but it trains from the original Whisper base for better generalisation to real-world ATC audio.

### LLM (Legacy)

| Run | Accuracy | Base | Key Change |
|-----|----------|------|------------|
| llm_run3 | 98.1% | Qwen3-8B | QLoRA 4-bit, 871 examples |
| llm_run4 | 100% | Qwen3-1.7B | bf16 LoRA, 1,915 examples with ASR noise augmentation |

## Quick Start

### ASR

```python
from faster_whisper import WhisperModel

# Path to the CTranslate2 model directory (./models/ASR if fetched with the Download commands)
model = WhisperModel("./ASR", device="cuda", compute_type="float16")

# transcribe() returns a lazy generator; decoding runs as the segments are consumed
segments, info = model.transcribe("audio.wav", language="en", beam_size=5)
text = " ".join(seg.text.strip() for seg in segments)
```
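
faster-whisper also bundles a Silero VAD filter that can be enabled with `vad_filter=True`. The sketch below wraps transcription in a small helper with that option switched on; whether the ASTRA pipeline uses this built-in filter or a separate Silero ONNX stage is not stated here, and the helper names `load_model` and `transcribe_file` are assumptions.

```python
from typing import Any


def load_model(model_dir: str = "./ASR") -> Any:
    """Load the CTranslate2 model; imported lazily so the helper below stays dependency-free."""
    from faster_whisper import WhisperModel

    return WhisperModel(model_dir, device="cuda", compute_type="float16")


def transcribe_file(model: Any, path: str) -> str:
    """Transcribe one audio file and join the segment texts into a single string."""
    segments, _info = model.transcribe(
        path,
        language="en",
        beam_size=5,
        vad_filter=True,  # drop non-speech using faster-whisper's bundled Silero VAD
    )
    return " ".join(seg.text.strip() for seg in segments)
```

Typical usage would be `transcribe_file(load_model(), "audio.wav")`; loading the model once and reusing it across files avoids repeated start-up cost.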

### Download

```bash
# Full repo (ASR + LLM)
huggingface-cli download aether-raid/astra-atc-models --local-dir ./models

# ASR only (recommended)
huggingface-cli download aether-raid/astra-atc-models --include "ASR/*" --local-dir ./models

# LLM only (legacy)
huggingface-cli download aether-raid/astra-atc-models --include "LLM/*" --local-dir ./models
```
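
The same downloads can be scripted from Python with `huggingface_hub.snapshot_download`, whose `allow_patterns` argument plays the role of `--include`. The sketch below assumes the `huggingface_hub` package is installed; the helper names `download_args` and `download` are hypothetical.

```python
from typing import Any, Dict, Optional

REPO_ID = "aether-raid/astra-atc-models"


def download_args(component: Optional[str] = None, local_dir: str = "./models") -> Dict[str, Any]:
    """Build snapshot_download keyword arguments; component is "ASR", "LLM", or None for the full repo."""
    args: Dict[str, Any] = {"repo_id": REPO_ID, "local_dir": local_dir}
    if component is not None:
        args["allow_patterns"] = [f"{component}/*"]
    return args


def download(component: Optional[str] = None, local_dir: str = "./models") -> str:
    """Download the selected files and return the local path."""
    from huggingface_hub import snapshot_download  # imported lazily; requires network access

    return snapshot_download(**download_args(component, local_dir))
```

Calling `download("ASR")` would fetch only the ASR folder into `./models`, mirroring the recommended CLI command.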