metadata
language:
  - en
license: other
tags:
  - qwen3
  - text-generation
  - text2text-generation
  - air-traffic-control
  - atc
  - singapore
  - military
  - lora
  - unsloth
  - legacy
base_model: unsloth/Qwen3-1.7B

Qwen3-1.7B — ATC Display Text Formatter (Legacy)

Status: Legacy. This model has been superseded by a deterministic rule-based formatter (23 rules, <1ms, 0 VRAM) that achieves equivalent accuracy on all production ATC patterns. The rule-based formatter is now used exclusively in the ASTRA pipeline. This model is retained for reference and potential future use with novel/unseen patterns.

Fine-tuned Qwen3-1.7B that converts normalized ASR output into structured ATC display text. Designed to work downstream of the companion Whisper ASR model.

Performance

| Metric | Value |
| --- | --- |
| Exact match accuracy | 100.0% (161/161) |
| Avg character edit distance | 0.0 |
| Best eval loss | 0.0005 |
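
The two accuracy metrics above can be computed along the following lines. This is a minimal sketch, not the evaluation script used for this model: the function names are illustrative, and it assumes predictions and references are compared as raw strings with no extra normalization.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def score(predictions: list[str], references: list[str]) -> dict[str, float]:
    """Return exact-match accuracy (%) and mean character edit distance."""
    if len(predictions) != len(references) or not references:
        raise ValueError("predictions and references must be non-empty and equal length")
    n = len(references)
    exact = sum(p == r for p, r in zip(predictions, references))
    distance = sum(levenshtein(p, r) for p, r in zip(predictions, references))
    return {"exact_match_pct": 100.0 * exact / n, "avg_edit_distance": distance / n}
```

A perfect run, where every prediction equals its reference, yields 100.0% exact match and an average edit distance of 0.0.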

Why Legacy?

The rule-based formatter now handles all production patterns:

  • Speed: <1ms vs ~250ms per inference
  • VRAM: 0 GB vs ~3.3 GB
  • Determinism: 100% reproducible output, no sampling variance
  • Auditability: Each of the 23 rules is individually testable
  • Coverage: Handles all callsigns, locations, numeric patterns, and ATC abbreviations seen in training data

The LLM remains useful if novel patterns emerge that the rule-based system cannot handle.
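
One way the retained model could be wired in for novel patterns is a rules-first dispatcher with an LLM fallback. The sketch below is hypothetical: the ASTRA pipeline currently uses the rule-based formatter exclusively, and the function names, the convention that the rule formatter returns `None` when no rule applies, and the stand-in formatters are all assumptions made for illustration.

```python
from typing import Callable, Optional

# Hypothetical signatures: a rule formatter that returns None when no rule
# applies, and an LLM formatter that always returns a string.
RuleFormatter = Callable[[str], Optional[str]]
LlmFormatter = Callable[[str], str]


def format_with_fallback(text: str, rules: RuleFormatter, llm: LlmFormatter) -> tuple[str, str]:
    """Return (display_text, source), trying the deterministic path first."""
    formatted = rules(text)
    if formatted is not None:
        return formatted, "rules"
    return llm(text), "llm"


def demo_rules(text: str) -> Optional[str]:
    # Stand-in for the real rule set: a single known phrase.
    known = {"squawk seven seven zero zero": "squawk 7700"}
    return known.get(text)


def demo_llm(text: str) -> str:
    # Stand-in for a call to the fine-tuned model.
    return text.upper()
```

Returning the source alongside the text keeps the fallback auditable: any output tagged "llm" marks a pattern that may deserve a new rule.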

Model Details

| Key | Value |
| --- | --- |
| Base model | unsloth/Qwen3-1.7B |
| Method | bf16 LoRA (rank 16, alpha 32) |
| Merged size | 3.3 GB |
| Train examples | 1,915 |
| Eval examples | 161 |
| Thinking mode | Disabled |

Training

  • Framework: Unsloth + SFTTrainer (trl)
  • Optimizer: AdamW 8-bit
  • Learning rate: 1.2e-4
  • Effective batch size: 16
  • Precision: bf16
  • Packing: enabled
  • Train on responses only: yes
  • Converged at step 380 (epoch 3.2)
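
As a rough sanity check, the reported step and epoch figures agree with the train set size and effective batch size. The calculation assumes one training example per sequence; with packing enabled the true number of steps per epoch may differ, so treat this as an approximation only.

```python
train_examples = 1915   # from Model Details
effective_batch = 16    # from Training
converged_step = 380    # from Training

# Optimizer steps needed to see every example once (ignoring packing).
steps_per_epoch = train_examples / effective_batch
epochs_at_convergence = converged_step / steps_per_epoch

print(round(steps_per_epoch, 1))        # 119.7
print(round(epochs_at_convergence, 1))  # 3.2
```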

Dataset

The dataset contains 1,670 unique ATC phrases from axite.json, split 90/10 with stratification by category. ASR noise augmentation (simulated recognition errors) is included for robustness.
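
A stratified split and a simple noise augmentation can be sketched as follows. The schema of axite.json is not documented here, so the `category` field name, the confusion table, and the augmentation rate are assumptions for illustration, not the procedure actually used to build the dataset.

```python
import random
from collections import defaultdict


def stratified_split(records, key="category", eval_fraction=0.1, seed=0):
    """Split records so each category contributes about eval_fraction to eval."""
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for record in records:
        by_category[record[key]].append(record)

    train, evaluation = [], []
    for category in sorted(by_category):
        items = by_category[category][:]
        rng.shuffle(items)
        n_eval = round(len(items) * eval_fraction)
        if len(items) > 1:
            # Keep at least one eval item for every category with 2+ items.
            n_eval = max(1, n_eval)
        evaluation.extend(items[:n_eval])
        train.extend(items[n_eval:])
    return train, evaluation


def add_asr_noise(text, confusions, rng, rate=0.1):
    """Replace words with a hypothetical ASR confusion with probability `rate`."""
    noisy = [
        rng.choice(confusions[w]) if w in confusions and rng.random() < rate else w
        for w in text.split()
    ]
    return " ".join(noisy)
```

Splitting per category keeps rare phrase types represented in the eval set, which a plain random 90/10 split does not guarantee.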

What It Does

Converts normalized spoken text (ASR output) into structured display text:

| Input (normalized) | Output (display) |
| --- | --- |
| camel climb flight level zero nine zero | CAMEL climb FL090 |
| contact tengah approach one three zero decimal zero | contact Tengah Approach 130.0 |
| squawk seven seven zero zero | squawk 7700 |
| request clearance, ninja two f sixteens for western coast departure for i l s. | Request clearance, NINJA 2xF16 for Western Coast Departure for ILS. |
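
To make the mapping concrete, here is a minimal deterministic sketch that reproduces the first three rows of the table. It is not the fine-tuned model and not the production 23-rule formatter: the rule names, the tiny callsign and place lexicons, and the rule order are all assumptions chosen for illustration.

```python
import re

DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
          "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}
DIGIT_WORD = "(?:" + "|".join(DIGITS) + ")"
DIGIT_RUN = rf"{DIGIT_WORD}(?: {DIGIT_WORD})*"

# Illustrative lexicons only; the real vocabulary is not shown in this card.
CALLSIGNS = {"camel", "ninja"}
PLACES = {"tengah approach": "Tengah Approach"}


def to_digits(words: str) -> str:
    """Turn 'zero nine zero' into '090'."""
    return "".join(DIGITS[w] for w in words.split())


def rule_flight_level(text: str) -> str:
    return re.sub(rf"\bflight level ({DIGIT_RUN})\b",
                  lambda m: "FL" + to_digits(m.group(1)), text)


def rule_frequency(text: str) -> str:
    return re.sub(rf"\b({DIGIT_RUN}) decimal ({DIGIT_RUN})\b",
                  lambda m: f"{to_digits(m.group(1))}.{to_digits(m.group(2))}", text)


def rule_digit_run(text: str) -> str:
    # Collapse any remaining spoken digit sequence, such as a squawk code.
    return re.sub(rf"\b{DIGIT_RUN}\b", lambda m: to_digits(m.group(0)), text)


def rule_callsign(text: str) -> str:
    return " ".join(w.upper() if w in CALLSIGNS else w for w in text.split())


def rule_places(text: str) -> str:
    for spoken, display in PLACES.items():
        text = re.sub(rf"\b{re.escape(spoken)}\b", display, text)
    return text


# Order matters: specific numeric rules run before the generic digit rule.
RULES = [rule_flight_level, rule_frequency, rule_digit_run, rule_callsign, rule_places]


def format_display(text: str) -> str:
    for rule in RULES:
        text = rule(text)
    return text
```

The fourth row (aircraft counts, type designators, spelled-out abbreviations, sentence casing) needs rules that this sketch does not include.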

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model and its tokenizer from a local directory.
model = AutoModelForCausalLM.from_pretrained("path/to/LLM", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("path/to/LLM")

messages = [
    {"role": "system", "content": "Convert the following air traffic control transcript into structured display text."},
    {"role": "user", "content": "camel climb flight level zero nine zero"},
]
# Build the chat prompt with Qwen3 thinking mode turned off.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=False)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# do_sample=True is set explicitly so the sampling settings below take effect.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.3, top_p=0.9, top_k=30)
# Decode only the newly generated tokens, skipping the prompt.
result = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(result)  # expected: CAMEL climb FL090

Inference Settings

| Parameter | Value |
| --- | --- |
| Temperature | 0.3 |
| Top-p | 0.9 |
| Top-k | 30 |
| Max new tokens | 128 |
| Thinking | Disabled (enable_thinking=False) |