# 🚦 tachyon-router-v1-mini
Turkish prompt complexity classifier for intelligent LLM routing.
Fine-tuned from `dbmdz/electra-small-turkish-cased-discriminator` — 13.7M parameters, 26.2 MB (FP16).
The idea is simple: instead of letting users pick a model manually, this classifier reads the prompt and decides which model tier is appropriate — saving cost on simple queries while still routing complex ones to capable models.
This model was built with heavy AI assistance. The dataset was generated by Gemini 2.5 Flash, the training pipeline was written with Claude, and most architectural decisions were made through back-and-forth with AI tools — not from a formal ML background. That said, every output was reviewed, the metrics are real, and nothing was blindly copy-pasted. Call it AI-assisted rather than vibe-coded, but transparency feels right either way. Use with appropriate skepticism.
## 🏷️ Labels
| Label | Name | Score Range | Description |
|---|---|---|---|
| 0 | `trivial` | 0.05 – 0.22 | Greetings, navigation, password reset, simple UI questions. No data retrieval needed. |
| 1 | `simple` | 0.25 – 0.52 | Single-step data lookup or filter. One table, one condition, no aggregation. |
| 2 | `analysis` | 0.55 – 0.78 | Multi-step reasoning: comparisons, summaries, trend analysis, cross-table queries. |
| 3 | `complex` | 0.80 – 0.98 | Forecasting, risk analysis, scenario modeling, root cause analysis, strategic recommendations. |
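These tiers are meant to drive model selection. A minimal routing-table sketch — the tier names (`small-local-model`, etc.) are placeholders for illustration, not anything this project ships:

```python
# Hypothetical routing table: classifier label -> backend model tier.
# The model names below are illustrative placeholders.
MODEL_TIERS = {
    "trivial": "small-local-model",   # canned answers / tiny model
    "simple": "small-local-model",    # single-step lookups
    "analysis": "mid-tier-model",     # multi-step reasoning
    "complex": "frontier-model",      # forecasting, strategy
}

def pick_model(label: str) -> str:
    """Return the backend for a classifier label; unknown labels fall
    back to the most capable tier (fail-safe routing)."""
    return MODEL_TIERS.get(label, "frontier-model")
```

Defaulting unknown labels upward is a deliberate choice: misrouting a trivial prompt to a strong model wastes a little money, while the reverse degrades answers.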
## 📊 Metrics
Evaluated on a held-out validation set (n=210, roughly balanced across classes — per-class counts 37/65/56/52).
| Metric | Score |
|---|---|
| Accuracy | 0.90 |
| Macro F1 | 0.90 |
| Macro Precision | 0.90 |
| Macro Recall | 0.91 |
Per-class breakdown:
| Class | Precision | Recall | F1 |
|---|---|---|---|
| trivial | 0.88 | 1.00 | 0.94 |
| simple | 0.97 | 0.88 | 0.92 |
| analysis | 0.89 | 0.84 | 0.86 |
| complex | 0.86 | 0.92 | 0.89 |
Confusion matrix (rows = true class, columns = predicted):

| | trivial | simple | analysis | complex |
|---|---|---|---|---|
| **trivial** | 37 | 0 | 0 | 0 |
| **simple** | 5 | 57 | 2 | 1 |
| **analysis** | 0 | 2 | 47 | 7 |
| **complex** | 0 | 0 | 4 | 48 |
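As a sanity check, the per-class table can be reproduced from the confusion matrix with a few lines of plain Python (rows are true classes, columns are predictions):

```python
# Confusion matrix from the table above: rows = true class, cols = predicted.
LABELS = ["trivial", "simple", "analysis", "complex"]
CM = [
    [37, 0, 0, 0],
    [5, 57, 2, 1],
    [0, 2, 47, 7],
    [0, 0, 4, 48],
]

def per_class_metrics(cm):
    n = len(cm)
    out = {}
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                       # true class i, predicted elsewhere
        fp = sum(cm[r][i] for r in range(n)) - tp  # predicted i, true class elsewhere
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        out[LABELS[i]] = (round(precision, 2), round(recall, 2), round(f1, 2))
    return out

print(per_class_metrics(CM))
# {'trivial': (0.88, 1.0, 0.94), 'simple': (0.97, 0.88, 0.92), ...}
```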
## 🚀 Usage
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="Squezilia/tachyon-router-v1-mini"
)

classifier("Bu ayki toplam ciro ne kadar?")  # "What is this month's total revenue?"
# [{'label': 'simple', 'score': 0.86}]

classifier("Önümüzdeki çeyrek için nakit akışı tahmini yap ve risk senaryolarını göster.")
# "Forecast cash flow for the next quarter and show the risk scenarios."
# [{'label': 'complex', 'score': 0.76}]
```
### With confidence threshold (recommended)
Low-confidence predictions are nudged to the next tier — safer for routing:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

LABEL_NAMES = {0: "trivial", 1: "simple", 2: "analysis", 3: "complex"}
CONF_THRESHOLD = 0.65

tokenizer = AutoTokenizer.from_pretrained("Squezilia/tachyon-router-v1-mini")
model = AutoModelForSequenceClassification.from_pretrained("Squezilia/tachyon-router-v1-mini")
model.eval()

def route(text: str) -> dict:
    enc = tokenizer(text, return_tensors="pt", max_length=128,
                    padding="max_length", truncation=True)
    with torch.no_grad():
        logits = model(**enc).logits
    probs = torch.softmax(logits, dim=1).squeeze()
    pred = int(probs.argmax())
    conf = float(probs[pred])
    # Bump up one tier if confidence is low
    if conf < CONF_THRESHOLD and pred < 3:
        pred += 1
    return {"label": pred, "label_name": LABEL_NAMES[pred], "confidence": conf}

route("Geçen ayla kıyasla gelir-gider özeti çıkar.")
# "Produce an income-expense summary compared to last month."
# {'label': 2, 'label_name': 'analysis', 'confidence': 0.66}
```
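The escalation rule in isolation: anything under the threshold moves up exactly one tier, and `complex` (3) is the cap, so an uncertain `complex` prediction stays put rather than overflowing:

```python
CONF_THRESHOLD = 0.65  # same threshold as above

def nudge(pred: int, conf: float) -> int:
    """Escalate an uncertain prediction by one tier; tier 3 is the cap."""
    if conf < CONF_THRESHOLD and pred < 3:
        return pred + 1
    return pred

print(nudge(1, 0.60))  # uncertain "simple" is routed as "analysis" -> 2
print(nudge(3, 0.40))  # "complex" never overflows -> 3
print(nudge(2, 0.90))  # a confident prediction is left alone -> 2
```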
## 🏗️ Training

### Data
The training dataset was generated using Gemini 2.5 Flash with structured outputs, targeting Turkish business/ERP software prompts. It went through several iterations:
- **v1**: 1,104 examples with continuous regression labels (scores 0.0–1.0). The model learned to predict ~0.4 for everything (classic regression collapse on noisy labels).
- **v2**: Converted to 4-class classification, deduplicated, and balanced to 250 examples per class with augmentation. Trained on `dbmdz/distilbert-base-turkish-cased` (80M params) → 0.77 accuracy.
- **v3**: Switched to `dbmdz/electra-small-turkish-cased-discriminator` (13.7M params) and expanded the dataset to ~1,539 examples, including long/complex prompts → 0.88 accuracy.
- **v4 (this model)**: Refined the ~1,539-example dataset for better domain coverage and tuned epochs and batch size → 0.90 accuracy. Exported to FP16 → 26.2 MB.
### Augmentation
To balance underrepresented classes, two light augmentation strategies were applied to the training set:
- **Synonym replacement** — domain-aware Turkish synonym swaps (e.g. `göster` → `listele`, "show" → "list"; `karşılaştır` → `kıyasla`, both "compare")
- **Word dropout** — random word removal for sentences with 5+ words (p = 0.10)
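The augmentation scripts themselves aren't published; a rough sketch of both strategies, using only the two synonym pairs named above (the real lexicon is presumably larger):

```python
import random

# Only the two pairs mentioned in the card; the actual domain lexicon
# is presumably much larger.
SYNONYMS = {"göster": "listele", "karşılaştır": "kıyasla"}

def synonym_replace(text: str) -> str:
    """Swap each known word for its domain synonym."""
    return " ".join(SYNONYMS.get(w, w) for w in text.split())

def word_dropout(text: str, p: float = 0.10, min_len: int = 5, rng=None) -> str:
    """Drop each word with probability p, only for sentences of 5+ words."""
    rng = rng or random.Random()
    words = text.split()
    if len(words) < min_len:
        return text
    kept = [w for w in words if rng.random() >= p]
    return " ".join(kept) if kept else text

print(synonym_replace("stok raporunu göster"))
# stok raporunu listele
```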
### Hyperparameters
| Parameter | Value |
|---|---|
| Base model | `dbmdz/electra-small-turkish-cased-discriminator` |
| Epochs | 24 max (early stopping triggered at epoch 15) |
| Batch size | 16 |
| Learning rate | 2e-5 |
| Warmup ratio | 0.1 |
| Weight decay | 0.01 |
| Max length | 128 |
| Loss | CrossEntropyLoss with class weights |
| Early stopping patience | 3 (monitored: val macro F1) |
| Precision | FP16 |
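The exact class-weighting scheme isn't stated; inverse-frequency weights are the usual choice for `CrossEntropyLoss`'s `weight` argument, sketched here with illustrative counts:

```python
# Inverse-frequency class weights, as commonly passed to
# torch.nn.CrossEntropyLoss(weight=...). The counts below are
# illustrative -- the card only says v2 was balanced to 250 per class.
def class_weights(counts):
    total = sum(counts)
    n_classes = len(counts)
    return [total / (n_classes * c) for c in counts]

# With a perfectly balanced set every weight is 1.0:
print(class_weights([250, 250, 250, 250]))
# [1.0, 1.0, 1.0, 1.0]

# An under-represented class gets up-weighted:
print(class_weights([100, 300, 300, 300]))
# [2.5, 0.8333..., 0.8333..., 0.8333...]
```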
## 🌍 Domain Coverage
The model was trained on prompts spanning these business domains:
- **Muhasebe & Finans** (Accounting & Finance) — gelir, gider, fatura, vergi, bilanço
- **Stok & Depo** (Inventory & Warehouse) — ürün, stok, sipariş, tedarik
- **Satış & Müşteri** (Sales & Customers) — ciro, müşteri, kampanya, tahsilat
- **İnsan Kaynakları** (Human Resources) — personel, maaş, izin, performans
- **Üretim & Operasyon** (Production & Operations) — vardiya, maliyet, kapasite, fire
- **Sistem & Kullanıcı** (System & Users) — şifre, yetki, ayar, bildirim
## ⚠️ Limitations
- Trained exclusively on Turkish business/ERP prompts — not suitable for general-purpose or non-Turkish text.
- The `analysis`/`complex` boundary is the hardest to classify (the two classes are semantically close). The confidence-threshold nudge helps mitigate misrouting.
- The dataset was synthetically generated — the real-world prompt distribution may differ.
## 📦 Model Files
| File | Size | Description |
|---|---|---|
| `model.safetensors` | 55 MB | FP32 weights |
| `model_fp16.safetensors` | 27.5 MB | FP16 weights (recommended) |
| `model.onnx` | 55.2 MB | ONNX export (FP32) |
| `model_fp16.onnx` | 27.8 MB | ONNX export (FP16, for browser/edge) |
| `config.json` | 1.25 kB | Model config |
| `config_fp16.json` | 1.25 kB | FP16 model config |
| `config_onnx.json` | 980 B | ONNX model config |
| `tokenizer.json` | 755 kB | Tokenizer |
| `tokenizer_config.json` | 342 B | Tokenizer config |
Built for a personal ERP assistant project. Feedback welcome.