metadata
language:
- en
- ar
- hi
- fr
- es
- zh
- sw
- pt
tags:
- text-classification
- intent-classification
- routing
- multilingual
- onnx
license: mit
datasets:
- rufatronics/ueg-training-data
metrics:
- accuracy
- f1
model-index:
- name: UEG Classifier v1
  results:
  - task:
      type: text-classification
    dataset:
      name: UEG Training Data
      type: rufatronics/ueg-training-data
    metrics:
    - type: accuracy
      value: 0.9735
    - type: f1
      value: 0.9733
UEG — Universal Edge Gateway Classifier
A 35M-parameter bidirectional transformer encoder for intent classification and AI request routing.
Model Description
UEG classifies incoming user text into 22 intent classes across 5 routing tiers, plus a secondary 5-class language resource density classification. Both outputs come from a single forward pass.
- Architecture: 6-layer bidirectional transformer encoder, 512 hidden dim, 8 attention heads
- Parameters: ~35M
- Max sequence length: 128 tokens
- Tokenizer: Custom BPE trained on the UEG training corpus (32K vocab)
- Languages: English, Arabic, Hindi, French, Spanish, Chinese, Swahili, Portuguese
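As a sanity check on the ~35M figure, the parameter count can be estimated from the hyperparameters listed above. This is a rough sketch, not the author's accounting: the feed-forward width (`d_ff=2048`, i.e. 4x the hidden size), the learned position table, the exact vocabulary size of 32,000, and the single linear layer per head are all assumptions not stated in this card.

```python
def encoder_param_count(vocab=32_000, d_model=512, n_layers=6,
                        max_len=128, d_ff=2048, head_sizes=(22, 5)):
    """Rough parameter count for a standard transformer encoder with linear heads."""
    # Token embedding table plus an (assumed) learned position table
    embeddings = vocab * d_model + max_len * d_model
    # Q, K, V and output projections, each with a bias vector
    attention = 4 * (d_model * d_model + d_model)
    # Two-layer feed-forward block (assumed width d_ff), with biases
    feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two LayerNorms per layer, each with a scale and a shift vector
    layer_norms = 2 * (2 * d_model)
    per_layer = attention + feed_forward + layer_norms
    # One (assumed) linear classifier per output head: intent and resource density
    heads = sum(d_model * n + n for n in head_sizes)
    return embeddings + n_layers * per_layer + heads

total = encoder_param_count()
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumptions the estimate comes out close to 35M, with a little under half of the parameters sitting in the token embedding table. The file `export/config.json` holds the actual hyperparameters.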
Performance
| Head | Accuracy | Macro F1 |
|---|---|---|
| Intent (22 classes) | 97.35% | 0.9733 |
| Resource Density (5 classes) | 99.95% | 0.9987 |
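For readers comparing the two columns: accuracy counts correct predictions over all examples, while macro F1 averages per-class F1 scores so that rare classes weigh as much as common ones. The sketch below shows the standard definitions on made-up labels; it is not the evaluation script used for the numbers above, and the convention of scoring an absent class as 0 is an assumption.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); a class with no support and no predictions scores 0
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes

# Toy labels for illustration only
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred, num_classes=2))
```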
Usage
Via REST API (recommended)
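import requests
# Send the text to the hosted endpoint; the timeout guards against a stalled request
r = requests.post("https://ueg-api.onrender.com/classify",
                  json={"text": "Write a Python function to sort a list"},
                  timeout=30)
r.raise_for_status()  # raise on HTTP errors instead of parsing an error body
print(r.json())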
Via ONNX Runtime
import onnxruntime as ort
import numpy as np
from tokenizers import Tokenizer
from huggingface_hub import hf_hub_download

# Load tokenizer; pad and truncate every input to the 128-token maximum
tok_path = hf_hub_download("rufatronics/ueg-classifier",
                           "tokenizer/tokenizer.json")
tokenizer = Tokenizer.from_file(tok_path)
tokenizer.enable_padding(pad_id=0, pad_token="[PAD]", length=128)
tokenizer.enable_truncation(max_length=128)

# Load ONNX model + data file (both needed). The .onnx.data path is not passed
# to the session; the file only has to sit next to the .onnx file on disk.
onnx_path = hf_hub_download("rufatronics/ueg-classifier",
                            "export/ueg_model.onnx")
data_path = hf_hub_download("rufatronics/ueg-classifier",
                            "export/ueg_model.onnx.data")
sess = ort.InferenceSession(onnx_path,
                            providers=["CPUExecutionProvider"])

# Inference: one forward pass returns logits for both heads
enc = tokenizer.encode("Write a Python function to reverse a string")
ids = np.array([enc.ids], dtype=np.int64)
mask = np.array([enc.attention_mask], dtype=np.int64)
logits_intent, logits_resource = sess.run(None,
    {"input_ids": ids, "attention_mask": mask})
# np.argmax flattens its input, which is fine for a single text;
# pass axis=-1 when classifying a batch
intent_class = int(np.argmax(logits_intent))
resource_class = int(np.argmax(logits_resource))
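The ONNX example stops at a class index. To turn raw logits into ranked, human-readable predictions, a softmax followed by a label lookup is the usual approach. The following is a minimal sketch: `softmax` and `top_k` are hypothetical helpers, and the logits and label names are placeholders rather than real UEG classes.

```python
import math

def softmax(logits):
    """Convert logits to probabilities, subtracting the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(logits, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical logits and label names, for illustration only
example_logits = [2.0, 0.5, -1.0]
example_labels = ["class_a", "class_b", "class_c"]
ranked = top_k(example_logits, example_labels, k=2)
for label, prob in ranked:
    print(f"{label}: {prob:.3f}")
```

With the session outputs above, the logits for one input would be something like `logits_intent[0].tolist()`, assuming the output has shape `(batch, classes)`. The real label names live in `labels/intent_classes.json` and `labels/resource_classes.json`; their JSON layout is not documented here, so inspect the files before building the `labels` list.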
Files
| File | Description |
|---|---|
| `checkpoint_best.pt` | PyTorch weights (best validation epoch) |
| `checkpoint_latest.pt` | PyTorch weights (final epoch) |
| `export/ueg_model.onnx` | ONNX model for production inference |
| `export/ueg_model.onnx.data` | ONNX external data (required alongside .onnx) |
| `export/config.json` | Architecture hyperparameters |
| `export/benchmark.json` | Inference latency benchmark |
| `tokenizer/tokenizer.json` | Tokenizer definition |
| `tokenizer/tokenizer_config.json` | Tokenizer config with pad/cls/sep IDs |
| `labels/intent_classes.json` | Intent class label mappings |
| `labels/resource_classes.json` | Resource density class mappings |
Training
Trained from scratch on 176K synthetic examples generated by the UEG data generation pipeline using the free tiers of Groq, Gemini, and Mistral. Training ran in three phases (warmup, cosine decay, then head fine-tuning), with early stopping at patience=4.
Full training code: https://github.com/rufatronics/ueg-datagen
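To make the schedule concrete, here is a minimal sketch of linear warmup followed by cosine decay, plus a patience-based early-stopping check. It is an illustration under assumptions, not the code from the repository: the step counts, the peak learning rate, and the choice to monitor a metric where higher is better are all hypothetical, and the head fine-tuning phase is not modelled.

```python
import math

def learning_rate(step, total_steps, warmup_steps, peak_lr, min_lr=0.0):
    """Linear warmup to peak_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

class EarlyStopping:
    """Stop once the monitored metric has not improved for `patience` epochs."""

    def __init__(self, patience=4):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def should_stop(self, metric):
        if metric > self.best:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Hypothetical validation scores: improvement stalls after the second epoch
stopper = EarlyStopping(patience=4)
decisions = [stopper.should_stop(m) for m in [0.90, 0.95, 0.94, 0.93, 0.92, 0.91]]
print(decisions)
```

With patience=4, training in this toy run stops at the sixth epoch, four epochs after the best score.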
Citation
@misc{ueg2026,
  title={UEG: Universal Edge Gateway for AI Request Routing},
  author={Ahmad Garba},
  year={2026},
  url={https://huggingface.co/rufatronics/ueg-classifier}
}
License
MIT