---
language:
- en
- ar
- hi
- fr
- es
- zh
- sw
- pt
tags:
- text-classification
- intent-classification
- routing
- multilingual
- onnx
license: mit
datasets:
- rufatronics/ueg-training-data
metrics:
- accuracy
- f1
model-index:
- name: UEG Classifier v1
  results:
  - task:
      type: text-classification
    dataset:
      name: UEG Training Data
      type: rufatronics/ueg-training-data
    metrics:
    - type: accuracy
      value: 0.9735
    - type: f1
      value: 0.9733
---

# UEG: Universal Edge Gateway Classifier

A 35M-parameter bidirectional transformer for intent classification and AI request routing.

## Model Description

UEG classifies incoming user text into 22 intent classes across 5 routing tiers, plus a secondary 5-class language resource density classification. Both outputs come from a single forward pass.

- **Architecture**: 6-layer bidirectional transformer encoder, 512 hidden dim, 8 attention heads
- **Parameters**: ~35M
- **Max sequence length**: 128 tokens
- **Tokenizer**: Custom BPE trained on the UEG training corpus (32K vocab)
- **Languages**: English, Arabic, Hindi, French, Spanish, Chinese, Swahili, Portuguese

## Performance

| Head | Accuracy | Macro F1 |
|------|----------|----------|
| Intent (22 classes) | 97.35% | 0.9733 |
| Resource Density (5 classes) | 99.95% | 0.9987 |

## Usage

### Via REST API (recommended)

```python
import requests

r = requests.post(
    "https://ueg-api.onrender.com/classify",
    json={"text": "Write a Python function to sort a list"},
)
print(r.json())
```

### Via ONNX Runtime

```python
import onnxruntime as ort
import numpy as np
from tokenizers import Tokenizer
from huggingface_hub import hf_hub_download

# Load tokenizer
tok_path = hf_hub_download("rufatronics/ueg-classifier", "tokenizer/tokenizer.json")
tokenizer = Tokenizer.from_file(tok_path)
tokenizer.enable_padding(pad_id=0, pad_token="[PAD]", length=128)
tokenizer.enable_truncation(max_length=128)

# Load ONNX model + data file (both needed)
onnx_path = hf_hub_download("rufatronics/ueg-classifier", "export/ueg_model.onnx")
data_path = hf_hub_download("rufatronics/ueg-classifier", "export/ueg_model.onnx.data")
sess = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])

# Inference
enc = tokenizer.encode("Write a Python function to reverse a string")
ids = np.array([enc.ids], dtype=np.int64)
mask = np.array([enc.attention_mask], dtype=np.int64)
logits_intent, logits_resource = sess.run(None, {"input_ids": ids, "attention_mask": mask})
intent_class = np.argmax(logits_intent)
```

## Files

| File | Description |
|------|-------------|
| `checkpoint_best.pt` | PyTorch weights (best validation epoch) |
| `checkpoint_latest.pt` | PyTorch weights (final epoch) |
| `export/ueg_model.onnx` | ONNX model for production inference |
| `export/ueg_model.onnx.data` | ONNX external data (required alongside .onnx) |
| `export/config.json` | Architecture hyperparameters |
| `export/benchmark.json` | Inference latency benchmark |
| `tokenizer/tokenizer.json` | Tokenizer definition |
| `tokenizer/tokenizer_config.json` | Tokenizer config with pad/cls/sep IDs |
| `labels/intent_classes.json` | Intent class label mappings |
| `labels/resource_classes.json` | Resource density class mappings |

## Training

Trained from scratch on 176K synthetic examples generated via the UEG data generation pipeline using Groq, Gemini, and Mistral free tiers. Three-phase training: warmup → cosine decay → head fine-tuning. Early stopping with patience=4.

Full training code: https://github.com/rufatronics/ueg-datagen

## Citation

```bibtex
@misc{ueg2026,
  title={UEG: Universal Edge Gateway for AI Request Routing},
  author={Ahmad Garba},
  year={2026},
  url={https://huggingface.co/rufatronics/ueg-classifier}
}
```

## License

MIT