---
language:
- en
- ar
- hi
- fr
- es
- zh
- sw
- pt
tags:
- text-classification
- intent-classification
- routing
- multilingual
- onnx
license: mit
datasets:
- rufatronics/ueg-training-data
metrics:
- accuracy
- f1
model-index:
- name: UEG Classifier v1
  results:
  - task:
      type: text-classification
    dataset:
      name: UEG Training Data
      type: rufatronics/ueg-training-data
    metrics:
    - type: accuracy
      value: 0.9735
    - type: f1
      value: 0.9733
---
Developed by rufatronics (Aga) <br/>Ahmad Garba Adamu
# UEG — Universal Edge Gateway Classifier
A 35M-parameter bidirectional transformer encoder for intent classification and AI request routing.
## Model Description
UEG classifies incoming user text into 22 intent classes across 5 routing tiers, plus a secondary 5-class language resource density classification. Both outputs come from a single forward pass.
- **Architecture**: 6-layer bidirectional transformer encoder, 512 hidden dim, 8 attention heads
- **Parameters**: ~35M
- **Max sequence length**: 128 tokens
- **Tokenizer**: Custom BPE trained on the UEG training corpus (32K vocab)
- **Languages**: English, Arabic, Hindi, French, Spanish, Chinese, Swahili, Portuguese
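
As a sanity check on the ~35M figure, the parameter count can be estimated from the hyperparameters listed above. This is a back-of-the-envelope sketch, not the author's accounting: it assumes a feed-forward width of 4 × hidden, learned position embeddings, and a vocabulary of exactly 32,000 tokens, none of which the model card confirms. Biases, LayerNorm weights, and the two classification heads are ignored. The number of attention heads does not change the count.

```python
# Rough parameter estimate for the encoder described above.
# Assumptions (not confirmed by the model card): feed-forward width of
# 4 x hidden, learned position embeddings, vocabulary of exactly 32,000.
# Biases, LayerNorm weights, and the classification heads are ignored.
VOCAB_SIZE = 32_000   # "32K vocab", assumed to mean exactly 32,000
HIDDEN = 512
LAYERS = 6
MAX_LEN = 128
FFN_MULT = 4          # assumed feed-forward expansion factor

# Token embeddings plus (assumed) learned position embeddings
embedding_params = VOCAB_SIZE * HIDDEN + MAX_LEN * HIDDEN

# Per layer: Q, K, V, and output projections, then the two FFN matrices
attention_params = 4 * HIDDEN * HIDDEN
ffn_params = 2 * HIDDEN * (FFN_MULT * HIDDEN)
encoder_params = LAYERS * (attention_params + ffn_params)

total_params = embedding_params + encoder_params
print(f"{total_params / 1e6:.1f}M parameters")
```

Under these assumptions the estimate comes to about 35.3M, which is consistent with the stated ~35M; the exact figure is in `export/config.json` and the checkpoint itself.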
## Performance
| Head | Accuracy | Macro F1 |
|------|----------|----------|
| Intent (22 classes) | 97.35% | 0.9733 |
| Resource Density (5 classes) | 99.95% | 0.9987 |
## Usage
### Via REST API (recommended)
```python
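import requests

r = requests.post(
    "https://ueg-api.onrender.com/classify",
    json={"text": "Write a Python function to sort a list"},
    timeout=30,  # avoid hanging if the service is slow to respond
)
r.raise_for_status()  # surface HTTP errors instead of parsing an error body
print(r.json())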
```
### Via ONNX Runtime
```python
import numpy as np
import onnxruntime as ort
from huggingface_hub import hf_hub_download
from tokenizers import Tokenizer

REPO_ID = "rufatronics/ueg-classifier"

# Load the tokenizer; pad and truncate to the model's 128-token limit
tok_path = hf_hub_download(REPO_ID, "tokenizer/tokenizer.json")
tokenizer = Tokenizer.from_file(tok_path)
tokenizer.enable_padding(pad_id=0, pad_token="[PAD]", length=128)
tokenizer.enable_truncation(max_length=128)

# Download the ONNX graph and its external data file. The .data file is not
# passed to ONNX Runtime directly, but it must sit next to the .onnx file.
onnx_path = hf_hub_download(REPO_ID, "export/ueg_model.onnx")
data_path = hf_hub_download(REPO_ID, "export/ueg_model.onnx.data")
sess = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])

# Run inference: one forward pass returns logits for both heads
enc = tokenizer.encode("Write a Python function to reverse a string")
ids = np.array([enc.ids], dtype=np.int64)
mask = np.array([enc.attention_mask], dtype=np.int64)
logits_intent, logits_resource = sess.run(
    None, {"input_ids": ids, "attention_mask": mask}
)
intent_class = int(np.argmax(logits_intent))
resource_class = int(np.argmax(logits_resource))
```
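
The class indices returned by `argmax` still need to be turned into label names, and it is often useful to have a confidence score alongside them. The following is a minimal sketch of that post-processing step using only the standard library. The `id2label` mapping and its class names are hypothetical placeholders: the actual structure of `labels/intent_classes.json` is not documented here, so inspect that file before adapting this.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(logits, id2label):
    """Return (label, confidence) for the highest-scoring class."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    # Fall back to the raw index if the mapping has no entry for it
    return id2label.get(idx, str(idx)), probs[idx]

# Hypothetical 3-class mapping for illustration only; the real model has
# 22 intent classes whose names live in labels/intent_classes.json.
id2label = {0: "code_generation", 1: "translation", 2: "chitchat"}
label, confidence = decode([2.0, 0.5, -1.0], id2label)
print(label, round(confidence, 3))
```

With the ONNX outputs from the previous block, and assuming the logits have shape `(1, num_classes)`, the call would look like `decode(logits_intent[0].tolist(), id2label)`.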
## Files
| File | Description |
|------|-------------|
| `checkpoint_best.pt` | PyTorch weights (best validation epoch) |
| `checkpoint_latest.pt` | PyTorch weights (final epoch) |
| `export/ueg_model.onnx` | ONNX model for production inference |
| `export/ueg_model.onnx.data` | ONNX external data (required alongside .onnx) |
| `export/config.json` | Architecture hyperparameters |
| `export/benchmark.json` | Inference latency benchmark |
| `tokenizer/tokenizer.json` | Tokenizer definition |
| `tokenizer/tokenizer_config.json` | Tokenizer config with pad/cls/sep IDs |
| `labels/intent_classes.json` | Intent class label mappings |
| `labels/resource_classes.json` | Resource density class mappings |
## Training
Trained from scratch on 176K synthetic examples generated by the UEG data generation pipeline using the free tiers of Groq, Gemini, and Mistral. Training ran in three phases (warmup → cosine decay → head fine-tuning) with early stopping at patience=4.
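
The first two phases, warmup followed by cosine decay, are commonly implemented as a single learning-rate function of the training step. The sketch below shows one such schedule. The base learning rate, warmup length, and total step count are illustrative assumptions, not the values used to train UEG, and the head fine-tuning phase is not modeled.

```python
import math

def lr_at(step, base_lr=3e-4, warmup_steps=1000, total_steps=10000, min_lr=0.0):
    """Linear warmup to base_lr, then cosine decay to min_lr.

    All default values are hypothetical and chosen for illustration.
    """
    if step < warmup_steps:
        # Linear ramp that reaches base_lr on the last warmup step
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1]
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

for s in (0, 500, 1000, 5500, 10000):
    print(s, f"{lr_at(s):.2e}")
```
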
Full training code: https://github.com/rufatronics/ueg-datagen
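
Early stopping with patience=4 means training halts once the validation metric has failed to improve for four consecutive epochs. A minimal sketch of that rule follows. It assumes a higher-is-better metric such as validation accuracy; which metric the UEG training run actually monitored is not stated here, and the `EarlyStopping` class and the validation history are hypothetical.

```python
class EarlyStopping:
    """Stop when the monitored metric has not improved for `patience` epochs."""

    def __init__(self, patience=4, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, metric):
        """Record a validation metric (higher is better); return True to stop."""
        if metric > self.best + self.min_delta:
            self.best = metric
            self.bad_epochs = 0   # improvement: reset the counter
        else:
            self.bad_epochs += 1  # no improvement this epoch
        return self.bad_epochs >= self.patience

# Made-up validation accuracies, one per epoch
history = [0.90, 0.95, 0.97, 0.96, 0.97, 0.965, 0.969, 0.95]
stopper = EarlyStopping(patience=4)
stopped_at = None
for epoch, val_acc in enumerate(history):
    if stopper.step(val_acc):
        stopped_at = epoch
        break
print("stopped at epoch", stopped_at, "best", stopper.best)
```

In a setup like this, the weights saved at the best epoch would correspond to `checkpoint_best.pt` and the weights at the stopping epoch to `checkpoint_latest.pt`.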
## Citation
```bibtex
@misc{ueg2026,
title={UEG: Universal Edge Gateway for AI Request Routing},
author={Ahmad Garba},
year={2026},
url={https://huggingface.co/rufatronics/ueg-classifier}
}
```
## License
MIT