metadata
language:
  - en
  - ar
  - hi
  - fr
  - es
  - zh
  - sw
  - pt
tags:
  - text-classification
  - intent-classification
  - routing
  - multilingual
  - onnx
license: mit
datasets:
  - rufatronics/ueg-training-data
metrics:
  - accuracy
  - f1
model-index:
  - name: UEG Classifier v1
    results:
      - task:
          type: text-classification
        dataset:
          name: UEG Training Data
          type: rufatronics/ueg-training-data
        metrics:
          - type: accuracy
            value: 0.9735
          - type: f1
            value: 0.9733

UEG — Universal Edge Gateway Classifier

A ~35M-parameter bidirectional transformer encoder for intent classification and AI request routing.

Model Description

UEG classifies incoming user text into 22 intent classes across 5 routing tiers, plus a secondary 5-class language resource density classification. Both outputs come from a single forward pass.
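As a rough illustration of what "both outputs from a single forward pass" can look like, the sketch below applies two independent linear heads to one shared pooled encoder vector. The pooling strategy, the plain linear heads, and the random weights are assumptions made for illustration only; the actual head design is defined by the training code, not by this snippet.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 512       # hidden size stated in this model card
N_INTENT = 22      # intent classes
N_RESOURCE = 5     # resource density classes

# Hypothetical head weights; a trained model would load these from a checkpoint.
W_intent = rng.normal(scale=0.02, size=(HIDDEN, N_INTENT))
W_resource = rng.normal(scale=0.02, size=(HIDDEN, N_RESOURCE))

def classify(pooled):
    """Map one shared pooled representation to both sets of logits."""
    return pooled @ W_intent, pooled @ W_resource

# Stand-in for the encoder output of a single input text.
pooled = rng.normal(size=(1, HIDDEN))
logits_intent, logits_resource = classify(pooled)
```

The encoder runs once, and only the two small matrix products differ between heads, which is why the second classification adds almost no inference cost.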

  • Architecture: 6-layer bidirectional transformer encoder, 512 hidden dim, 8 attention heads
  • Parameters: ~35M
  • Max sequence length: 128 tokens
  • Tokenizer: Custom BPE trained on the UEG training corpus (32K vocab)
  • Languages: English, Arabic, Hindi, French, Spanish, Chinese, Swahili, Portuguese
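The ~35M figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes a vocabulary of exactly 32,000 tokens, learned positional embeddings, a feed-forward width of 2048 (4x the hidden size), and two plain linear heads. None of these details are stated above, so treat the result as an estimate; `export/config.json` holds the real hyperparameters.

```python
def estimate_params(vocab=32_000, d_model=512, n_layers=6,
                    d_ff=2048, max_len=128, n_intent=22, n_resource=5):
    """Estimate the parameter count of a standard transformer encoder.

    d_ff, the exact vocab size, and the head layout are assumptions.
    """
    embed = vocab * d_model + max_len * d_model          # token + position tables
    attn = 4 * (d_model * d_model + d_model)             # Q, K, V, output projections
    ffn = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    norms = 2 * (2 * d_model)                            # two LayerNorms per layer
    heads = (d_model * n_intent + n_intent) + (d_model * n_resource + n_resource)
    return embed + n_layers * (attn + ffn + norms) + heads

total = estimate_params()
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumptions the embedding table alone accounts for a little under half of the total, which is typical for small models with a 32K vocabulary.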

Performance

| Head | Accuracy | Macro F1 |
|---|---|---|
| Intent (22 classes) | 97.35% | 0.9733 |
| Resource Density (5 classes) | 99.95% | 0.9987 |

Usage

Via REST API (recommended)

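import requests

# Send the text to the hosted endpoint and fail loudly on HTTP errors
r = requests.post("https://ueg-api.onrender.com/classify",
                  json={"text": "Write a Python function to sort a list"},
                  timeout=30)
r.raise_for_status()
print(r.json())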

Via ONNX Runtime

import onnxruntime as ort
import numpy as np
from tokenizers import Tokenizer
from huggingface_hub import hf_hub_download

# Load tokenizer
tok_path = hf_hub_download("rufatronics/ueg-classifier",
                            "tokenizer/tokenizer.json")
tokenizer = Tokenizer.from_file(tok_path)
tokenizer.enable_padding(pad_id=0, pad_token="[PAD]", length=128)
tokenizer.enable_truncation(max_length=128)

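# Download the ONNX graph and its external-data file. ONNX Runtime resolves
# the .data file relative to the .onnx file, so both must be in the same
# directory (hf_hub_download keeps them together under export/).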
onnx_path = hf_hub_download("rufatronics/ueg-classifier",
                             "export/ueg_model.onnx")
data_path = hf_hub_download("rufatronics/ueg-classifier",
                             "export/ueg_model.onnx.data")

sess = ort.InferenceSession(onnx_path,
                             providers=["CPUExecutionProvider"])

# Inference
enc  = tokenizer.encode("Write a Python function to reverse a string")
ids  = np.array([enc.ids], dtype=np.int64)
mask = np.array([enc.attention_mask], dtype=np.int64)

logits_intent, logits_resource = sess.run(None,
    {"input_ids": ids, "attention_mask": mask})

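# Predicted class indices for the single input; look up the names in
# labels/intent_classes.json and labels/resource_classes.json
intent_class = int(np.argmax(logits_intent))
resource_class = int(np.argmax(logits_resource))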
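
If a router needs confidence scores rather than a single class index, the raw logits can be passed through a softmax and ranked. The helper below is a minimal sketch that works on any logits array for a single input; `top_k` and the example values are hypothetical and not part of the UEG codebase.

```python
import numpy as np

def top_k(logits, k=3):
    """Return (class_index, probability) pairs for the k largest logits."""
    logits = np.asarray(logits, dtype=np.float64).reshape(-1)
    shifted = logits - logits.max()              # subtract max for numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum()
    order = np.argsort(probs)[::-1][:k]          # indices sorted by descending probability
    return [(int(i), float(probs[i])) for i in order]

# Made-up logits standing in for logits_intent from the ONNX session
example_logits = np.array([[0.1, 2.5, -1.0, 0.7]])
ranked = top_k(example_logits, k=2)
print(ranked)
```

A low top-1 probability, or a small gap between the first and second entries, is one possible signal for falling back to a default route.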

Files

| File | Description |
|---|---|
| checkpoint_best.pt | PyTorch weights (best validation epoch) |
| checkpoint_latest.pt | PyTorch weights (final epoch) |
| export/ueg_model.onnx | ONNX model for production inference |
| export/ueg_model.onnx.data | ONNX external data (required alongside .onnx) |
| export/config.json | Architecture hyperparameters |
| export/benchmark.json | Inference latency benchmark |
| tokenizer/tokenizer.json | Tokenizer definition |
| tokenizer/tokenizer_config.json | Tokenizer config with pad/cls/sep IDs |
| labels/intent_classes.json | Intent class label mappings |
| labels/resource_classes.json | Resource density class mappings |

Training

The model was trained from scratch on 176K synthetic examples produced by the UEG data generation pipeline, which uses the Groq, Gemini, and Mistral free tiers. Training ran in three phases (warmup → cosine decay → head fine-tuning) with early stopping at patience=4.
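
For readers unfamiliar with these terms, the sketch below shows one common way to implement a linear warmup followed by cosine decay, together with patience-based early stopping. The learning rate, step counts, and the choice of a validation metric where higher is better are illustrative assumptions; only `patience=4` comes from the description above, and the head fine-tuning phase is not modeled.

```python
import math

def lr_at(step, base_lr=3e-4, warmup_steps=1000, total_steps=10_000):
    """Linear warmup to base_lr, then cosine decay to zero (assumed values)."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

class EarlyStopping:
    """Stop once the metric has failed to improve for `patience` epochs in a row."""

    def __init__(self, patience=4):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def should_stop(self, metric):
        if metric > self.best:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Made-up validation scores: improvement stalls after the second epoch
stopper = EarlyStopping(patience=4)
decisions = [stopper.should_stop(m) for m in [0.90, 0.91, 0.90, 0.90, 0.90, 0.90]]
```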

Full training code: https://github.com/rufatronics/ueg-datagen

Citation

@misc{ueg2026,
  title={UEG: Universal Edge Gateway for AI Request Routing},
  author={Ahmad Garba},
  year={2026},
  url={https://huggingface.co/rufatronics/ueg-classifier}
}

License

MIT