---
base_model:
- microsoft/Multilingual-MiniLM-L12-H384
license: apache-2.0
language:
- en
- fr
pipeline_tag: text-classification
inference: false
tags:
- classification
- emails
- onnx
- mobile
- int8
widget:
- text: "Subject: Your order has shipped\n\nBody: Your order #12345 is on its way and will arrive by Monday."
  example_title: Transaction
- text: "Subject: Meeting tomorrow\n\nBody: Hey, can we reschedule our 2pm meeting to 3pm? Let me know."
  example_title: Personal
- text: "Subject: Weekly Newsletter\n\nBody: Check out our latest deals! 50% off everything this weekend."
  example_title: Newsletter
- text: "Subject: Security Alert\n\nBody: A new device logged into your account from San Francisco, CA."
  example_title: Alert
---

# Email Classifier (MiniLM ONNX)

A dual-head MiniLM classifier for email category + action prediction, optimized for on-device inference using ONNX Runtime.

## Model Description

Classifies emails into 5 categories and predicts whether action is required:

| Category | Description |
|----------|-------------|
| **PERSONAL** | 1:1 human communication, social messages |
| **NEWSLETTER** | Marketing, promotions, subscribed content |
| **TRANSACTION** | Orders, receipts, payments, confirmations |
| **ALERT** | Security notices, important notifications |
| **SOCIAL** | Social network notifications, community updates |

### Output Format

A single forward pass produces two tensors:

- `category_probs`: Float32[5] — softmax probabilities per category (argmax = predicted category)
- `action_prob`: Float32[1] — sigmoid probability that action is required (threshold 0.5)

No text generation, no decoder, no beam search.

**Example:**

```
Input: "Subject: Your order has shipped\n\nBody: Your order #12345 is on its way..."
Output: category_probs → TRANSACTION (0.94), action_prob → 0.12 (NO_ACTION)
```

## Intended Use

- **Primary:** On-device email triage in mobile apps (iOS/Android)
- **Runtime:** ONNX Runtime React Native
- **Use cases:** Inbox prioritization, filtering noise, surfacing actionable emails

## Model Details

| Attribute | Value |
|-----------|-------|
| Base Model | `microsoft/Multilingual-MiniLM-L12-H384` |
| Parameters | ~117M |
| Architecture | XLM-R encoder + dual classification heads |
| ONNX Size | 113 MB (INT8 quantized) |
| Max Sequence | 256 tokens |
| Tokenizer | SentencePiece BPE (250K vocab) |

## Performance

| Metric | Score |
|--------|-------|
| Category Accuracy | 92.0% |
| Action Accuracy | 82.8% |
| Quantization | INT8 dynamic (4x compression) |

## Training Data

- **Source:** Personal Gmail inboxes (anonymized)
- **Languages:** English, French
- **Labeling:** Human-annotated with category + action flag
- **Input format:** `Subject: ...\n\nBody: ...` (no instruction prefix)

## How to Use

### ONNX Runtime (React Native)

```typescript
import { InferenceSession, Tensor } from 'onnxruntime-react-native';

const session = await InferenceSession.create('model.onnx');

const outputs = await session.run({
  input_ids: inputIdsTensor,
  attention_mask: attentionMaskTensor,
  token_type_ids: tokenTypeIdsTensor, // all zeros
});

const categoryProbs = outputs.category_probs.data; // Float32[5]
const actionProb = outputs.action_prob.data[0];    // Float32
```

### Python (PyTorch)

```python
from transformers import AutoTokenizer, AutoModel
import torch

# Load the base encoder + trained heads
tokenizer = AutoTokenizer.from_pretrained("Ippoboi/minilmail-classifier")
# Load DualHeadClassifier from checkpoint (see train_classifier.py)

text = "Subject: Meeting tomorrow\n\nBody: Can we reschedule to 3pm?"
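# The DualHeadClassifier module itself is not published as a transformers
# class; the following is a minimal sketch of it (assumption: two nn.Linear
# heads over the [CLS] hidden state, matching the Architecture diagram below;
# exact layer names may differ from train_classifier.py):
import torch.nn as nn

class DualHeadClassifier(nn.Module):
    def __init__(self, encoder, hidden_size=384):
        super().__init__()
        self.encoder = encoder
        self.category_head = nn.Linear(hidden_size, 5)  # 5 email categories
        self.action_head = nn.Linear(hidden_size, 1)    # action-required flag

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] token embedding
        return self.category_head(cls), self.action_head(cls)

# e.g. model = DualHeadClassifier(AutoModel.from_pretrained(
#          "microsoft/Multilingual-MiniLM-L12-H384")).eval()
# and then restore the trained head weights from the checkpoint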
inputs = tokenizer(text, return_tensors="pt", max_length=256, truncation=True)
with torch.no_grad():
    cat_logits, act_logits = model(inputs["input_ids"], inputs["attention_mask"])

category = ["ALERT", "NEWSLETTER", "PERSONAL", "SOCIAL", "TRANSACTION"][cat_logits.argmax()]
action = torch.sigmoid(act_logits).item() > 0.5
```

## Files

| File | Size | Description |
|------|------|-------------|
| `model.onnx` | 113 MB | INT8 quantized ONNX model |
| `tokenizer.json` | 17 MB | SentencePiece BPE tokenizer (XLM-R vocab) |

## Architecture

```
Input → XLM-R Encoder (12 layers, 384 hidden) → [CLS] token
                        ↓
                  ┌─────┴─────┐
                  ↓           ↓
          Category Head   Action Head
          Linear(384→5)   Linear(384→1)
                  ↓           ↓
              softmax      sigmoid
```

## Compared to Previous Model (FLAN-T5)

| | FLAN-T5 (v2) | MiniLM (this) |
|---|---|---|
| Download size | 376 MB (3 files) | 130 MB (2 files) |
| Inference | Encoder + decoder beam search | Single forward pass |
| Output | Generated text | Probabilities |
| Summaries | Yes | No (uses Gmail snippets) |
| Latency | ~300ms+ (multiple decoder calls) | ~30-50ms (single call) |

## Limitations

- Trained primarily on English/French emails
- May not generalize well to enterprise/corporate email patterns
- Classification accuracy depends on email content quality (plain text preferred over HTML-heavy messages)
- The 250K-vocab tokenizer is oversized for this use case (XLM-R covers 100+ languages)

## License

Apache 2.0
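For reference, the post-processing implied by the Output Format section (softmax over the 5 category logits, sigmoid with a 0.5 threshold on the action logit) can be sketched in NumPy. The `postprocess` helper and its example inputs are illustrative, not part of the released model; the label order matches the Python example above.

```python
import numpy as np

LABELS = ["ALERT", "NEWSLETTER", "PERSONAL", "SOCIAL", "TRANSACTION"]

def postprocess(cat_logits, act_logit, threshold=0.5):
    """Map raw head outputs to (category, action_required, category_probs)."""
    z = np.exp(cat_logits - cat_logits.max())       # numerically stable softmax
    category_probs = z / z.sum()
    action_prob = 1.0 / (1.0 + np.exp(-act_logit))  # sigmoid
    return LABELS[int(category_probs.argmax())], action_prob > threshold, category_probs

category, action, probs = postprocess(np.array([0.1, 0.2, 0.3, 0.1, 2.5]), -2.0)
# category == "TRANSACTION"; action is False, since sigmoid(-2.0) ≈ 0.12 < 0.5
```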