---
base_model:
- microsoft/Multilingual-MiniLM-L12-H384
license: apache-2.0
language:
- en
- fr
pipeline_tag: text-classification
inference: false
tags:
- classification
- emails
- onnx
- mobile
- int8
widget:
- text: "Subject: Your order has shipped\n\nBody: Your order #12345 is on its way and will arrive by Monday."
  example_title: Transaction
- text: "Subject: Meeting tomorrow\n\nBody: Hey, can we reschedule our 2pm meeting to 3pm? Let me know."
  example_title: Personal
- text: "Subject: Weekly Newsletter\n\nBody: Check out our latest deals! 50% off everything this weekend."
  example_title: Newsletter
- text: "Subject: Security Alert\n\nBody: A new device logged into your account from San Francisco, CA."
  example_title: Alert
---

# Email Classifier (MiniLM ONNX)

A dual-head MiniLM classifier for email category + action prediction, optimized for on-device inference using ONNX Runtime.

## Model Description

Classifies emails into 5 categories and predicts whether action is required:

| Category | Description |
|----------|-------------|
| **PERSONAL** | 1:1 human communication, social messages |
| **NEWSLETTER** | Marketing, promotions, subscribed content |
| **TRANSACTION** | Orders, receipts, payments, confirmations |
| **ALERT** | Security notices, important notifications |
| **SOCIAL** | Social network notifications, community updates |

### Output Format

Single forward pass producing two tensors:
- `category_probs`: Float32[5] → softmax probabilities per category (argmax = predicted category)
- `action_prob`: Float32[1] → sigmoid probability of action required (threshold 0.5)

No text generation, no decoder, no beam search.

**Example:**

```
Input: "Subject: Your order has shipped\n\nBody: Your order #12345 is on its way..."
Output: category_probs → TRANSACTION (0.94), action_prob → 0.12 (NO_ACTION)
```
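
In application code, the two output tensors reduce to a label plus a boolean. A minimal sketch (the label order here is an assumption, taken from the alphabetical list in the PyTorch snippet in this card — verify it against the training script):

```python
# Map the two output tensors to a decision. Label order is an assumption;
# confirm it matches the order used during training.
CATEGORIES = ["ALERT", "NEWSLETTER", "PERSONAL", "SOCIAL", "TRANSACTION"]

def interpret(category_probs, action_prob, threshold=0.5):
    """Argmax over the 5 category probabilities; 0.5 threshold on action."""
    idx = max(range(len(category_probs)), key=lambda i: category_probs[i])
    return CATEGORIES[idx], action_prob > threshold

# The shipped-order example above:
label, needs_action = interpret([0.01, 0.02, 0.02, 0.01, 0.94], 0.12)
# label == "TRANSACTION", needs_action == False
```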
|
|
## Intended Use

- **Primary:** On-device email triage in mobile apps (iOS/Android)
- **Runtime:** ONNX Runtime React Native
- **Use case:** Prioritizing inbox, filtering noise, surfacing actionable emails

## Model Details

| Attribute | Value |
|-----------|-------|
| Base Model | `microsoft/Multilingual-MiniLM-L12-H384` |
| Parameters | ~117M |
| Architecture | XLM-R encoder + dual classification heads |
| ONNX Size | 113 MB (INT8 quantized) |
| Max Sequence | 256 tokens |
| Tokenizer | SentencePiece (250K vocab) |

## Performance

| Metric | Score |
|--------|-------|
| Category Accuracy | 92.0% |
| Action Accuracy | 82.8% |
| Quantization | INT8 dynamic (4x compression) |
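
The ~4x compression follows from storing each weight tensor as int8 values (1 byte) plus a float scale, instead of float32 (4 bytes). A conceptual sketch of symmetric per-tensor quantization — not the actual ONNX Runtime tooling, just the arithmetic it applies to weights:

```python
# Conceptual sketch: map weights to [-127, 127] with a single float scale.
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    return [x * scale for x in q]

q, s = quantize_int8([0.5, -1.27, 0.003])
# q == [50, -127, 0]; dequantize(q, s) approximates the original weights
```

Dynamic quantization applies this to weights ahead of time and computes activation scales on the fly at inference.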
|
|
## Training Data

- **Source:** Personal Gmail inboxes (anonymized)
- **Languages:** English, French
- **Labeling:** Human-annotated with category + action flag
- **Input format:** `Subject: ...\n\nBody: ...` (no instruction prefix)
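
Inference inputs should be assembled the same way. A trivial helper (the function name is illustrative):

```python
# Reproduce the training-time input format: "Subject: ...\n\nBody: ..."
# with no instruction prefix.
def format_email(subject: str, body: str) -> str:
    return f"Subject: {subject}\n\nBody: {body}"

text = format_email("Meeting tomorrow", "Can we reschedule to 3pm?")
# "Subject: Meeting tomorrow\n\nBody: Can we reschedule to 3pm?"
```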
|
|
## How to Use

### ONNX Runtime (React Native)

```typescript
import { InferenceSession, Tensor } from 'onnxruntime-react-native';

const session = await InferenceSession.create('model.onnx');

const outputs = await session.run({
  input_ids: inputIdsTensor,
  attention_mask: attentionMaskTensor,
  token_type_ids: tokenTypeIdsTensor, // all zeros
});

const categoryProbs = outputs.category_probs.data; // Float32[5]
const actionProb = outputs.action_prob.data[0];    // Float32
```

### Python (PyTorch)

```python
from transformers import AutoTokenizer
import torch

# Load the tokenizer; the trained heads come from the checkpoint
tokenizer = AutoTokenizer.from_pretrained("Ippoboi/minilmail-classifier")
model = ...  # DualHeadClassifier loaded from checkpoint (see train_classifier.py)
model.eval()

text = "Subject: Meeting tomorrow\n\nBody: Can we reschedule to 3pm?"
inputs = tokenizer(text, return_tensors="pt", max_length=256, truncation=True)

with torch.no_grad():
    cat_logits, act_logits = model(inputs["input_ids"], inputs["attention_mask"])
category = ["ALERT", "NEWSLETTER", "PERSONAL", "SOCIAL", "TRANSACTION"][cat_logits.argmax().item()]
action = torch.sigmoid(act_logits).item() > 0.5
```

## Files

| File | Size | Description |
|------|------|-------------|
| `model.onnx` | 113 MB | INT8 quantized ONNX model |
| `tokenizer.json` | 17 MB | SentencePiece tokenizer (XLM-R vocab) |

## Architecture

```
Input → XLM-R Encoder (12 layers, 384 hidden) → [CLS] token
                         │
                  ┌──────┴──────┐
                  │             │
           Category Head    Action Head
           Linear(384→5)   Linear(384→1)
                  │             │
               softmax       sigmoid
```
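
The two heads are plain linear layers over the encoder's [CLS] vector. A minimal sketch of the head module only (the real `DualHeadClassifier` in `train_classifier.py` also wraps the XLM-R encoder; names and details here are assumptions):

```python
import torch
import torch.nn as nn

class DualHeads(nn.Module):
    """Sketch of the dual classification heads over the 384-d [CLS] vector.
    Outputs raw logits; softmax/sigmoid are applied downstream (or baked
    into the ONNX graph, as the output names category_probs/action_prob
    suggest)."""
    def __init__(self, hidden_size: int = 384, num_categories: int = 5):
        super().__init__()
        self.category_head = nn.Linear(hidden_size, num_categories)
        self.action_head = nn.Linear(hidden_size, 1)

    def forward(self, cls_vec: torch.Tensor):
        return self.category_head(cls_vec), self.action_head(cls_vec)

heads = DualHeads()
cat_logits, act_logits = heads(torch.randn(1, 384))
# cat_logits: shape [1, 5], act_logits: shape [1, 1]
```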
|
|
## Compared to Previous Model (FLAN-T5)

| | FLAN-T5 (v2) | MiniLM (this) |
|---|---|---|
| Download size | 376 MB (3 files) | 130 MB (2 files) |
| Inference | Encoder + decoder beam search | Single forward pass |
| Output | Generated text | Probabilities |
| Summaries | Yes | No (uses Gmail snippets) |
| Latency | ~300ms+ (multiple decoder calls) | ~30-50ms (single call) |

## Limitations

- Trained primarily on English/French emails
- May not generalize well to enterprise/corporate email patterns
- Classification accuracy depends on email content quality (plain text preferred over HTML-heavy)
- 250K vocab tokenizer is oversized for this use case (XLM-R covers 100+ languages)

## License

Apache 2.0
|
|