---
base_model:
- microsoft/Multilingual-MiniLM-L12-H384
license: apache-2.0
language:
- en
- fr
pipeline_tag: text-classification
inference: false
tags:
- classification
- emails
- onnx
- mobile
- int8
widget:
- text: "Subject: Your order has shipped\n\nBody: Your order #12345 is on its way and will arrive by Monday."
example_title: Transaction
- text: "Subject: Meeting tomorrow\n\nBody: Hey, can we reschedule our 2pm meeting to 3pm? Let me know."
example_title: Personal
- text: "Subject: Weekly Newsletter\n\nBody: Check out our latest deals! 50% off everything this weekend."
example_title: Newsletter
- text: "Subject: Security Alert\n\nBody: A new device logged into your account from San Francisco, CA."
example_title: Alert
---
# Email Classifier (MiniLM ONNX)
A dual-head MiniLM classifier for email category + action prediction, optimized for on-device inference using ONNX Runtime.
## Model Description
Classifies emails into 5 categories and predicts whether action is required:
| Category | Description |
|----------|-------------|
| **PERSONAL** | 1:1 human communication, social messages |
| **NEWSLETTER** | Marketing, promotions, subscribed content |
| **TRANSACTION** | Orders, receipts, payments, confirmations |
| **ALERT** | Security notices, important notifications |
| **SOCIAL** | Social network notifications, community updates |
### Output Format
A single forward pass produces two tensors:
- `category_probs`: Float32[5] β€” softmax probabilities per category (argmax = predicted category)
- `action_prob`: Float32[1] β€” sigmoid probability of action required (threshold 0.5)
No text generation, no decoder, no beam search.
**Example:**
```
Input: "Subject: Your order has shipped\n\nBody: Your order #12345 is on its way..."
Output: category_probs β†’ TRANSACTION (0.94), action_prob β†’ 0.12 (NO_ACTION)
```
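Decoding the two tensors takes only an argmax and the 0.5 threshold. A minimal sketch in plain Python, assuming `category_probs` and `action_prob` have already been read out of the model outputs (the label order matches the Python example later in this card):

```python
# Map the model's two output tensors to a category label and an action flag.
LABELS = ["ALERT", "NEWSLETTER", "PERSONAL", "SOCIAL", "TRANSACTION"]

def decode(category_probs, action_prob, threshold=0.5):
    """category_probs: 5 softmax values; action_prob: single sigmoid value."""
    idx = max(range(len(category_probs)), key=lambda i: category_probs[i])
    return LABELS[idx], action_prob >= threshold

# The TRANSACTION example above: high category confidence, low action probability.
label, action = decode([0.01, 0.02, 0.02, 0.01, 0.94], 0.12)
# label == "TRANSACTION", action == False
```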
## Intended Use
- **Primary:** On-device email triage in mobile apps (iOS/Android)
- **Runtime:** ONNX Runtime React Native
- **Use case:** Prioritizing inbox, filtering noise, surfacing actionable emails
## Model Details
| Attribute | Value |
|-----------|-------|
| Base Model | `microsoft/Multilingual-MiniLM-L12-H384` |
| Parameters | ~117M |
| Architecture | XLM-R encoder + dual classification heads |
| ONNX Size | 113 MB (INT8 quantized) |
| Max Sequence | 256 tokens |
| Tokenizer | SentencePiece (XLM-R, 250K vocab) |
## Performance
| Metric | Score |
|--------|-------|
| Category Accuracy | 92.0% |
| Action Accuracy | 82.8% |
| Quantization | INT8 dynamic (4x compression) |
## Training Data
- **Source:** Personal Gmail inboxes (anonymized)
- **Languages:** English, French
- **Labeling:** Human-annotated with category + action flag
- **Input format:** `Subject: ...\n\nBody: ...` (no instruction prefix)
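The training input format above can be reproduced with a small helper (a sketch; `format_email` is a hypothetical name for illustration, not part of the released code):

```python
def format_email(subject: str, body: str) -> str:
    """Build the `Subject: ...\n\nBody: ...` string the classifier was trained on."""
    return f"Subject: {subject}\n\nBody: {body}"

text = format_email("Your order has shipped",
                    "Your order #12345 is on its way and will arrive by Monday.")
```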
## How to Use
### ONNX Runtime (React Native)
```typescript
import { InferenceSession, Tensor } from 'onnxruntime-react-native';

const session = await InferenceSession.create('model.onnx');

// input_ids / attention_mask / token_type_ids come from the XLM-R tokenizer
// (int64 tensors of shape [1, seqLen], max 256 tokens)
const outputs = await session.run({
  input_ids: inputIdsTensor,
  attention_mask: attentionMaskTensor,
  token_type_ids: tokenTypeIdsTensor, // all zeros
});

const categoryProbs = outputs.category_probs.data; // Float32[5]
const actionProb = outputs.action_prob.data[0];    // Float32
```
### Python (PyTorch)
```python
from transformers import AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("Ippoboi/minilmail-classifier")
# Load DualHeadClassifier (base encoder + trained heads) from the checkpoint
# (see train_classifier.py); `model` below refers to that instance.
model.eval()

text = "Subject: Meeting tomorrow\n\nBody: Can we reschedule to 3pm?"
inputs = tokenizer(text, return_tensors="pt", max_length=256, truncation=True)
with torch.no_grad():
    cat_logits, act_logits = model(inputs["input_ids"], inputs["attention_mask"])

labels = ["ALERT", "NEWSLETTER", "PERSONAL", "SOCIAL", "TRANSACTION"]
category = labels[cat_logits.argmax().item()]
action = torch.sigmoid(act_logits).item() > 0.5
```
## Files
| File | Size | Description |
|------|------|-------------|
| `model.onnx` | 113 MB | INT8 quantized ONNX model |
| `tokenizer.json` | 17 MB | SentencePiece tokenizer (XLM-R 250K vocab) |
## Architecture
```
Input β†’ XLM-R Encoder (12 layers, 384 hidden) β†’ [CLS] token
↓
β”Œβ”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”
↓ ↓
Category Head Action Head
Linear(384β†’5) Linear(384β†’1)
↓ ↓
softmax sigmoid
```
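The diagram corresponds to a small PyTorch module along these lines (a sketch, not the released `DualHeadClassifier` from train_classifier.py; the hidden size and head shapes follow the diagram, everything else is an assumption):

```python
import torch
import torch.nn as nn

class DualHeadSketch(nn.Module):
    """[CLS] pooling plus two linear heads, matching the diagram above."""
    def __init__(self, encoder, hidden=384, n_categories=5):
        super().__init__()
        self.encoder = encoder  # e.g. the MiniLM (XLM-R-style) encoder
        self.category_head = nn.Linear(hidden, n_categories)
        self.action_head = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] token embedding
        # Raw logits; apply softmax / sigmoid downstream as in the diagram
        return self.category_head(cls), self.action_head(cls)
```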
## Compared to Previous Model (FLAN-T5)
| | FLAN-T5 (v2) | MiniLM (this) |
|---|---|---|
| Download size | 376 MB (3 files) | 130 MB (2 files) |
| Inference | Encoder + decoder beam search | Single forward pass |
| Output | Generated text | Probabilities |
| Summaries | Yes | No (uses Gmail snippets) |
| Latency | ~300ms+ (multiple decoder calls) | ~30-50ms (single call) |
## Limitations
- Trained primarily on English/French emails
- May not generalize well to enterprise/corporate email patterns
- Classification accuracy depends on email content quality (plain text preferred over HTML-heavy)
- 250K vocab tokenizer is oversized for this use case (XLM-R covers 100+ languages)
## License
Apache 2.0