---
base_model:
- microsoft/Multilingual-MiniLM-L12-H384
license: apache-2.0
language:
- en
- fr
pipeline_tag: text-classification
inference: false
tags:
- classification
- emails
- onnx
- mobile
- int8
widget:
- text: "Subject: Your order has shipped\n\nBody: Your order #12345 is on its way and will arrive by Monday."
  example_title: Transaction
- text: "Subject: Meeting tomorrow\n\nBody: Hey, can we reschedule our 2pm meeting to 3pm? Let me know."
  example_title: Personal
- text: "Subject: Weekly Newsletter\n\nBody: Check out our latest deals! 50% off everything this weekend."
  example_title: Newsletter
- text: "Subject: Security Alert\n\nBody: A new device logged into your account from San Francisco, CA."
  example_title: Alert
---

# Email Classifier (MiniLM ONNX)

A dual-head MiniLM classifier for email category + action prediction, optimized for on-device inference using ONNX Runtime.

## Model Description

Classifies emails into 5 categories and predicts whether action is required:

| Category | Description |
|----------|-------------|
| **PERSONAL** | 1:1 human communication, social messages |
| **NEWSLETTER** | Marketing, promotions, subscribed content |
| **TRANSACTION** | Orders, receipts, payments, confirmations |
| **ALERT** | Security notices, important notifications |
| **SOCIAL** | Social network notifications, community updates |

### Output Format

A single forward pass produces two tensors:

- `category_probs`: Float32[5] — softmax probabilities per category (argmax = predicted category)
- `action_prob`: Float32[1] — sigmoid probability that action is required (threshold 0.5)

No text generation, no decoder, no beam search.

**Example:**

```
Input: "Subject: Your order has shipped\n\nBody: Your order #12345 is on its way..."
Output: category_probs → TRANSACTION (0.94), action_prob → 0.12 (NO_ACTION)
```

## Intended Use

- **Primary:** On-device email triage in mobile apps (iOS/Android)
- **Runtime:** ONNX Runtime React Native
- **Use cases:** Inbox prioritization, filtering noise, surfacing actionable emails

## Model Details

| Attribute | Value |
|-----------|-------|
| Base Model | `microsoft/Multilingual-MiniLM-L12-H384` |
| Parameters | ~117M |
| Architecture | XLM-R encoder + dual classification heads |
| ONNX Size | 113 MB (INT8 quantized) |
| Max Sequence | 256 tokens |
| Tokenizer | SentencePiece BPE (250K vocab) |

## Performance

| Metric | Score |
|--------|-------|
| Category Accuracy | 92.0% |
| Action Accuracy | 82.8% |
| Quantization | INT8 dynamic (4x compression) |

## Training Data

- **Source:** Personal Gmail inboxes (anonymized)
- **Languages:** English, French
- **Labeling:** Human-annotated with category + action flag
- **Input format:** `Subject: ...\n\nBody: ...` (no instruction prefix)

## How to Use

### ONNX Runtime (React Native)

```typescript
import { InferenceSession, Tensor } from 'onnxruntime-react-native';

const session = await InferenceSession.create('model.onnx');

const outputs = await session.run({
  input_ids: inputIdsTensor,
  attention_mask: attentionMaskTensor,
  token_type_ids: tokenTypeIdsTensor, // all zeros
});

const categoryProbs = outputs.category_probs.data; // Float32[5]
const actionProb = outputs.action_prob.data[0];    // Float32
```

### Python (PyTorch)

```python
from transformers import AutoTokenizer, AutoModel
import torch

# Load the base encoder + trained heads
tokenizer = AutoTokenizer.from_pretrained("Ippoboi/minilmail-classifier")
# Load DualHeadClassifier from checkpoint (see train_classifier.py)

text = "Subject: Meeting tomorrow\n\nBody: Can we reschedule to 3pm?"
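# The DualHeadClassifier module itself is not published as a transformers
# class; the following is a minimal sketch of it (assumption: two nn.Linear
# heads over the [CLS] hidden state, matching the Architecture diagram below;
# exact layer names may differ from train_classifier.py):
import torch.nn as nn

class DualHeadClassifier(nn.Module):
    def __init__(self, encoder, hidden_size=384):
        super().__init__()
        self.encoder = encoder
        self.category_head = nn.Linear(hidden_size, 5)  # 5 email categories
        self.action_head = nn.Linear(hidden_size, 1)    # action-required flag

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] token embedding
        return self.category_head(cls), self.action_head(cls)

# e.g. model = DualHeadClassifier(AutoModel.from_pretrained(
#          "microsoft/Multilingual-MiniLM-L12-H384")).eval()
# and then restore the trained head weights from the checkpoint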
inputs = tokenizer(text, return_tensors="pt", max_length=256, truncation=True)
with torch.no_grad():
    cat_logits, act_logits = model(inputs["input_ids"], inputs["attention_mask"])

category = ["ALERT", "NEWSLETTER", "PERSONAL", "SOCIAL", "TRANSACTION"][cat_logits.argmax()]
action = torch.sigmoid(act_logits).item() > 0.5
```

## Files

| File | Size | Description |
|------|------|-------------|
| `model.onnx` | 113 MB | INT8 quantized ONNX model |
| `tokenizer.json` | 17 MB | SentencePiece BPE tokenizer (XLM-R vocab) |

## Architecture

```
Input → XLM-R Encoder (12 layers, 384 hidden) → [CLS] token
                        ↓
                  ┌─────┴─────┐
                  ↓           ↓
          Category Head   Action Head
          Linear(384→5)   Linear(384→1)
                  ↓           ↓
              softmax      sigmoid
```

## Compared to Previous Model (FLAN-T5)

| | FLAN-T5 (v2) | MiniLM (this) |
|---|---|---|
| Download size | 376 MB (3 files) | 130 MB (2 files) |
| Inference | Encoder + decoder beam search | Single forward pass |
| Output | Generated text | Probabilities |
| Summaries | Yes | No (uses Gmail snippets) |
| Latency | ~300ms+ (multiple decoder calls) | ~30-50ms (single call) |

## Limitations

- Trained primarily on English/French emails
- May not generalize well to enterprise/corporate email patterns
- Classification accuracy depends on email content quality (plain text preferred over HTML-heavy messages)
- The 250K-vocab tokenizer is oversized for this use case (XLM-R covers 100+ languages)

## License

Apache 2.0
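For reference, the post-processing implied by the Output Format section (softmax over the 5 category logits, sigmoid with a 0.5 threshold on the action logit) can be sketched in NumPy. The `postprocess` helper and its example inputs are illustrative, not part of the released model; the label order matches the Python example above.

```python
import numpy as np

LABELS = ["ALERT", "NEWSLETTER", "PERSONAL", "SOCIAL", "TRANSACTION"]

def postprocess(cat_logits, act_logit, threshold=0.5):
    """Map raw head outputs to (category, action_required, category_probs)."""
    z = np.exp(cat_logits - cat_logits.max())       # numerically stable softmax
    category_probs = z / z.sum()
    action_prob = 1.0 / (1.0 + np.exp(-act_logit))  # sigmoid
    return LABELS[int(category_probs.argmax())], action_prob > threshold, category_probs

category, action, probs = postprocess(np.array([0.1, 0.2, 0.3, 0.1, 2.5]), -2.0)
# category == "TRANSACTION"; action is False, since sigmoid(-2.0) ≈ 0.12 < 0.5
```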