---
license: mit
language:
- en
- de
- multilingual
library_name: transformers
pipeline_tag: text-classification
tags:
- email
- classification
- multi-label
- onnx
- int8
- priority
base_model: microsoft/Multilingual-MiniLM-L12-H384
model-index:
- name: franz-email-classifier
  results: []
---
# Franz Email Classifier
Multi-label email classification model used by [Franz](https://meetfranz.com) to automatically prioritize emails.
Fine-tuned from [`microsoft/Multilingual-MiniLM-L12-H384`](https://huggingface.co/microsoft/Multilingual-MiniLM-L12-H384) and exported as **ONNX INT8** for fast CPU inference in Electron.
## Labels
The model predicts 8 binary labels per email:
| Label | Meaning |
|---|---|
| `IS_URGENT` | Needs attention today |
| `NEEDS_REPLY` | Direct question or action request to the user |
| `HAS_DEADLINE` | Explicit or relative deadline mentioned |
| `IS_ACTIONABLE` | Any action required (broader than NEEDS_REPLY) |
| `IS_INFORMATIONAL` | FYI / status update, no action needed |
| `IS_AUTOMATED` | Machine-generated (CI/CD, monitoring, alerts) |
| `IS_NEWSLETTER` | Content marketing / newsletter |
| `IS_TRANSACTIONAL` | Receipt, invoice, order confirmation |
## Priority Mapping
Labels are combined into priority tiers in the Franz app:
| Condition | Priority |
|---|---|
| IS_URGENT + NEEDS_REPLY | `urgent` |
| IS_URGENT | `important` |
| NEEDS_REPLY + IS_ACTIONABLE | `important` |
| IS_NEWSLETTER or IS_TRANSACTIONAL | `noise` |
| IS_AUTOMATED (not urgent) | `noise` |
| IS_INFORMATIONAL (not urgent/reply) | `low` |
| Everything else | `normal` |
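The table above can be read as an ordered rule list. The following is a minimal sketch of that reading, assuming rules are evaluated top to bottom and the first match wins (the table does not state the order explicitly). `priority_for` is a hypothetical helper name, not part of the Franz codebase.

```python
def priority_for(labels: set[str]) -> str:
    """Map the set of active label names to a priority tier.

    Assumption: rules are checked in table order and the first match wins.
    """
    urgent = "IS_URGENT" in labels
    reply = "NEEDS_REPLY" in labels

    if urgent and reply:
        return "urgent"
    if urgent:
        return "important"
    if reply and "IS_ACTIONABLE" in labels:
        return "important"
    if "IS_NEWSLETTER" in labels or "IS_TRANSACTIONAL" in labels:
        return "noise"
    if "IS_AUTOMATED" in labels:  # urgent cases were already handled above
        return "noise"
    if "IS_INFORMATIONAL" in labels and not reply:
        return "low"
    return "normal"


print(priority_for({"IS_URGENT", "HAS_DEADLINE"}))  # important
```

Under this ordering, an urgent newsletter would be `important` rather than `noise`. Whether the app resolves such overlaps the same way is an assumption.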
## Usage
### With @huggingface/transformers (Node.js / Electron)
```ts
import { pipeline, env } from '@huggingface/transformers'

// Load the model from a local directory instead of the Hub
env.allowLocalModels = true
env.localModelPath = '/path/to/models' // parent dir

const classifier = await pipeline(
  'text-classification',
  'email-classifier', // subdirectory name
  { dtype: 'int8', device: 'cpu', multi_label: true }
)

// top_k: null returns a score for every label instead of only the top one
const result = await classifier('Re: Urgent: Invoice #4521 due Friday', { top_k: null })
// [
//   { label: 'IS_URGENT', score: 0.94 },
//   { label: 'NEEDS_REPLY', score: 0.12 },
//   { label: 'HAS_DEADLINE', score: 0.91 },
//   ...
// ]
```
### With transformers (Python)
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="meetfranz/franz-models",
    top_k=None,  # return scores for all labels, not just the top one
)
result = classifier("Re: Urgent: Invoice #4521 due Friday")
```
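Because the labels are independent, each score has to be thresholded on its own to get binary decisions. Here is a minimal sketch, assuming the result is a flat list of `{label, score}` dicts like the one shown in the Node.js example. The `active_labels` helper and the 0.5 cutoff are illustrative assumptions, not values taken from the Franz app.

```python
def active_labels(scores: list[dict], threshold: float = 0.5) -> set[str]:
    """Return the names of all labels whose score reaches the threshold.

    The default threshold of 0.5 is an assumption for illustration.
    """
    return {item["label"] for item in scores if item["score"] >= threshold}


# Illustrative scores copied from the example output above
example = [
    {"label": "IS_URGENT", "score": 0.94},
    {"label": "NEEDS_REPLY", "score": 0.12},
    {"label": "HAS_DEADLINE", "score": 0.91},
]
print(sorted(active_labels(example)))  # ['HAS_DEADLINE', 'IS_URGENT']
```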
## Model Details
| Property | Value |
|---|---|
| Architecture | BertForSequenceClassification |
| Base model | microsoft/Multilingual-MiniLM-L12-H384 |
| Hidden size | 384 |
| Layers | 12 |
| Attention heads | 12 |
| Max sequence length | 512 |
| Vocab size | 250,037 |
| Tokenizer | XLMRobertaTokenizer (SentencePiece Unigram) |
| Problem type | Multi-label classification |
| Quantization | ONNX INT8 |
| Model size | ~113 MB (quantized) |
## Training
Trained on LLM-generated synthetic email data. No real user emails or personal data were used in training. Labels were bootstrapped via LLM annotation and human-reviewed for quality.
Fine-tuned with multi-label BCE loss, then exported to ONNX with INT8 dynamic quantization.
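To illustrate what multi-label BCE means here: each of the 8 logits passes through its own sigmoid and is scored against its own 0/1 target, and the per-label losses are averaged. The sketch below is plain Python for illustration only. It is not the training code, and the function names are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def multilabel_bce(logits: list[float], targets: list[float]) -> float:
    """Mean binary cross-entropy over independent labels.

    Not numerically hardened: very large logits can push log() towards log(0).
    Training frameworks use a fused, stable formulation instead.
    """
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(logits)


# One email with two positive labels out of eight
targets = [1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
good = [3.0, -3.0, 3.0, -3.0, -3.0, -3.0, -3.0, -3.0]
bad = [-3.0, 3.0, -3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
print(multilabel_bce(good, targets) < multilabel_bce(bad, targets))  # True
```

Unlike softmax cross-entropy, this loss lets any number of labels be active at once, which is why an email can be both `IS_URGENT` and `HAS_DEADLINE`.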
## How Franz Uses This Model
This model is **Stage 2** in Franz's three-stage email classification funnel:
1. **Stage 1 — Heuristics**: Fast rules-based classification for obvious cases
2. **Stage 2 — ML (this model)**: ONNX inference for ambiguous emails (confidence threshold: 0.75)
3. **Stage 3 — LLM**: Local or cloud LLM for emails below the ML confidence threshold
The model is downloaded on demand when a user first adds an email account to Franz. If unavailable, the app gracefully falls through to Stage 3.
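The funnel and its fallback can be sketched as a small routing function. Everything here is a hypothetical illustration: the stage callables, the `classify` name, and especially `ml_confidence`, since this card does not define how confidence is computed for a multi-label output. The sketch assumes confidence is the certainty of the least certain label. Only the 0.75 threshold comes from the description above.

```python
from typing import Callable, Optional

Scores = dict[str, float]

ML_CONFIDENCE_THRESHOLD = 0.75  # threshold stated in the funnel description


def ml_confidence(scores: Scores) -> float:
    """Assumed measure: certainty of the least certain label, as max(p, 1 - p)."""
    return min(max(p, 1.0 - p) for p in scores.values())


def classify(
    email: str,
    heuristics: Callable[[str], Optional[Scores]],
    model: Optional[Callable[[str], Scores]],  # None if the model is unavailable
    llm: Callable[[str], Scores],
) -> tuple[str, Scores]:
    """Return (stage_name, scores) from the first stage that can decide."""
    # Stage 1: rules handle the obvious cases
    rule_result = heuristics(email)
    if rule_result is not None:
        return "heuristics", rule_result

    # Stage 2: accept the ML result only if it is confident enough
    if model is not None:
        scores = model(email)
        if ml_confidence(scores) >= ML_CONFIDENCE_THRESHOLD:
            return "ml", scores

    # Stage 3: LLM handles low-confidence emails and the no-model case
    return "llm", llm(email)


# Usage with stub stages
stage, _ = classify(
    "Re: Urgent: Invoice #4521 due Friday",
    heuristics=lambda e: None,
    model=lambda e: {"IS_URGENT": 0.94, "NEEDS_REPLY": 0.12},
    llm=lambda e: {"IS_URGENT": 1.0},
)
print(stage)  # ml
```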
## License
MIT