---
base_model:
- google/flan-t5-small
license: apache-2.0
language:
- en
- fr
tags:
- classification
- emails
- text2text-generation
- onnx
- mobile
---

# Gmail Email Classifier (FLAN-T5 ONNX)

A fine-tuned FLAN-T5-small model for email classification, optimized for on-device inference in mobile apps using ONNX Runtime.

## Model Description

This model classifies each email into one of five categories and flags whether it requires action:

| Category | Description |
|----------|-------------|
| **PERSONAL** | 1:1 human communication, social messages |
| **NEWSLETTER** | Marketing, promotions, subscribed content |
| **TRANSACTION** | Orders, receipts, payments, confirmations |
| **ALERT** | Security notices, important notifications |
| **SOCIAL** | Social network notifications, community updates |

### Output Format

The model emits a single line with three pipe-separated fields: the category, the action flag, and a brief summary.

```
CATEGORY | ACTION/NO_ACTION | Brief summary
```

**Example:**

```
Input: "Subject: Your order has shipped\n\nBody: Your order #12345 is on its way..."
Output: "TRANSACTION | NO_ACTION | Order shipment confirmation for #12345"
```

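Downstream code usually needs the three fields as typed values rather than one string. The following is a minimal parsing sketch in Python, assuming the model always follows the format above; the `Prediction` class and `parse_prediction` function are illustrative names, not part of the model repository.

```python
from dataclasses import dataclass

# The five category labels listed in the Model Description table.
CATEGORIES = {"PERSONAL", "NEWSLETTER", "TRANSACTION", "ALERT", "SOCIAL"}


@dataclass
class Prediction:
    category: str
    action_required: bool
    summary: str


def parse_prediction(text: str) -> Prediction:
    """Split 'CATEGORY | ACTION/NO_ACTION | summary' into typed fields."""
    # Split on the first two pipes only, so a pipe inside the summary survives.
    parts = [part.strip() for part in text.split("|", 2)]
    if len(parts) != 3:
        raise ValueError(f"expected 3 pipe-separated fields, got: {text!r}")
    category, action, summary = parts
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    if action not in ("ACTION", "NO_ACTION"):
        raise ValueError(f"unknown action flag: {action!r}")
    return Prediction(category, action == "ACTION", summary)
```

Raising `ValueError` on malformed output lets the caller decide how to handle a generation that drifts from the format, for example by treating the email as unclassified.
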
## Intended Use

- **Primary:** On-device email triage in mobile apps (iOS/Android)
- **Runtime:** ONNX Runtime React Native
- **Use case:** Prioritizing inbox, filtering noise, surfacing actionable emails

## Model Details

| Attribute | Value |
|-----------|-------|
| Base Model | `google/flan-t5-small` |
| Parameters | ~80M |
| Architecture | T5 Encoder-Decoder |
| ONNX Size | 373 MB (encoder: 141 MB, decoder: 232 MB) |
| Latency | ~79 ms (iPhone, CPU) |
| Max Sequence | 512 tokens |

## Training Data

- **Size:** 2,043 training / 256 validation / 255 test examples
- **Source:** Personal Gmail inboxes (anonymized)
- **Languages:** English, French
- **Labeling:** Human-annotated with category + action flag

## How to Use

### ONNX Runtime (React Native)

```typescript
import { InferenceSession } from 'onnxruntime-react-native';

// Load both halves of the encoder-decoder model.
// The paths must resolve to the .onnx files on the device.
const encoder = await InferenceSession.create('encoder_model.onnx');
const decoder = await InferenceSession.create('decoder_model.onnx');

// Next steps (not shown): tokenize the input, run the encoder once,
// then greedy-decode with the decoder until it emits the end-of-sequence token.
```

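The greedy decoding step mentioned in the snippet above is a short loop: start from the decoder start token, pick the highest-scoring next token at each step, and stop at the end-of-sequence token. Below is a minimal, runtime-independent sketch of that loop, written in Python for brevity. `decoder_step` is a hypothetical callable standing in for a decoder session run, and the 64-token cap is an assumption; the token ids follow the standard T5 convention (start from pad id 0, stop at `</s>` id 1).

```python
from typing import Callable, List, Sequence

# T5 convention: the decoder starts from the pad token (id 0) and stops at </s> (id 1).
DECODER_START_TOKEN_ID = 0
EOS_TOKEN_ID = 1


def greedy_decode(
    decoder_step: Callable[[List[int]], Sequence[float]],
    max_new_tokens: int = 64,  # assumed cap, not taken from the model card
) -> List[int]:
    """Generate token ids by repeatedly taking the highest-scoring next token.

    decoder_step is a hypothetical callable: given the ids generated so far, it
    returns the vocabulary logits for the next position. In an app it would wrap
    a decoder session run that also receives the encoder output.
    """
    generated = [DECODER_START_TOKEN_ID]
    for _ in range(max_new_tokens):
        logits = decoder_step(generated)
        # Argmax over the vocabulary; ties resolve to the lowest id.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        if next_id == EOS_TOKEN_ID:
            break
        generated.append(next_id)
    return generated[1:]  # drop the start token before detokenizing
```

The same control flow ports directly to TypeScript: the encoder runs once per email, and only the decoder call sits inside the loop.
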
### Python (Transformers)

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("ippoboi/gmail-classifier")
# AutoTokenizer picks up the tokenizer.json shipped with the model.
tokenizer = AutoTokenizer.from_pretrained("ippoboi/gmail-classifier")

input_text = "Classify this email: Subject: Meeting tomorrow\n\nBody: Can we reschedule?"
# Truncate to the model's 512-token maximum sequence length.
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Output: "PERSONAL | ACTION | Request to reschedule meeting"
```

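The input string in the example above combines an instruction prefix, a `Subject:` line, a blank line, and a `Body:` section. A small helper keeps that layout consistent across calls. This is a sketch only: `build_prompt` is an illustrative name, and whether the `Classify this email: ` prefix is required is an assumption based on that single example.

```python
# Prefix copied from the usage example; assumed to match the training prompts.
PROMPT_PREFIX = "Classify this email: "


def build_prompt(subject: str, body: str) -> str:
    """Format an email as: prefix, 'Subject:' line, blank line, 'Body:' text."""
    # Collapse whitespace runs so a multi-line subject stays on one line.
    clean_subject = " ".join(subject.split())
    return f"{PROMPT_PREFIX}Subject: {clean_subject}\n\nBody: {body.strip()}"
```

Calling `build_prompt("Meeting tomorrow", "Can we reschedule?")` reproduces the `input_text` string used in the example.
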
## Files

| File | Size | Description |
|------|------|-------------|
| `encoder_model.onnx` | 141 MB | ONNX encoder |
| `decoder_model.onnx` | 232 MB | ONNX decoder |
| `tokenizer.json` | 2.4 MB | SentencePiece tokenizer |
| `config.json` | 2 KB | Model configuration |

## Limitations

- Trained primarily on English/French emails
- May not generalize well to enterprise/corporate email patterns
- Classification accuracy depends on email content quality (plain text preferred over HTML-heavy)

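Because plain text is preferred over HTML-heavy input, it may help to strip markup before classification. The sketch below uses only Python's standard `html.parser` module; `html_to_text` is an illustrative helper and not the preprocessing used to build the training data, which this card does not describe.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()  # convert_charrefs=True decodes entities such as &amp;
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self) -> str:
        # Join chunks with spaces, then collapse all whitespace runs.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML email body as a single line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return parser.text()
```

Joining chunks with spaces keeps adjacent block elements from running together, at the cost of occasionally splitting a word that is broken up by inline tags.
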
## License

Apache 2.0