---
base_model:
- google/flan-t5-small
license: apache-2.0
language:
- en
- fr
tags:
- classification
- emails
- text2text-generation
- onnx
- mobile
---

# Gmail Email Classifier (FLAN-T5 ONNX)

A fine-tuned FLAN-T5-small model for email classification, optimized for on-device inference in mobile apps using ONNX Runtime.

## Model Description

This model classifies each email into one of five categories and flags whether the email requires action:

| Category | Description |
|----------|-------------|
| **PERSONAL** | 1:1 human communication, social messages |
| **NEWSLETTER** | Marketing, promotions, subscribed content |
| **TRANSACTION** | Orders, receipts, payments, confirmations |
| **ALERT** | Security notices, important notifications |
| **SOCIAL** | Social network notifications, community updates |

### Output Format

```
CATEGORY | ACTION/NO_ACTION | Brief summary
```

**Example:**

```
Input: "Subject: Your order has shipped\n\nBody: Your order #12345 is on its way..."
Output: "TRANSACTION | NO_ACTION | Order shipment confirmation for #12345"
```

## Intended Use

- **Primary:** On-device email triage in mobile apps (iOS/Android)
- **Runtime:** ONNX Runtime React Native
- **Use case:** Prioritizing the inbox, filtering noise, surfacing actionable emails

## Model Details

| Attribute | Value |
|-----------|-------|
| Base Model | `google/flan-t5-small` |
| Parameters | ~80M |
| Architecture | T5 Encoder-Decoder |
| ONNX Size | 373 MB (encoder: 141 MB, decoder: 232 MB) |
| Latency | ~79 ms (iPhone, CPU) |
| Max Sequence | 512 tokens |

## Training Data

- **Size:** 2,043 training / 256 validation / 255 test examples
- **Source:** Personal Gmail inboxes (anonymized)
- **Languages:** English, French
- **Labeling:** Human-annotated with category + action flag

## How to Use

### ONNX Runtime (React Native)

```typescript
import { InferenceSession } from 'onnxruntime-react-native';

// Load the two ONNX graphs listed in the Files table
const encoder = await InferenceSession.create('encoder_model.onnx');
const decoder = await InferenceSession.create('decoder_model.onnx');

// Tokenize input, run encoder, greedy decode
```

### Python (Transformers)

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

model = T5ForConditionalGeneration.from_pretrained("ippoboi/gmail-classifier")
tokenizer = T5Tokenizer.from_pretrained("ippoboi/gmail-classifier")

input_text = "Classify this email: Subject: Meeting tomorrow\n\nBody: Can we reschedule?"
# Truncate to the model's 512-token maximum sequence length
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)
# Leave room for the full "CATEGORY | ACTION | summary" string
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Output: "PERSONAL | ACTION | Request to reschedule meeting"
```

## Files

| File | Size | Description |
|------|------|-------------|
| `encoder_model.onnx` | 141 MB | ONNX encoder |
| `decoder_model.onnx` | 232 MB | ONNX decoder |
| `tokenizer.json` | 2.4 MB | SentencePiece tokenizer |
| `config.json` | 2 KB | Model configuration |

## Limitations

- Trained primarily on English and French emails
- May not generalize well to enterprise/corporate email patterns
- Classification accuracy depends on email content quality (plain text is preferred over HTML-heavy messages)

## License

Apache 2.0
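
## Parsing the Output

The model returns a single string in the `CATEGORY | ACTION/NO_ACTION | Brief summary` format described under Output Format, so an app will usually want to split it into fields before using it. The following is a minimal sketch of one way to do that in Python. The `Classification` dataclass and `parse_output` function are hypothetical names introduced here for illustration, and returning `None` for malformed output is an assumed fallback, not behavior defined by this model card.

```python
from dataclasses import dataclass
from typing import Optional

# The five categories listed in the Model Description table.
CATEGORIES = {"PERSONAL", "NEWSLETTER", "TRANSACTION", "ALERT", "SOCIAL"}


@dataclass
class Classification:
    """Hypothetical container for one parsed model output."""
    category: str
    action_required: bool
    summary: str


def parse_output(text: str) -> Optional[Classification]:
    """Parse 'CATEGORY | ACTION/NO_ACTION | Brief summary'.

    Returns None when the string does not match the expected format
    (an assumed fallback; callers may prefer to raise instead).
    """
    # Split on the first two pipes only, so a pipe inside the summary survives.
    parts = [part.strip() for part in text.split("|", 2)]
    if len(parts) != 3:
        return None

    category, action, summary = parts
    category, action = category.upper(), action.upper()
    if category not in CATEGORIES or action not in {"ACTION", "NO_ACTION"}:
        return None

    return Classification(category, action == "ACTION", summary)


if __name__ == "__main__":
    result = parse_output(
        "TRANSACTION | NO_ACTION | Order shipment confirmation for #12345"
    )
    print(result)
```

Validating the category and action flag against fixed sets guards against the occasional off-format generation, which a text-to-text model can produce; what to do with such outputs (retry, default category, or discard) is left to the application.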