Commit cc6a49d by Ippoboi · verified · 1 parent: 1f379f7

Update README.md

Files changed (1)
  1. README.md +101 -1
README.md CHANGED
@@ -11,4 +11,104 @@ tags:
  - text2text-generation
  - onnx
  - mobile
- ---
+ ---
+
+ # Gmail Email Classifier (FLAN-T5 ONNX)
+
+ A fine-tuned FLAN-T5-small model for email classification, optimized for on-device inference in mobile apps using ONNX Runtime.
+
+ ## Model Description
+
+ This model classifies each email into one of five categories and flags whether action is required:
+
+ | Category | Description |
+ |----------|-------------|
+ | **PERSONAL** | 1:1 human communication, social messages |
+ | **NEWSLETTER** | Marketing, promotions, subscribed content |
+ | **TRANSACTION** | Orders, receipts, payments, confirmations |
+ | **ALERT** | Security notices, important notifications |
+ | **SOCIAL** | Social network notifications, community updates |
+
+ ### Output Format
+
+ ```
+ CATEGORY | ACTION/NO_ACTION | Brief summary
+ ```
+
+ **Example:**
+
+ ```
+ Input: "Subject: Your order has shipped\n\nBody: Your order #12345 is on its way..."
+ Output: "TRANSACTION | NO_ACTION | Order shipment confirmation for #12345"
+ ```
+
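The pipe-separated output above is easy to turn into structured data. The following is a minimal sketch, not part of the model repository: `Classification` and `parse_output` are hypothetical names, and the validation rules assume the model emits exactly the five category labels and two action flags listed in this README.

```python
from typing import NamedTuple

# Labels taken from the category table in this README.
CATEGORIES = {"PERSONAL", "NEWSLETTER", "TRANSACTION", "ALERT", "SOCIAL"}
ACTION_FLAGS = {"ACTION", "NO_ACTION"}


class Classification(NamedTuple):
    """Hypothetical container for one parsed model output."""
    category: str
    action_required: bool
    summary: str


def parse_output(text: str) -> Classification:
    """Parse 'CATEGORY | ACTION/NO_ACTION | Brief summary' into fields."""
    # Split on the first two pipes only, so a summary containing "|" survives.
    parts = [part.strip() for part in text.split("|", 2)]
    if len(parts) != 3:
        raise ValueError(f"expected 3 pipe-separated fields: {text!r}")
    category, action, summary = parts
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    if action not in ACTION_FLAGS:
        raise ValueError(f"unknown action flag: {action!r}")
    return Classification(category, action == "ACTION", summary)


result = parse_output(
    "TRANSACTION | NO_ACTION | Order shipment confirmation for #12345"
)
print(result)
```

Raising on unknown labels is a design choice for the sketch: a small generative model can occasionally emit malformed text, and an app may prefer to fall back to a default bucket instead.
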
+ ## Intended Use
+
+ - **Primary:** On-device email triage in mobile apps (iOS/Android)
+ - **Runtime:** ONNX Runtime React Native
+ - **Use case:** Prioritizing inbox, filtering noise, surfacing actionable emails
+
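One way the triage use case could look once emails have been classified is sketched below. The priority order in `CATEGORY_PRIORITY` and the `triage` helper are assumptions made for illustration; this README does not define a ranking between categories.

```python
from typing import List, Tuple

# Hypothetical priority order (lower sorts first); not defined by the model.
CATEGORY_PRIORITY = {
    "ALERT": 0,
    "PERSONAL": 1,
    "TRANSACTION": 2,
    "SOCIAL": 3,
    "NEWSLETTER": 4,
}

# (subject, category, action_required), e.g. built from parsed model output.
Email = Tuple[str, str, bool]


def triage(emails: List[Email]) -> List[Email]:
    """Put actionable emails first, then order by category priority."""
    # False sorts before True, so "not action_required" floats ACTION items up.
    return sorted(emails, key=lambda e: (not e[2], CATEGORY_PRIORITY[e[1]]))


inbox = [
    ("Weekly digest", "NEWSLETTER", False),
    ("Can we reschedule?", "PERSONAL", True),
    ("Order shipped", "TRANSACTION", False),
    ("New sign-in", "ALERT", True),
]
for subject, category, action_required in triage(inbox):
    print(category, "ACTION" if action_required else "NO_ACTION", subject)
```
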
+ ## Model Details
+
+ | Attribute | Value |
+ |-----------|-------|
+ | Base Model | `google/flan-t5-small` |
+ | Parameters | ~80M |
+ | Architecture | T5 Encoder-Decoder |
+ | ONNX Size | 357 MB (encoder: 141 MB, decoder: 232 MB) |
+ | Latency | ~79 ms (iPhone, CPU) |
+ | Max Sequence Length | 512 tokens |
+
+ ## Training Data
+
+ - **Size:** 2,043 training / 256 validation / 255 test examples
+ - **Source:** Personal Gmail inboxes (anonymized)
+ - **Languages:** English, French
+ - **Labeling:** Human-annotated with category + action flag
+
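The split sizes listed above work out to roughly an 80/10/10 partition. A quick check, using only the counts stated in this README:

```python
# Example counts as stated in the Training Data section.
splits = {"train": 2043, "validation": 256, "test": 255}

total = sum(splits.values())
# Share of each split as a percentage, rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in splits.items()}

print(total)   # 2554
print(shares)  # {'train': 80.0, 'validation': 10.0, 'test': 10.0}
```
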
+ ## How to Use
+
+ ### ONNX Runtime (React Native)
+
+ ```typescript
+ import { InferenceSession } from 'onnxruntime-react-native';
+
+ const encoder = await InferenceSession.create('encoder_model.onnx');
+ const decoder = await InferenceSession.create('decoder_model.onnx');
+
+ // Tokenize input, run encoder, greedy decode
+ ```
+
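The "greedy decode" step mentioned in the snippet above can be sketched independently of any runtime. The version below is written in Python for brevity and is an illustration, not the app's actual implementation: `next_logits` is a hypothetical callback standing in for one run of the decoder session on the encoder output and the tokens generated so far. The ids `0` (pad, used as the decoder start token) and `1` (end of sequence) follow the usual T5 convention; confirm them against `config.json`.

```python
from typing import Callable, List, Sequence

PAD_ID = 0  # T5 convention: the pad token doubles as the decoder start token
EOS_ID = 1  # T5 convention: end-of-sequence token


def greedy_decode(
    next_logits: Callable[[List[int]], Sequence[float]],
    max_new_tokens: int = 64,
) -> List[int]:
    """Repeatedly pick the highest-scoring next token until EOS or the cap."""
    tokens = [PAD_ID]
    for _ in range(max_new_tokens):
        logits = next_logits(tokens)  # scores over the vocabulary
        next_id = max(range(len(logits)), key=logits.__getitem__)  # argmax
        if next_id == EOS_ID:
            break
        tokens.append(next_id)
    return tokens[1:]  # drop the start token


def toy_next_logits(tokens: List[int]) -> List[float]:
    """Toy stand-in for the decoder: emits ids 5, 7, then EOS."""
    script = [5, 7, EOS_ID]
    target = script[min(len(tokens) - 1, len(script) - 1)]
    return [1.0 if i == target else 0.0 for i in range(10)]


print(greedy_decode(toy_next_logits))  # [5, 7]
```

In a real app the callback would wrap the decoder's `run` call and the resulting ids would be passed to the tokenizer's decode step. The cap of 64 new tokens is an arbitrary assumption sized for a short summary line.
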
+ ### Python (Transformers)
+
+ ```python
+ from transformers import AutoTokenizer, T5ForConditionalGeneration
+
+ # Load the fine-tuned checkpoint and its tokenizer from the Hub
+ model = T5ForConditionalGeneration.from_pretrained("ippoboi/gmail-classifier")
+ tokenizer = AutoTokenizer.from_pretrained("ippoboi/gmail-classifier")
+
+ input_text = "Classify this email: Subject: Meeting tomorrow\n\nBody: Can we reschedule?"
+ # Truncate to the model's 512-token maximum sequence length
+ inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)
+ # max_new_tokens=64 is an assumed cap that leaves room for the summary
+ outputs = model.generate(**inputs, max_new_tokens=64)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ # Example output: "PERSONAL | ACTION | Request to reschedule meeting"
+ ```
+
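The input string in the example above follows a fixed `Subject:` / `Body:` layout. A small helper can keep that layout consistent across calls. This is a sketch: `build_prompt` is a hypothetical name, and the default `"Classify this email: "` prefix is copied from the Python example (the Output Format example shows an input without it, so pass `prefix=""` if that matches your setup).

```python
def build_prompt(
    subject: str,
    body: str,
    prefix: str = "Classify this email: ",
) -> str:
    """Format an email using the Subject/Body layout shown in this README."""
    return f"{prefix}Subject: {subject.strip()}\n\nBody: {body.strip()}"


prompt = build_prompt("Meeting tomorrow", "Can we reschedule?")
print(repr(prompt))
```
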
+ ## Files
+
+ | File | Size | Description |
+ |------|------|-------------|
+ | `encoder_model.onnx` | 141 MB | ONNX encoder |
+ | `decoder_model.onnx` | 232 MB | ONNX decoder |
+ | `tokenizer.json` | 2.4 MB | SentencePiece tokenizer |
+ | `config.json` | 2 KB | Model configuration |
+
+ ## Limitations
+
+ - Trained primarily on English and French emails
+ - May not generalize well to enterprise/corporate email patterns
+ - Classification accuracy depends on input quality: plain-text emails are preferred over HTML-heavy ones
+
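Since plain text is preferred over HTML-heavy input, an app may want to strip markup before building the model input. The sketch below uses only Python's standard-library `html.parser`; `html_to_text` is a hypothetical helper and not part of this repository, and real-world email HTML may need a more robust converter.

```python
from html.parser import HTMLParser
from typing import List


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: List[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html: str) -> str:
    """Return the text content of an HTML email with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Join chunks with spaces so adjacent blocks do not run together,
    # then collapse any repeated whitespace.
    return " ".join(" ".join(parser.chunks).split())


sample = (
    "<html><style>p{color:red}</style><body>"
    "<p>Your order <b>#12345</b> is on its way.</p></body></html>"
)
print(html_to_text(sample))  # Your order #12345 is on its way.
```

Joining chunks with a space can split a word that is only partly wrapped in an inline tag; that trade-off is accepted here to keep block-level elements from merging.
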
+ ## License
+
+ Apache 2.0