	---
	base_model:
	- google/flan-t5-small
	license: apache-2.0
	language:
	- en
	- fr
	tags:
	- classification
	- emails
	- text2text-generation
	- onnx
	- mobile
	---

	# Gmail Email Classifier (FLAN-T5 ONNX)

	A fine-tuned FLAN-T5-small model for email classification, optimized for on-device inference in mobile apps using ONNX Runtime.

	## Model Description

	This model assigns each email to one of 5 categories and flags whether it requires action:

	| Category | Description |
	|----------|-------------|
	| PERSONAL | 1:1 human communication, social messages |
	| NEWSLETTER | Marketing, promotions, subscribed content |
	| TRANSACTION | Orders, receipts, payments, confirmations |
	| ALERT | Security notices, important notifications |
	| SOCIAL | Social network notifications, community updates |

	### Output Format

	```
	CATEGORY | ACTION/NO_ACTION | Brief summary
	```

	Example:

	```
	Input: "Subject: Your order has shipped\n\nBody: Your order #12345 is on its way..."
	Output: "TRANSACTION | NO_ACTION | Order shipment confirmation for #12345"
	```
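
Because the output is a single pipe-delimited string, downstream code usually needs to split it into fields. The following is a minimal parsing sketch, not part of the model repository: the `Classification` dataclass and `parse_output` function are hypothetical names, and raising `ValueError` on malformed output is an assumed policy that an app may want to replace with a fallback.

```python
from dataclasses import dataclass

# Category labels listed in the table above.
CATEGORIES = {"PERSONAL", "NEWSLETTER", "TRANSACTION", "ALERT", "SOCIAL"}


@dataclass
class Classification:  # hypothetical container, not part of the repo
    category: str
    action_required: bool
    summary: str


def parse_output(text: str) -> Classification:
    """Split 'CATEGORY | ACTION/NO_ACTION | summary' into typed fields."""
    # Split on the first two pipes only, so a summary containing '|' survives.
    parts = [p.strip() for p in text.split("|", 2)]
    if len(parts) != 3:
        raise ValueError(f"expected 3 pipe-separated fields, got: {text!r}")
    category, action, summary = parts
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    if action not in ("ACTION", "NO_ACTION"):
        raise ValueError(f"unknown action flag: {action!r}")
    return Classification(category, action == "ACTION", summary)


result = parse_output("TRANSACTION | NO_ACTION | Order shipment confirmation for #12345")
print(result)
```

Validating the category and action flag is worthwhile here because a generative model can occasionally emit text that does not follow the expected format.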

	## Intended Use

	- Primary: On-device email triage in mobile apps (iOS/Android)
	- Runtime: ONNX Runtime React Native
	- Use case: Prioritizing inbox, filtering noise, surfacing actionable emails

	## Model Details

	| Attribute | Value |
	|-----------|-------|
	| Base Model | `google/flan-t5-small` |
	| Parameters | ~80M |
	| Architecture | T5 Encoder-Decoder |
	| ONNX Size | 373 MB total (encoder: 141 MB, decoder: 232 MB) |
	| Latency | ~79 ms (iPhone, CPU) |
	| Max Sequence | 512 tokens |

	## Training Data

	- Size: 2,043 training / 256 validation / 255 test examples
	- Source: Personal Gmail inboxes (anonymized)
	- Languages: English, French
	- Labeling: Human-annotated with category + action flag

	## How to Use

	### ONNX Runtime (React Native)

	```typescript
	import { InferenceSession } from 'onnxruntime-react-native';

	const encoder = await InferenceSession.create('encoder_model.onnx');
	const decoder = await InferenceSession.create('decoder_model.onnx');

	// Tokenize input, run encoder, greedy decode
	```
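
The greedy-decode step named in the comment follows the same logic in any language. Below is a minimal sketch of that loop in Python, under stated assumptions: `step` is a hypothetical callable that wraps one decoder run and returns next-token logits for the tokens generated so far, and the 64-token cap is an arbitrary choice. Token IDs 0 (pad, used as the decoder start token) and 1 (end of sequence) are the standard T5 values.

```python
from typing import Callable, List, Sequence

PAD_ID = 0  # T5 uses the pad token as the decoder start token
EOS_ID = 1  # T5 end-of-sequence token


def greedy_decode(
    step: Callable[[List[int]], Sequence[float]],
    max_new_tokens: int = 64,
) -> List[int]:
    """Repeatedly pick the highest-scoring next token until EOS or the cap."""
    tokens = [PAD_ID]
    for _ in range(max_new_tokens):
        logits = step(tokens)  # hypothetical: one decoder forward pass
        next_id = max(range(len(logits)), key=logits.__getitem__)  # argmax
        if next_id == EOS_ID:
            break
        tokens.append(next_id)
    return tokens[1:]  # drop the start token


# Stand-in for the real decoder: emits tokens 5, 7, then EOS.
def fake_step(tokens: List[int]) -> List[float]:
    script = [5, 7, EOS_ID]
    logits = [0.0] * 10
    logits[script[len(tokens) - 1]] = 1.0
    return logits


print(greedy_decode(fake_step))  # prints [5, 7]
```

In the app, `step` would run the decoder session on the encoder's hidden states plus the tokens generated so far. The tensor names depend on how the model was exported, so inspect the loaded session's input and output names rather than assuming them.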

	### Python (Transformers)

	```python
	from transformers import AutoTokenizer, T5ForConditionalGeneration

	model = T5ForConditionalGeneration.from_pretrained("ippoboi/gmail-classifier")
	# AutoTokenizer loads the fast tokenizer from tokenizer.json
	tokenizer = AutoTokenizer.from_pretrained("ippoboi/gmail-classifier")

	input_text = "Classify this email: Subject: Meeting tomorrow\n\nBody: Can we reschedule?"
	# Truncate to the model's 512-token maximum
	inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)
	# generate() may cap output at a short default length; leave room for the summary
	outputs = model.generate(**inputs, max_new_tokens=64)
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	# Example output: "PERSONAL | ACTION | Request to reschedule meeting"
	```
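
The input string in the example combines an instruction prefix with the subject and body. A small helper can keep that format consistent across calls. This is a sketch only: `build_prompt` and `max_body_chars` are hypothetical names, the prefix is copied from the example above (the card does not state whether it is required), and collapsing whitespace in the body is an assumed preprocessing choice.

```python
PREFIX = "Classify this email: "  # copied from the example above


def build_prompt(subject: str, body: str, max_body_chars: int = 2000) -> str:
    """Format an email the way the example input is written.

    max_body_chars is a hypothetical rough guard against very long bodies;
    the 512-token limit still has to be enforced at tokenization time.
    """
    body = " ".join(body.split())  # collapse newlines and repeated whitespace
    return f"{PREFIX}Subject: {subject.strip()}\n\nBody: {body[:max_body_chars]}"


prompt = build_prompt("Meeting tomorrow", "Can we\nreschedule?")
print(prompt)
```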

	## Files

	| File | Size | Description |
	|------|------|-------------|
	| `encoder_model.onnx` | 141 MB | ONNX encoder |
	| `decoder_model.onnx` | 232 MB | ONNX decoder |
	| `tokenizer.json` | 2.4 MB | SentencePiece tokenizer |
	| `config.json` | 2 KB | Model configuration |

	## Limitations

	- Trained primarily on English/French emails
	- May not generalize well to enterprise/corporate email patterns
	- Classification accuracy depends on email content quality (plain text preferred over HTML-heavy)
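
Since plain text is preferred over HTML-heavy content, one way to reduce the impact of the last limitation is to strip markup before classification. The following is a minimal sketch using only Python's standard library. `html_to_text` is a hypothetical helper, not part of this repository, and a production app may prefer a dedicated HTML-to-text library.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self) -> str:
        # Join chunks with spaces, then collapse all runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = "<html><style>p{color:red}</style><body><p>Your order <b>#12345</b> shipped.</p></body></html>"
print(html_to_text(sample))  # prints: Your order #12345 shipped.
```

Joining chunks with a space keeps words from adjacent elements apart, at the cost of occasionally splitting a word that is partly wrapped in an inline tag.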

	## License

	Apache 2.0