+---
+license: mit
+license_link: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/resolve/main/LICENSE
+language:
+- en
+pipeline_tag: text-classification
+tags:
+- email-classification
+- mlx
+- phi-3
+- lora
+- text-classification
+library_name: mlx
+base_model: microsoft/Phi-3-mini-4k-instruct
+datasets:
+- private
+widget:
+- text: "Classify this email:\n\nYour order #12345 has been shipped and will arrive in 3-5 business days.\n\nCategory:"
+  example_title: "Transactional Email"
+- text: "Classify this email:\n\n🎉 Limited Time Offer! Get 50% off all products this weekend only!\n\nCategory:"
+  example_title: "Promotional Email"
+- text: "Classify this email:\n\nYour password was changed on December 7, 2025. If you didn't make this change, please contact support immediately.\n\nCategory:"
+  example_title: "Security Alert"
+---
+# Email Classifier - Phi-3 Mini Fine-tuned
+This model is a fine-tuned version of [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) for email classification. It was fine-tuned with LoRA (Low-Rank Adaptation) on Apple Silicon using the MLX framework.
+## Model Description
+- **Base Model**: microsoft/Phi-3-mini-4k-instruct
+- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
+- **Framework**: Apple MLX
+- **Task**: Email Classification
+- **Categories**: 20 email categories including promotional, transactional, notification, security, event, educational, newsletter, survey, business, personal, and more
+## Intended Use
+This model classifies emails into predefined categories to help with inbox organization, email filtering, and workflow automation.
+### Direct Use
+```python
+from mlx_lm import load, generate
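+# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
+model, tokenizer = load("jake-watkins/email-classifier")
+# Email to classify; strip() drops the blank lines added by the triple quotes
+email_content = """
+Your subscription to Premium Service will renew on January 1st, 2026.
+To cancel or modify your subscription, visit your account settings.
+""".strip()
+# Use the same prompt format as the widget examples
+prompt = f"Classify this email:\n\n{email_content}\n\nCategory:"
+# The generated text is expected to contain the predicted category
+response = generate(model, tokenizer, prompt=prompt, max_tokens=50, verbose=False)
+print(response)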
+```
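The raw text returned by `generate` may contain more than the bare label, so downstream code usually needs to map it onto one of the 20 categories. The following is a minimal sketch of that step, assuming the label appears on the first non-empty line of the response. The helper names `build_prompt` and `parse_category` and the `"unknown"` fallback are illustrative and not part of this repository.

```python
# The 20 labels listed in this model card.
CATEGORIES = (
    "promotional", "transactional", "notification", "security", "event",
    "educational", "newsletter", "survey", "business", "personal",
    "solicitation", "recruitment", "membership", "political", "informative",
    "account", "press", "memorial", "file", "admission",
)


def build_prompt(email_content: str) -> str:
    """Format an email the same way as the widget examples in this card."""
    return f"Classify this email:\n\n{email_content.strip()}\n\nCategory:"


def parse_category(response: str, fallback: str = "unknown") -> str:
    """Map raw generated text to one of the known labels.

    Only the first non-empty line is inspected, and the first word on
    that line that matches a known label wins. `fallback` is a
    hypothetical sentinel, not a category the model was trained on.
    """
    for line in response.splitlines():
        line = line.strip().lower()
        if not line:
            continue
        for word in line.replace(",", " ").replace(".", " ").split():
            if word in CATEGORIES:
                return word
        break  # first non-empty line had no known label
    return fallback
```

In the example above, `parse_category(response)` would replace the final `print(response)` when a clean label is needed for filtering or routing.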
+## Training Data
+The model was trained on a private dataset of email examples across 20 categories:
+- promotional
+- transactional
+- notification
+- security
+- event
+- educational
+- newsletter
+- survey
+- business
+- personal
+- solicitation
+- recruitment
+- membership
+- political
+- informative
+- account
+- press
+- memorial
+- file
+- admission
+## Training Procedure
+### Training Hyperparameters
+- **Iterations**: 699
+- **Learning Rate**: 1e-5
+- **Batch Size**: 1
+- **Max Sequence Length**: 512 tokens
+- **LoRA Layers**: 16
+- **Steps per Eval**: 100
+- **Validation Batches**: 25
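To make the arithmetic behind these settings explicit, here is a small sketch that records them in a dataclass. The field and method names are illustrative and are not guaranteed to match any MLX-LM option names. It also assumes no gradient accumulation and that validation uses the same batch size as training, neither of which is stated in this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    # Values copied from the hyperparameter list above; field names are
    # illustrative and not guaranteed to match any MLX-LM option names.
    iterations: int = 699
    learning_rate: float = 1e-5
    batch_size: int = 1
    max_seq_length: int = 512
    lora_layers: int = 16
    steps_per_eval: int = 100
    val_batches: int = 25

    def examples_seen(self) -> int:
        """Training examples processed over the whole run."""
        return self.iterations * self.batch_size

    def periodic_eval_steps(self) -> list[int]:
        """Iterations that are exact multiples of `steps_per_eval`."""
        return list(
            range(self.steps_per_eval, self.iterations + 1, self.steps_per_eval)
        )

    def examples_per_eval(self) -> int:
        """Validation examples scored in one evaluation pass."""
        return self.val_batches * self.batch_size
```

With a batch size of 1, the run processes one example per iteration, so 699 iterations correspond to 699 training examples seen. The actual trainer may run evaluations at additional points beyond the multiples of 100 listed by `periodic_eval_steps`.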
+### Framework
+Fine-tuned using MLX-LM on Apple Silicon with LoRA adapters for parameter-efficient training.
+## Evaluation
+The dataset was split 80/10/10 into training, validation, and test sets using stratified sampling, so that each split preserves the category distribution. The model was evaluated on the held-out test set.
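An 80/10/10 stratified split like the one described above could be produced along these lines. This is a sketch under stated assumptions, not the script used for this model: the function name `stratified_split`, the `label_of` callback, and the fixed seed are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(examples, label_of, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split `examples` into train/val/test, stratified by label.

    Each label's examples are shuffled and cut independently, so every
    split keeps roughly the same label proportions as the full dataset.
    """
    by_label = defaultdict(list)
    for example in examples:
        by_label[label_of(example)].append(example)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_train = int(len(group) * ratios[0])
        n_val = int(len(group) * ratios[1])
        train.extend(group[:n_train])
        val.extend(group[n_train:n_train + n_val])
        test.extend(group[n_train + n_val:])  # remainder goes to test
    return train, val, test
```

Rounding happens per category, so very small categories can end up with no validation examples; the remainder always goes to the test split in this sketch.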
+## Limitations
+- **Language**: Primarily trained on English emails
+- **Context Length**: Trained with a maximum sequence length of 512 tokens; longer emails are truncated
+- **Categories**: Limited to the 20 predefined categories; may not generalize to novel email types
+- **Domain**: Performance may vary on highly specialized or domain-specific emails
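Because of the 512-token limit noted above, callers may want to trim long emails themselves rather than rely on implicit truncation. The sketch below keeps the beginning of the email and leaves room for the prompt wrapper. The function name `truncate_email` and the `reserved_tokens` allowance of 32 are assumptions for illustration; `encode` and `decode` stand for whatever tokenizer functions are available.

```python
from typing import Callable, List


def truncate_email(
    email: str,
    encode: Callable[[str], List[int]],
    decode: Callable[[List[int]], str],
    max_tokens: int = 512,
    reserved_tokens: int = 32,
) -> str:
    """Trim `email` so the full prompt is likely to fit in `max_tokens`.

    `reserved_tokens` is a hypothetical allowance for the prompt wrapper
    ("Classify this email:" ... "Category:") and the generated label.
    """
    budget = max_tokens - reserved_tokens
    if budget <= 0:
        raise ValueError("reserved_tokens must be smaller than max_tokens")
    token_ids = encode(email)
    if len(token_ids) <= budget:
        return email  # already short enough, return unchanged
    return decode(token_ids[:budget])  # keep the beginning of the email
```

With a Hugging Face style tokenizer, its `encode` and `decode` methods could typically be passed in directly, though special tokens added by `encode` may need to be accounted for in the budget.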
+## Ethical Considerations
+This model is intended for email organization and automation. Users should:
+- Ensure compliance with privacy regulations when processing email content
+- Not use the model for unauthorized email monitoring or surveillance
+- Be aware that the model can misclassify emails
+## Citation
+If you use this model, please cite the base model:
+```bibtex
+@article{abdin2024phi,
+  title={Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone},
+  author={Abdin, Marah and others},
+  journal={arXiv preprint arXiv:2404.14219},
+  year={2024}
+}
+```
+## Model Card Contact
+For questions or feedback about this model, please open an issue on the model repository.