---
license: mit
license_link: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/resolve/main/LICENSE
language:
- en
pipeline_tag: text-classification
tags:
- email-classification
- mlx
- phi-3
- lora
- text-classification
library_name: mlx
base_model: microsoft/Phi-3-mini-4k-instruct
datasets:
- private
widget:
- text: "Classify this email:\n\nYour order #12345 has been shipped and will arrive in 3-5 business days.\n\nCategory:"
example_title: "Transactional Email"
- text: "Classify this email:\n\n🎉 Limited Time Offer! Get 50% off all products this weekend only!\n\nCategory:"
example_title: "Promotional Email"
- text: "Classify this email:\n\nYour password was changed on December 7, 2025. If you didn't make this change, please contact support immediately.\n\nCategory:"
example_title: "Security Alert"
---
# Email Classifier - Phi-3 Mini Fine-tuned
This model is a fine-tuned version of [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) for email classification tasks. It uses LoRA (Low-Rank Adaptation) for efficient fine-tuning on Apple Silicon using the MLX framework.
## Model Description
- **Base Model**: microsoft/Phi-3-mini-4k-instruct
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Framework**: Apple MLX
- **Task**: Email Classification
- **Categories**: 20 email categories, including promotional, transactional, notification, security, event, educational, newsletter, survey, business, and personal (full list under Training Data)
## Intended Use
This model classifies emails into predefined categories to help with inbox organization, email filtering, and workflow automation.
### Direct Use
```python
from mlx_lm import load, generate

# Load the model
model, tokenizer = load("jake-watkins/email-classifier")

# Classify an email
email_content = """
Your subscription to Premium Service will renew on January 1st, 2026.
To cancel or modify your subscription, visit your account settings.
"""

prompt = f"Classify this email:\n\n{email_content}\n\nCategory:"
response = generate(model, tokenizer, prompt=prompt, max_tokens=50, verbose=False)
print(response)
```
## Training Data
The model was trained on a private dataset of email examples across 20 categories:
- promotional
- transactional
- notification
- security
- event
- educational
- newsletter
- survey
- business
- personal
- solicitation
- recruitment
- membership
- political
- informative
- account
- press
- memorial
- file
- admission
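Because the model generates free-form text, downstream code typically needs to map its output back onto one of these 20 labels. Below is a minimal sketch of such a post-processing step; `normalize_category` is a hypothetical helper, not part of the released model, and the simple substring scan is an assumption about how outputs are matched.

```python
# Sketch: map the model's free-form output to one of the 20 known labels.
# CATEGORIES mirrors the Training Data list above; normalize_category is a
# hypothetical helper using a naive substring scan (first match wins).

CATEGORIES = [
    "promotional", "transactional", "notification", "security", "event",
    "educational", "newsletter", "survey", "business", "personal",
    "solicitation", "recruitment", "membership", "political", "informative",
    "account", "press", "memorial", "file", "admission",
]

def normalize_category(raw_output):
    """Return the first known category found in the generated text, or None."""
    text = raw_output.strip().lower()
    for category in CATEGORIES:
        if category in text:
            return category
    return None
```

For example, a generation like `"Category: Transactional"` normalizes to `"transactional"`, while text containing no known label returns `None` so the caller can route it for manual review.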
## Training Procedure
### Training Hyperparameters
- **Iterations**: 699
- **Learning Rate**: 1e-5
- **Batch Size**: 1
- **Max Sequence Length**: 512 tokens
- **LoRA Layers**: 16
- **Steps per Eval**: 100
- **Validation Batches**: 25
### Framework
Fine-tuned using MLX-LM on Apple Silicon with LoRA adapters for parameter-efficient training.
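For reference, a training run with the hyperparameters above would look roughly like the following `mlx_lm.lora` invocation. This is an illustrative sketch, not the exact command used: the data path is a placeholder, and flag names vary between mlx-lm releases (for example, the number of LoRA layers has been exposed as both `--lora-layers` and `--num-layers`), so check `python -m mlx_lm.lora --help` for your installed version.

```shell
# Illustrative MLX-LM LoRA fine-tuning run matching the hyperparameters above.
# ./data is a placeholder directory containing train/valid JSONL files.
python -m mlx_lm.lora \
  --model microsoft/Phi-3-mini-4k-instruct \
  --train \
  --data ./data \
  --iters 699 \
  --learning-rate 1e-5 \
  --batch-size 1 \
  --max-seq-length 512 \
  --steps-per-eval 100 \
  --val-batches 25
```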
## Evaluation
The dataset was split 80/10/10 into training, validation, and test sets using stratified sampling to preserve the category distribution in each split; the model was evaluated on the held-out test set.
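The stratified split can be sketched as follows. This is an illustrative implementation of the 80/10/10 scheme described above, not the actual training pipeline; the function name and data layout (`(text, label)` pairs) are assumptions.

```python
# Sketch: 80/10/10 stratified split that preserves per-category proportions.
# stratified_split is a hypothetical helper, not the project's actual code.
import random
from collections import defaultdict

def stratified_split(examples, seed=0, ratios=(0.8, 0.1, 0.1)):
    """Split (text, label) pairs per label so each split keeps the
    overall category distribution. Returns (train, val, test) lists."""
    by_label = defaultdict(list)
    for example in examples:
        by_label[example[1]].append(example)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for label, items in by_label.items():
        rng.shuffle(items)  # shuffle within each category before slicing
        n_train = int(len(items) * ratios[0])
        n_val = int(len(items) * ratios[1])
        train += items[:n_train]
        val += items[n_train:n_train + n_val]
        test += items[n_train + n_val:]
    return train, val, test
```

Splitting within each label (rather than over the whole pool) is what keeps rare categories like `memorial` or `admission` represented in all three splits.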
## Limitations
- **Language**: Primarily trained on English emails
- **Context Length**: Optimized for sequences up to 512 tokens; longer emails are truncated
- **Categories**: Limited to the 20 predefined categories; may not generalize to novel email types
- **Domain**: Performance may vary on highly specialized or domain-specific emails
## Ethical Considerations
This model is intended for email organization and automation purposes. Users should:
- Ensure compliance with privacy regulations when processing email content
- Not use for unauthorized email monitoring or surveillance
- Be aware that classification errors may occur
## Citation
If you use this model, please cite the base model:
```bibtex
@article{abdin2024phi,
title={Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone},
author={Abdin, Marah and others},
journal={arXiv preprint arXiv:2404.14219},
year={2024}
}
```
## Model Card Contact
For questions or feedback about this model, please open an issue on the model repository.