Email Classifier - Phi-3 Mini Fine-tuned

This model is a fine-tuned version of microsoft/Phi-3-mini-4k-instruct for email classification. Training used LoRA (Low-Rank Adaptation) adapters for parameter-efficient fine-tuning on Apple Silicon with the Apple MLX framework.

Model Description

  • Base Model: microsoft/Phi-3-mini-4k-instruct
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Framework: Apple MLX
  • Task: Email Classification
  • Categories: 20 email categories, including promotional, transactional, notification, security, event, educational, newsletter, survey, business, and personal; the full list appears under Training Data

Intended Use

This model classifies emails into predefined categories to help with inbox organization, email filtering, and workflow automation.

Direct Use

from mlx_lm import load, generate

# Load the model
model, tokenizer = load("jake-watkins/email-classifier")

# Classify an email
email_content = """
Your subscription to Premium Service will renew on January 1st, 2026.
To cancel or modify your subscription, visit your account settings.
"""

prompt = f"Classify this email:\n\n{email_content}\n\nCategory:"

response = generate(model, tokenizer, prompt=prompt, max_tokens=50, verbose=False)
print(response)
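
This repository ships LoRA adapters trained on top of the base model. If load() does not resolve fused weights directly, the adapters can be applied explicitly. A minimal sketch, assuming the adapters have been downloaded locally (the "./adapters" path is a placeholder):

from mlx_lm import load

# Load the base model and apply the LoRA adapters on top of it.
# "./adapters" is a hypothetical local path; point it at the adapter
# weights downloaded from this repository.
model, tokenizer = load(
    "microsoft/Phi-3-mini-4k-instruct",
    adapter_path="./adapters",
)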

Training Data

The model was trained on a private dataset of email examples across 20 categories:

  • promotional
  • transactional
  • notification
  • security
  • event
  • educational
  • newsletter
  • survey
  • business
  • personal
  • solicitation
  • recruitment
  • membership
  • political
  • informative
  • account
  • press
  • memorial
  • file
  • admission

Training Procedure

Training Hyperparameters

  • Iterations: 699
  • Learning Rate: 1e-5
  • Batch Size: 1
  • Max Sequence Length: 512 tokens
  • LoRA Layers: 16
  • Steps per Eval: 100
  • Validation Batches: 25
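
These settings map onto the mlx_lm.lora command line roughly as follows. This is a reconstruction rather than the exact command used: flag names have shifted across mlx-lm releases (older versions use --lora-layers where newer ones use --num-layers), and the --data path is a placeholder.

python -m mlx_lm.lora \
    --model microsoft/Phi-3-mini-4k-instruct \
    --train \
    --data ./data \
    --iters 699 \
    --learning-rate 1e-5 \
    --batch-size 1 \
    --max-seq-length 512 \
    --num-layers 16 \
    --steps-per-eval 100 \
    --val-batches 25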

Framework

Fine-tuned using MLX-LM on Apple Silicon with LoRA adapters for parameter-efficient training.

Evaluation

The model was validated on a held-out test set with stratified sampling to maintain category distribution across training, validation, and test splits (80/10/10).
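
A sketch of how such a stratified 80/10/10 split can be produced. The use of scikit-learn and the load_dataset() helper are assumptions for illustration; the actual tooling was not disclosed:

from sklearn.model_selection import train_test_split

# Hypothetical loader returning parallel lists of email bodies and labels.
emails, labels = load_dataset()

# Carve off 20% while preserving per-category proportions.
train_x, rest_x, train_y, rest_y = train_test_split(
    emails, labels, test_size=0.2, stratify=labels, random_state=42
)
# Split the remainder evenly: 10% validation, 10% test overall.
val_x, test_x, val_y, test_y = train_test_split(
    rest_x, rest_y, test_size=0.5, stratify=rest_y, random_state=42
)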

Limitations

  • Language: Primarily trained on English emails
  • Context Length: Optimized for sequences up to 512 tokens; longer emails are truncated (see the truncation sketch after this list)
  • Categories: Limited to the 20 predefined categories; may not generalize to novel email types
  • Domain: Performance may vary on highly specialized or domain-specific emails
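
Since inputs beyond 512 tokens are cut off, long emails can lose their tail before classification. A minimal truncation sketch, assuming the tokenizer returned by load() exposes the usual encode/decode methods, so the cut point is applied deliberately rather than silently:

MAX_TOKENS = 512

def truncate_email(email_content: str, tokenizer, max_tokens: int = MAX_TOKENS) -> str:
    """Keep only the first max_tokens tokens of the email body."""
    token_ids = tokenizer.encode(email_content)
    if len(token_ids) <= max_tokens:
        return email_content
    return tokenizer.decode(token_ids[:max_tokens])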

Ethical Considerations

This model is intended for email organization and automation purposes. Users should:

  • Ensure compliance with privacy regulations when processing email content
  • Not use for unauthorized email monitoring or surveillance
  • Be aware that classification errors may occur

Citation

If you use this model, please cite the base model:

@article{abdin2024phi,
  title={Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone},
  author={Abdin, Marah and others},
  journal={arXiv preprint arXiv:2404.14219},
  year={2024}
}

Model Card Contact

For questions or feedback about this model, please open an issue on the model repository.
