title: Email Classification API
emoji: 📧
colorFrom: blue
colorTo: green
sdk: docker
app_file: main.py
pinned: false
Email Classification for Support Team
Project Overview
This project implements an email classification system that sorts support emails into predefined categories and masks personally identifiable information (PII) before classification. PII masking relies on Named Entity Recognition (NER) techniques, and classification uses a pre-trained XLM-RoBERTa model.
Key Features
Email Classification: Classifies support emails into four categories:
- Incident
- Request
- Change
- Problem
Personal Information Masking: Detects and masks the following types of PII:
- Full Name ("full_name")
- Email Address ("email")
- Phone number ("phone_number")
- Date of birth ("dob")
- Aadhaar card number ("aadhar_num")
- Credit/Debit Card Number ("credit_debit_no")
- CVV number ("cvv_no")
- Card expiry date ("expiry_no")
API Interface: Exposes the solution as a RESTful API endpoint.
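The masking feature above can be illustrated with a small sketch. This is a regular-expression stand-in for two of the listed entity types, not the project's NER-based masker in utils.py. The patterns, the `mask_pii` name, the `[label]` placeholder format, and the choice to report positions as end-exclusive offsets into the original text are all assumptions made for illustration. The entity keys follow the output format documented under API Usage.

```python
import re

# Hypothetical patterns for illustration only; the real masker may differ.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "aadhar_num": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
}


def mask_pii(text):
    """Return (masked_text, entities) for the patterns defined above."""
    matches = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            matches.append((m.start(), m.end(), label, m.group()))
    matches.sort()  # process matches from left to right

    pieces, entities, cursor = [], [], 0
    for start, end, label, value in matches:
        if start < cursor:
            continue  # skip a match that overlaps one already masked
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")  # assumed placeholder format
        entities.append(
            {"position": [start, end], "classification": label, "entity": value}
        )
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces), entities
```

Keeping the original offsets alongside each entity means the caller can recover the unmasked text later by slicing the input with `position`.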
Project Structure
.
├── classification_model/   # Local model files (not used in deployment)
├── docker-compose.yml      # Docker Compose configuration
├── Dockerfile              # Docker configuration
├── main.py                 # Main FastAPI application
├── models.py               # Email classifier model implementation
├── README.md               # Project documentation
├── requirements.txt        # Python dependencies
└── utils.py                # PII masker implementation
Installation
Prerequisites
- Python 3.8+
- Docker (optional)
- Hugging Face account for model hosting
Setup
Clone the repository:
git clone <repository-url>
cd email_classifier_project
Install dependencies:
pip install -r requirements.txt
Run the application:
python main.py
Using Docker
- Build and run with Docker Compose:
docker-compose up
Uploading the Model to Hugging Face Hub
Before deploying the application to Hugging Face Spaces, you need to upload the model to the Hugging Face Model Hub:
Install the Hugging Face CLI if you haven't already:
pip install huggingface_hub
Log in to Hugging Face:
huggingface-cli login
Create a new model repository on Hugging Face:
huggingface-cli repo create email-classifier-model
Upload the model using Python:
from transformers import XLMRobertaForSequenceClassification, XLMRobertaTokenizer
# Load the local model
model = XLMRobertaForSequenceClassification.from_pretrained("classification_model")
tokenizer = XLMRobertaTokenizer.from_pretrained("classification_model")
# Push to Hugging Face Hub
model.push_to_hub("YourUsername/email-classifier-model")
tokenizer.push_to_hub("YourUsername/email-classifier-model")
Update the MODEL_PATH environment variable in the Dockerfile with your Hugging Face model path:
ENV MODEL_PATH="YourUsername/email-classifier-model"
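For context, here is a minimal sketch of how application code might consume that variable. The `resolve_model_path` helper and the fallback to the local `classification_model` directory are assumptions for illustration; the project's own main.py or models.py may read the setting differently.

```python
import os

# Assumed fallback: the local directory listed in the project structure.
DEFAULT_MODEL_PATH = "classification_model"


def resolve_model_path(env=None):
    """Return the model location from MODEL_PATH, or the local fallback."""
    env = os.environ if env is None else env
    value = env.get("MODEL_PATH", "").strip()
    return value or DEFAULT_MODEL_PATH
```

The returned string could then be passed to `from_pretrained`, which accepts either a Hub repository ID or a local directory.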
API Usage
The API exposes a single endpoint for email classification:
- Endpoint: /classify
- Method: POST
- Input Format:
{ "input_email_body": "string containing the email" }
- Output Format:
{
  "input_email_body": "string containing the email",
  "list_of_masked_entities": [
    {
      "position": [start_index, end_index],
      "classification": "entity_type",
      "entity": "original_entity_value"
    }
  ],
  "masked_email": "string containing the masked email",
  "category_of_the_email": "string containing the class"
}
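A client may want to confirm that a response matches the documented output format before using it. The following is a minimal sketch using only the standard library; `validate_response` is a hypothetical helper, not part of the API, and treating both position values as integers is an assumption.

```python
def validate_response(payload):
    """Raise ValueError if payload does not match the documented shape."""
    for key in ("input_email_body", "masked_email", "category_of_the_email"):
        if not isinstance(payload.get(key), str):
            raise ValueError(f"{key} must be a string")

    entities = payload.get("list_of_masked_entities")
    if not isinstance(entities, list):
        raise ValueError("list_of_masked_entities must be a list")

    for item in entities:
        if not isinstance(item, dict):
            raise ValueError("each masked entity must be an object")
        position = item.get("position")
        if (
            not isinstance(position, (list, tuple))
            or len(position) != 2
            or not all(isinstance(i, int) for i in position)
        ):
            raise ValueError("position must be [start_index, end_index]")
        for key in ("classification", "entity"):
            if not isinstance(item.get(key), str):
                raise ValueError(f"{key} must be a string")
    return payload
```

Calling it on the decoded JSON body returns the payload unchanged when the shape is valid, so it can be chained directly into later processing.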
Example
import requests
url = "https://sparkonix-email-classification-model.hf.space/classify"
data = {
    "input_email_body": "Hello, my name is John Doe, and I'm having issues with my account."
}
response = requests.post(url, json=data)
print(response.json())
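The example above does not set a timeout or handle failures. A slightly more defensive variant is sketched below using only the standard library, so it runs without third-party packages. The `build_request` and `classify_email` names and the 30-second default timeout are assumptions for illustration.

```python
import json
import urllib.error
import urllib.request


def build_request(url, email_body):
    """Build a POST request carrying the documented JSON input."""
    payload = json.dumps({"input_email_body": email_body}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify_email(url, email_body, timeout=30):
    """Send the email to the /classify endpoint and return the decoded JSON."""
    request = build_request(url, email_body)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        # HTTPError is a subclass of URLError, so it must be caught first.
        raise RuntimeError(f"API returned HTTP {exc.code}") from exc
    except urllib.error.URLError as exc:
        raise RuntimeError(f"Could not reach API: {exc.reason}") from exc
```

With `requests`, the equivalent would be passing `timeout=` to `requests.post` and calling `response.raise_for_status()` before reading the JSON.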
Deployment to Hugging Face Spaces
Create a new Space on Hugging Face:
- Go to https://huggingface.co/spaces
- Click "Create new Space"
- Choose a name for your Space
- Select "Docker" as the Space SDK
Connect your GitHub repository to the Space:
- In the Space settings, go to "Repository"
- Enter your GitHub repository URL
- Authenticate with GitHub if prompted
Ensure your Hugging Face Space has access to the model:
- Go to your model on Hugging Face Hub
- Go to "Settings" > "Collaborators"
- Add your Space as a collaborator with "Read" access
Your API will be available at:
https://username-space-name.hf.space/classify
Technologies Used
- FastAPI: Web framework for building the API
- spaCy: NLP library for PII detection and masking
- Transformers: Hugging Face library for the email classification model
- PyTorch: Deep learning framework
- Docker: Containerization for deployment