metadata
title: Email Classification API
emoji: 📧
colorFrom: blue
colorTo: green
sdk: docker
app_file: main.py
pinned: false

Email Classification for Support Team

Project Overview

This project implements an email classification system that sorts support emails into predefined categories and masks personally identifiable information (PII) before classification. PII masking uses Named Entity Recognition (NER) techniques, and classification uses a pre-trained XLM-RoBERTa model.

Key Features

  1. Email Classification: Classifies support emails into four categories:

    • Incident
    • Request
    • Change
    • Problem
  2. Personal Information Masking: Detects and masks the following types of PII:

    • Full Name ("full_name")
    • Email Address ("email")
    • Phone number ("phone_number")
    • Date of birth ("dob")
    • Aadhaar card number ("aadhar_num")
    • Credit/Debit Card Number ("credit_debit_no")
    • CVV number ("cvv_no")
    • Card expiry date ("expiry_no")
  3. API Interface: Exposes the solution as a RESTful API endpoint.
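
To illustrate the masking feature, here is a minimal sketch using regular expressions for two of the listed PII types. The patterns, the `mask_pii` function name, and the `[entity_type]` placeholder format are assumptions made for illustration; the project's own masker in utils.py may detect and replace entities differently.

```python
import re

# Hypothetical patterns for two of the listed PII types; the patterns used
# by the project's masker in utils.py may differ.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "aadhar_num": re.compile(r"\b\d{4}\s?\d{4}\s?\d{4}\b"),
}


def mask_pii(text):
    """Return (masked_text, entities); positions index into the original text."""
    matches = []
    for label, pattern in PII_PATTERNS.items():
        for m in pattern.finditer(text):
            matches.append((m.start(), m.end(), label, m.group()))
    matches.sort()  # process matches left to right

    parts, entities, cursor = [], [], 0
    for start, end, label, value in matches:
        if start < cursor:
            continue  # skip a match that overlaps one already masked
        parts.append(text[cursor:start])
        parts.append(f"[{label}]")  # assumed placeholder format
        entities.append(
            {"position": [start, end], "classification": label, "entity": value}
        )
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts), entities


sample = "Contact me at john.doe@example.com, Aadhaar 1234 5678 9012."
masked, found = mask_pii(sample)
print(masked)
print(found)
```

The entity dictionaries use the same keys as the API output format (position, classification, entity), with the end index treated as exclusive in this sketch.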

Project Structure

.
├── classification_model/    # Local model files (not used in deployment)
├── docker-compose.yml       # Docker Compose configuration
├── Dockerfile               # Docker configuration
├── main.py                  # Main FastAPI application
├── models.py                # Email classifier model implementation
├── README.md                # Project documentation
├── requirements.txt         # Python dependencies
└── utils.py                 # PII masker implementation

Installation

Prerequisites

  • Python 3.8+
  • Docker (optional)
  • Hugging Face account for model hosting

Setup

  1. Clone the repository:

    git clone <repository-url>
    cd email_classifier_project
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Run the application:

    python main.py
    

Using Docker

  1. Build and run with Docker Compose:

    docker-compose up
    

Uploading the Model to Hugging Face Hub

Before deploying the application to Hugging Face Spaces, you need to upload the model to the Hugging Face Model Hub:

  1. Install the Hugging Face CLI if you haven't already:

    pip install huggingface_hub
    
  2. Log in to Hugging Face:

    huggingface-cli login
    
  3. Create a new model repository on Hugging Face:

    huggingface-cli repo create email-classifier-model
    
  4. Upload the model using Python:

    from transformers import XLMRobertaForSequenceClassification, XLMRobertaTokenizer
    
    # Load the local model
    model = XLMRobertaForSequenceClassification.from_pretrained("classification_model")
    tokenizer = XLMRobertaTokenizer.from_pretrained("classification_model")
    
    # Push to Hugging Face Hub
    model.push_to_hub("YourUsername/email-classifier-model")
    tokenizer.push_to_hub("YourUsername/email-classifier-model")
    
  5. Update the MODEL_PATH environment variable in the Dockerfile with your Hugging Face model path:

    ENV MODEL_PATH="YourUsername/email-classifier-model"
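
As a rough sketch of how application code might consume this variable, the snippet below reads MODEL_PATH from the environment and passes it to the same Transformers classes used in the upload step. The function names and the fallback to the local classification_model directory are assumptions for illustration, not necessarily what models.py does.

```python
import os

# Assumed fallback: the local model directory from the project tree.
DEFAULT_MODEL_PATH = "classification_model"


def resolve_model_path(env=None):
    """Return MODEL_PATH from the environment, or the local default if unset."""
    env = os.environ if env is None else env
    return env.get("MODEL_PATH") or DEFAULT_MODEL_PATH


def load_classifier(model_path=None):
    """Load the tokenizer and model from a Hub repo id or a local directory."""
    # Imported lazily so resolving the path does not require transformers.
    from transformers import (
        XLMRobertaForSequenceClassification,
        XLMRobertaTokenizer,
    )

    path = model_path or resolve_model_path()
    tokenizer = XLMRobertaTokenizer.from_pretrained(path)
    model = XLMRobertaForSequenceClassification.from_pretrained(path)
    return tokenizer, model
```

Calling `load_classifier()` with no arguments would then pick up whichever path the Dockerfile sets.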
    

API Usage

The API exposes a single endpoint for email classification:

  • Endpoint: /classify
  • Method: POST
  • Input Format:
    {
      "input_email_body": "string containing the email"
    }
    
  • Output Format:
    {
      "input_email_body": "string containing the email",
      "list_of_masked_entities": [
        {
          "position": [start_index, end_index],
          "classification": "entity_type",
          "entity": "original_entity_value"
        }
      ],
      "masked_email": "string containing the masked email",
      "category_of_the_email": "string containing the class"
    }
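
A client may want to check that a response matches the format above before using it. The following is a minimal sketch of such a check; the `find_schema_problems` helper is hypothetical and validates only the key names and the shape of each position pair.

```python
TOP_LEVEL_KEYS = {
    "input_email_body",
    "list_of_masked_entities",
    "masked_email",
    "category_of_the_email",
}
ENTITY_KEYS = {"position", "classification", "entity"}


def find_schema_problems(payload):
    """Return a list of human-readable problems; an empty list means the shape is OK."""
    problems = []
    missing = TOP_LEVEL_KEYS - payload.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")

    for i, entity in enumerate(payload.get("list_of_masked_entities", [])):
        if set(entity) != ENTITY_KEYS:
            problems.append(f"entity {i}: unexpected keys {sorted(entity)}")
        pos = entity.get("position")
        is_pair = (
            isinstance(pos, list)
            and len(pos) == 2
            and all(isinstance(p, int) for p in pos)
        )
        if not is_pair:
            problems.append(f"entity {i}: position must be [start, end] integers")
    return problems


example = {
    "input_email_body": "Hello, my name is John Doe.",
    "list_of_masked_entities": [
        {"position": [18, 26], "classification": "full_name", "entity": "John Doe"}
    ],
    "masked_email": "Hello, my name is [full_name].",
    "category_of_the_email": "Request",
}
print(find_schema_problems(example))
```

The example payload is made up to exercise the check; it is not captured output from the deployed API.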
    

Example

import requests

url = "https://sparkonix-email-classification-model.hf.space/classify"
data = {
    "input_email_body": "Hello, my name is John Doe, and I'm having issues with my account."
}

response = requests.post(url, json=data)
print(response.json())

Deployment to Hugging Face Spaces

  1. Create a new Space on Hugging Face, selecting Docker as the SDK.

  2. Connect your GitHub repository to the Space:

    • In the Space settings, go to "Repository"
    • Enter your GitHub repository URL
    • Authenticate with GitHub if prompted
  3. Ensure your Hugging Face Space has access to the model:

    • Go to your model on Hugging Face Hub
    • Go to "Settings" > "Collaborators"
    • Add your Space as a collaborator with "Read" access
  4. Your API will be available at:

    https://username-space-name.hf.space/classify
    

Technologies Used

  • FastAPI: Web framework for building the API
  • spaCy: NLP library for PII detection and masking
  • Transformers: Hugging Face library for the email classification model
  • PyTorch: Deep learning framework
  • Docker: Containerization for deployment