E-Commerce Return Abuse Detector
This repository contains a production-ready Random Forest Classifier pipeline trained to identify and categorize e-commerce return behaviors into four distinct risk profiles:
- 0: Legitimate Return
- 1: Policy Abuser
- 2: Fraudulent Return
- 3: Wardrobing
Training & Source Code
The entire data engineering and training process for this model was conducted in a cloud environment. You can review, fork, and run the complete step-by-step implementation code here:
View the Core Training Notebook on Kaggle
Pipeline Details
The model architecture is frozen as a unified scikit-learn Pipeline containing:
- Preprocessing Layer: A ColumnTransformer handling categorical features via an integer-safe OrdinalEncoder(handle_unknown='use_encoded_value', unknown_value=-1).
- Classifier Layer: A RandomForestClassifier initialized with 100 estimators and optimized depth constraints to handle complex behavioral patterns.
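For orientation, a pipeline with this shape could be assembled roughly as follows. This is a minimal sketch, not the notebook's actual code: the categorical column names, the `max_depth` value, the `random_state`, and the `remainder="passthrough"` choice are all assumptions made for illustration. Only the encoder arguments and the 100 estimators come from the description above.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical column names; the real feature list lives in the training notebook.
CATEGORICAL_COLS = ["product_category", "payment_method"]

preprocessor = ColumnTransformer(
    transformers=[
        (
            "cat",
            OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1),
            CATEGORICAL_COLS,
        )
    ],
    remainder="passthrough",  # assumption: numeric columns pass through unchanged
)

pipeline_sketch = Pipeline(
    steps=[
        ("preprocess", preprocessor),
        (
            "clf",
            RandomForestClassifier(
                n_estimators=100,
                max_depth=10,     # assumption: the actual depth limit is not stated here
                random_state=42,  # assumption: fixed seed for reproducibility
            ),
        ),
    ]
)

# How the encoder treats a category it never saw during fitting:
encoder = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
encoder.fit(np.array([["card"], ["paypal"]], dtype=object))
unseen_code = encoder.transform(np.array([["crypto"]], dtype=object))[0, 0]
print(unseen_code)  # unseen categories are encoded as -1 instead of raising
```

Mapping unknown categories to -1 lets the pipeline score rows containing values that were absent from the training data, rather than failing at prediction time.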
Feature Importance
According to the training logs preserved in the notebook, the top behavioral drivers for identifying abuse are:
- return_rate_pct (Customer's overall return velocity)
- customer_support_contacts (Frequency of escalations)
- days_to_return (Average window before an item is shipped back)
Quickstart: How to Load the Pipeline
import joblib
from huggingface_hub import hf_hub_download
# 1. Download the frozen pipeline from this Hub repository
model_path = hf_hub_download(
    repo_id="sarveshchhetri/ecommerce-return-abuse-detector",
    filename="model.joblib"
)
# 2. Load the pipeline directly into your local script
pipeline = joblib.load(model_path)
# The pipeline is fully ready to accept raw input data and execute predictions!
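Once loaded, predictions come back as the integer class ids listed at the top of this page. A small helper along these lines can attach readable profile names. This is a sketch under assumptions: `score_returns` is a hypothetical helper, and the example frame contains only the three feature names mentioned above, whereas the real pipeline expects the full set of columns used in the training notebook.

```python
import pandas as pd

# Class id -> risk profile, as listed at the top of this page.
RISK_PROFILES = {
    0: "Legitimate Return",
    1: "Policy Abuser",
    2: "Fraudulent Return",
    3: "Wardrobing",
}


def score_returns(pipeline, frame):
    """Return a copy of `frame` with predicted class ids and profile names."""
    scored = frame.copy()
    scored["predicted_class"] = pipeline.predict(frame)
    scored["predicted_profile"] = scored["predicted_class"].map(RISK_PROFILES)
    return scored


# Assumption: an incomplete, illustrative schema. Check the training notebook
# for the exact columns and dtypes before scoring real data.
new_returns = pd.DataFrame(
    {
        "return_rate_pct": [42.5],
        "customer_support_contacts": [6],
        "days_to_return": [29],
    }
)

# With the pipeline loaded as shown above:
# scored = score_returns(pipeline, new_returns)
# print(scored[["predicted_class", "predicted_profile"]])
```

If the input frame is missing columns the pipeline was trained on, scikit-learn raises an error at prediction time, so it is worth validating the schema first.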