---
title: Customer Churn Prediction
emoji: π
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: "1.37.0"
python_version: "3.10"
app_file: app.py
pinned: false
---

Customer Churn Prediction

An end-to-end machine learning project that predicts whether a telecom customer is likely to churn, based on usage patterns, subscription details, and customer profile data. The project covers data preprocessing, feature engineering, model training, explainability with SHAP, and deployment with Streamlit.

Live Demo

link:

Problem Statement

Telecom companies lose revenue when customers stop using their services (churn).

The goal of this project is to predict whether a customer will churn or stay, so that the business can take proactive retention actions.

Dataset

We use the IBM Telco Customer Churn dataset. It contains information such as:

- Customer demographics
- Subscription services
- Contract type
- Payment methods
- Monthly and total charges
- Churn status

Target variable: Churn Value (1 = Churn, 0 = No Churn)

Machine Learning Workflow

The project follows a complete ML pipeline:

1. Data Preprocessing

- Removed irrelevant columns (CustomerID, location data, etc.)
- Handled missing values
- Cleaned dataset

2. Feature Engineering

- Created Tenure Groups: New, Regular, Loyal, Very Loyal
- Encoded categorical variables using One-Hot Encoding
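
The two feature-engineering steps above can be sketched with pandas. This is a minimal illustration, not the project's actual code: the column names (`Tenure Months`, `Contract`), the bin edges (12, 24, and 48 months), and the toy rows are assumptions chosen for the example.

```python
import pandas as pd

# Toy rows standing in for the Telco data; the column names are assumed.
df = pd.DataFrame({
    "Tenure Months": [2, 18, 40, 70],
    "Contract": ["Month-to-month", "One year", "Two year", "Month-to-month"],
})

# Bucket tenure into the four groups named above.
# The bin edges (12, 24, 48 months) are illustrative assumptions.
df["Tenure Group"] = pd.cut(
    df["Tenure Months"],
    bins=[0, 12, 24, 48, float("inf")],
    labels=["New", "Regular", "Loyal", "Very Loyal"],
    include_lowest=True,
)

# One-hot encode the categorical columns; numeric columns pass through unchanged.
encoded = pd.get_dummies(df, columns=["Contract", "Tenure Group"])
print(encoded.columns.tolist())
```

Saving the resulting column list at training time makes it possible to rebuild the same feature layout at prediction time.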

3. Handling Class Imbalance

- Used class_weight='balanced'
- Used scale_pos_weight for XGBoost
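
Both options give the minority (churn) class more weight during training. The sketch below reproduces the arithmetic behind them with NumPy: the `balanced` heuristic documented by scikit-learn, and the negative-to-positive ratio that is commonly used as a starting value for `scale_pos_weight`. The label vector is made up for illustration and does not reflect the dataset's real class ratio.

```python
import numpy as np

# Made-up labels: 1 = churn, 0 = no churn (a 3:1 imbalance for illustration).
y = np.array([0] * 75 + [1] * 25)

counts = np.bincount(y)                 # [n_negative, n_positive]
n_samples, n_classes = len(y), len(counts)

# scikit-learn's class_weight='balanced' heuristic:
#   weight_c = n_samples / (n_classes * count_c)
balanced_weights = n_samples / (n_classes * counts)

# A commonly suggested starting value for XGBoost's scale_pos_weight:
#   number of negative samples / number of positive samples
scale_pos_weight = counts[0] / counts[1]

print("balanced class weights:", balanced_weights)
print("scale_pos_weight:", scale_pos_weight)
```

In practice scikit-learn computes the balanced weights internally when `class_weight="balanced"` is passed to an estimator, while the ratio has to be computed and passed to `XGBClassifier(scale_pos_weight=...)` explicitly.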

4. Model Training

We trained and compared:

- Logistic Regression
- Random Forest
- XGBoost

5. Evaluation Metrics

- Accuracy
- Precision
- Recall
- F1 Score
- ROC-AUC
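
The five metrics listed above are all available in `sklearn.metrics`. The following sketch shows one way to compute them for a single model. It trains on synthetic data from `make_classification`, so the printed numbers are unrelated to the results reported in this README; the split ratio and random seeds are likewise assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the churn data.
X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.73, 0.27], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)               # hard 0/1 labels
y_prob = model.predict_proba(X_test)[:, 1]   # probability of the positive class

metrics = {
    "Accuracy": accuracy_score(y_test, y_pred),
    "Precision": precision_score(y_test, y_pred),
    "Recall": recall_score(y_test, y_pred),
    "F1 Score": f1_score(y_test, y_pred),
    "ROC-AUC": roc_auc_score(y_test, y_prob),  # needs scores, not labels
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that ROC-AUC is computed from predicted probabilities, while the other four metrics depend on the decision threshold used to turn probabilities into labels.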

6. Explainability (SHAP)

- Identified important features affecting churn
- Provided model interpretability
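
For a linear model such as logistic regression, and under the assumption that features are independent, SHAP values have a simple closed form in log-odds space: each feature's contribution is its coefficient times the feature's deviation from the background mean. The sketch below applies that closed form with NumPy instead of calling the `shap` library, so it runs without extra dependencies. Which explainer the project actually used is not stated here, and the data and feature names are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data; the feature names are placeholders, not the real columns.
X, y = make_classification(n_samples=500, n_features=4, n_informative=3,
                           n_redundant=0, random_state=0)
feature_names = ["contract", "tenure", "monthly_charges", "internet_service"]

model = LogisticRegression(max_iter=1000).fit(X, y)
w = model.coef_[0]

# Closed-form SHAP values for a linear model with independent features:
#   phi_i = w_i * (x_i - mean(x_i)), expressed in log-odds units.
background_mean = X.mean(axis=0)
shap_values = w * (X - background_mean)          # shape (n_samples, n_features)

# Base value: the model's log-odds output at the background mean.
base_value = model.decision_function(background_mean.reshape(1, -1))[0]

# Additivity: base value plus contributions reproduces each prediction.
reconstructed = base_value + shap_values.sum(axis=1)
print("additivity holds:", np.allclose(reconstructed, model.decision_function(X)))

# Global importance: mean absolute contribution per feature.
importance = np.abs(shap_values).mean(axis=0)
for name, score in sorted(zip(feature_names, importance), key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")
```

Ranking features by mean absolute SHAP value is a common way to produce a global importance list from per-customer explanations.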

7. Deployment

- Built an interactive web app using Streamlit
- Serves real-time churn predictions

Model Performance

| Model | Accuracy | Precision | Recall | F1 Score | ROC-AUC |
| --- | --- | --- | --- | --- | --- |
| Logistic Regression | 0.737 | 0.503 | 0.773 | 0.610 | 0.843 |
| Random Forest | 0.793 | 0.640 | 0.500 | 0.562 | 0.840 |
| XGBoost | 0.769 | 0.553 | 0.687 | 0.613 | 0.833 |

✅ Best Model: Logistic Regression (based on highest Recall & ROC-AUC)

Key Insights from EDA

- Customers with short tenure are more likely to churn
- Month-to-month contracts have the highest churn rate
- Higher monthly charges increase churn probability
- Customers without support/security services churn more

SHAP Explainability

SHAP analysis revealed the most important features:

- Contract type
- Tenure months
- Monthly charges
- Internet service type

These features have the strongest influence on the model's churn predictions.

Streamlit App Features

The deployed app allows users to:

- Enter customer details
- Get churn probability in real time
- View risk level (High / Low)
- Test scenarios through an interactive UI
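
The prediction step behind such an app might look like the sketch below. It covers only the input handling and the risk label, without the Streamlit UI. The helper names `prepare_input` and `risk_level`, the 0.5 threshold, and the example column list are hypothetical; in the real app the column list would presumably come from `model_columns.pkl` and the probability from the model stored in `churn_model.pkl`.

```python
import pandas as pd

# Stand-in for the column list saved at training time (assumed contents).
MODEL_COLUMNS = [
    "Tenure Months",
    "Monthly Charges",
    "Contract_Month-to-month",
    "Contract_One year",
    "Contract_Two year",
]


def prepare_input(raw: dict, model_columns: list) -> pd.DataFrame:
    """One-hot encode a single customer and align it to the training columns."""
    row = pd.get_dummies(pd.DataFrame([raw]))
    # Dummy columns missing from this row are added as 0; extras are dropped.
    return row.reindex(columns=model_columns, fill_value=0)


def risk_level(probability: float, threshold: float = 0.5) -> str:
    """Map a churn probability to a High / Low label (threshold is an assumption)."""
    return "High" if probability >= threshold else "Low"


customer = {"Tenure Months": 3, "Monthly Charges": 89.5, "Contract": "Month-to-month"}
features = prepare_input(customer, MODEL_COLUMNS)
print(features.iloc[0].to_dict())
print(risk_level(0.72))
```

Aligning to the saved column list matters because a single customer row only produces dummy columns for the categories it contains, while the model expects every column it saw during training.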

Tech Stack

- Python
- Pandas & NumPy
- Scikit-learn
- XGBoost
- SHAP
- Streamlit
- Matplotlib & Seaborn

Project Structure

    customer-churn-project/
    │
    ├── app.py
    ├── churn_model.pkl
    ├── model_columns.pkl
    ├── requirements.txt
    ├── Telco-Customer-Churn.csv
    │
    ├── notebooks/
    │   └── eda_and_modeling.ipynb
    │
    └── README.md

Installation & Setup

1. Clone Repository

    git clone https://github.com/your-username/churn-prediction.git
    cd churn-prediction

2. Install Dependencies

    pip install -r requirements.txt

3. Run Streamlit App

    streamlit run app.py

Requirements

- streamlit
- pandas
- numpy
- scikit-learn
- xgboost
- joblib
- matplotlib
- seaborn
- shap

Business Impact

This system helps telecom companies to:

- Identify at-risk customers early
- Reduce customer churn rate
- Improve retention strategies
- Increase revenue stability

Future Improvements

- Hyperparameter tuning (Optuna / GridSearch)
- Deep learning model comparison
- API deployment using FastAPI
- Dashboard with Power BI / Tableau
- Automated retraining pipeline

Author

Mohd Faizanullah
Machine Learning Enthusiast | AI Developer

⭐ If you like this project

Give this repo a ⭐ and connect for more ML projects!