---
license: mit
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
library_name: keras
tags:
- finance
---

# 💰 Bank Churn Prediction: AI for Smarter Customer Retention

## 🧩 Overview

Service businesses such as banks must contend with *customer churn*: customers leaving for another service provider. Understanding which aspects of the service influence a customer's decision to leave lets management concentrate its improvement efforts where they matter most.

**Objective**

As a data scientist at the bank, build a neural network-based classifier that predicts whether a customer will leave the bank within the next 6 months.

---

## 🤖 Model Details

- **Model Type:** Feed-forward ANN (Artificial Neural Network), binary classifier
- **Framework:** TensorFlow / Keras
- **Dataset:** `Churn.csv` *(Bank Customer Churn Dataset, 10,000+ customers)*
- **Input:** Structured customer profile (credit score, age, balance, tenure, activity, salary, etc.)
- **Output:** Binary churn prediction (`0 = stays`, `1 = churns`)
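
The metadata of this card lists accuracy, precision, recall, and F1 as evaluation metrics. As a rough illustration of how they relate to the `0 = stays` / `1 = churns` output, here is a minimal pure-Python sketch. The helper name `binary_metrics` and the toy labels are invented for illustration; they do not come from the notebook, and no real results are implied.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for 0/1 labels, where 1 = churn."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # stayers left alone
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed churners

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels, made up for this example only
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```
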

---

## Data Dictionary

| Feature           | Description                                                        |
|-------------------|--------------------------------------------------------------------|
| `CustomerId`      | Unique ID assigned to each customer                                |
| `Surname`         | Customer's last name                                               |
| `CreditScore`     | Customer credit history                                            |
| `Geography`       | Customer location                                                  |
| `Gender`          | Gender of the customer                                             |
| `Age`             | Age of the customer                                                |
| `Tenure`          | Number of years with the bank                                      |
| `NumOfProducts`   | Number of bank products purchased                                  |
| `Balance`         | Account balance                                                    |
| `HasCrCard`       | Whether the customer has a credit card                             |
| `EstimatedSalary` | Estimated salary                                                   |
| `IsActiveMember`  | Whether the customer is an active member                           |
| `Exited`          | Target label: 0 = No (customer stays), 1 = Yes (customer churns)   |

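
`Geography` and `Gender` are the two categorical features in the table, so they have to be converted to numeric columns before reaching the network. The sketch below shows one way to do that by hand, assuming the category values mentioned in the usage example (France, Germany, Spain; Male, Female) and a drop-first one-hot scheme in which the alphabetically first level is the all-zeros baseline. The `CATEGORIES` mapping and the `one_hot_drop_first` helper are hypothetical names for illustration; the column naming (`Feature_Level`) follows the default convention of `pd.get_dummies`.

```python
# Assumed category levels; the first level of each feature is the baseline.
CATEGORIES = {
    "Geography": ["France", "Germany", "Spain"],
    "Gender": ["Female", "Male"],
}


def one_hot_drop_first(record):
    """Return drop-first one-hot columns for the categorical fields of one record."""
    encoded = {}
    for feature, levels in CATEGORIES.items():
        value = record[feature]
        if value not in levels:
            raise ValueError(f"Unexpected {feature} value: {value!r}")
        # Skip levels[0]: the baseline is represented by all zeros.
        for level in levels[1:]:
            encoded[f"{feature}_{level}"] = int(value == level)
    return encoded


print(one_hot_drop_first({"Geography": "Germany", "Gender": "Male"}))
# {'Geography_Germany': 1, 'Geography_Spain': 0, 'Gender_Male': 1}
```
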

---

## Why It Matters

- ✅ Detects early signs of potential churn
- ✅ Enables targeted retention strategies
- ✅ Improves customer engagement and loyalty
- ✅ Helps maximize profitability and reduce attrition rates

**Full Source Notebook:**

The complete training and evaluation notebook is available on GitHub:

[View on GitHub](https://github.com/joyjitroy/Machine_Learning/blob/main/Bank_Customer_Churn_Prediction_using_Artificial_Neural_Networks.ipynb)

---

## Example Usage

```python
# Inference example aligned to the dataset schema in the data dictionary

import pandas as pd

CSV_PATH = "Data/Churn.csv"
TARGET = "Exited"

# Columns exactly as listed in the data dictionary
ALL_COLS = [
    "CustomerId", "Surname", "CreditScore", "Geography", "Gender", "Age", "Tenure",
    "NumOfProducts", "Balance", "HasCrCard", "EstimatedSalary", "IsActiveMember", "Exited",
]

NUM_COLS = [
    "CreditScore", "Age", "Tenure", "NumOfProducts", "Balance",
    "EstimatedSalary", "HasCrCard", "IsActiveMember",
]
CAT_COLS = ["Geography", "Gender"]
DROP_COLS = ["CustomerId", "Surname"]

# Load data
df = pd.read_csv(CSV_PATH)[ALL_COLS]

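def prepare_features(df_in, fit_cols=None):
    """One-hot encode the categoricals and optionally align to training columns."""
    # errors="ignore" lets inference rows omit the target or ID columns
    X = df_in.drop(columns=[TARGET] + DROP_COLS, errors="ignore").copy()
    # One-hot encode the two categoricals. Drop the baseline level only when
    # encoding the full dataset: on a single row, drop_first=True would drop
    # the row's only level, so every dummy column would end up as 0.
    X = pd.get_dummies(X, columns=CAT_COLS, drop_first=fit_cols is None)
    # Align to the training columns (any extra baseline columns are discarded)
    if fit_cols is not None:
        X = X.reindex(columns=fit_cols, fill_value=0)
    # Cast bool/int columns to a single numeric dtype for the network
    return X.astype("float32")
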
# `model` and `feature_cols` must be defined by one of the two options below
# before running the prediction step.

# If you trained in this notebook, reuse `model` and `feature_cols` from training:
# model.save("bank_churn_ann.keras")
# feature_cols = X_train.columns.tolist()

# If loading a saved model:
# from tensorflow.keras.models import load_model
# model = load_model("bank_churn_ann.keras")
# feature_cols = [...]  # same list you used during training after get_dummies

# Single example constructed with the dataset schema
example = {
    "CustomerId": 15788241,
    "Surname": "Smith",
    "CreditScore": 600,
    "Geography": "Germany",  # France, Germany, Spain in this dataset
    "Gender": "Male",        # Male or Female
    "Age": 40,
    "Tenure": 3,
    "NumOfProducts": 2,
    "Balance": 60000.0,
    "HasCrCard": 1,
    "EstimatedSalary": 50000.0,
    "IsActiveMember": 1,
    "Exited": 0,             # ignored at inference
}

ex_df = pd.DataFrame([example])
X_ex = prepare_features(ex_df, fit_cols=feature_cols)

# Take the single output value for the first (only) row as the churn probability
pred_prob = float(model.predict(X_ex, verbose=0)[0][0])
pred = int(pred_prob >= 0.5)  # 0.5 decision threshold

print(f"Churn Probability: {pred_prob:.4f}")
print("Prediction:", "Likely to churn" if pred else "Likely to stay")