Spaces:
Running
Running
File size: 5,760 Bytes
854c114 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 | {
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ML Practice Series: Module 23 - Model Explainability (SHAP)\n",
"\n",
"Welcome to the final \"Industry-Grade\" module! **Model Explainability** is about knowing *why* your model made a decision. This is critical for building trust, especially in sensitive areas like finance or medicine.\n",
"\n",
"### Objectives:\n",
"1. **Global Interpretability**: Which features matter most across the whole dataset?\n",
"2. **Local Interpretability**: Why was *this specific person* denied a loan?\n",
"3. **SHAP values**: Game-theoretic approach to feature contribution.\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Setup\n",
"We will use a small Random Forest classifier on the **Breast Cancer** dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"from sklearn.datasets import load_breast_cancer\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"# Note: You will need to install shap: pip install shap\n",
"import shap\n",
"\n",
"# Load data\n",
"data = load_breast_cancer()\n",
"X = pd.DataFrame(data.data, columns=data.feature_names)\n",
"y = data.target\n",
"\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n",
"\n",
"# Train a model\n",
"model = RandomForestClassifier(n_estimators=100, random_state=42)\n",
"model.fit(X_train, y_train)\n",
"\n",
"print(\"Model trained!\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Using SHAP (Global)\n",
"\n",
"### Task 1: Summary Plot\n",
"Create a SHAP Tree Explainer and plot a summary of the feature importances. This is more detailed than standard feature importance as it shows the direction (positive/negative) of the impact."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# YOUR CODE HERE\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<details>\n",
"<summary><b>Click to see Solution</b></summary>\n",
"\n",
"```python\n",
"explainer = shap.TreeExplainer(model)\n",
"shap_values = explainer.shap_values(X_test)\n",
"\n",
"# For binary classification, use [1] for the positive class\n",
"shap.summary_plot(shap_values[1], X_test)\n",
"```\n",
"</details>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Local Performance\n",
"\n",
"### Task 2: Force Plot\n",
"Pick the first person in the test set and explain the model's prediction for them specifically using a force plot."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# YOUR CODE HERE\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<details>\n",
"<summary><b>Click to see Solution</b></summary>\n",
"\n",
"```python\n",
"# Plot for the first record in the test set\n",
"shap.initjs()\n",
"shap.force_plot(explainer.expected_value[1], shap_values[1][0,:], X_test.iloc[0,:])\n",
"```\n",
"</details>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"--- \n",
"### The Ultimate Skill Unlocked! \n",
"You can now explain black-box models to humans. This is the mark of a top-tier Data Scientist.\n",
"You have completed all 23 modules of the master series!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.7"
}
},
"nbformat": 4,
"nbformat_minor": 4
} |