feat: Rework PoC notebook for clarity and user experience
Refactors the PoC notebook to create a clear, step-by-step narrative for a non-technical audience. The new structure guides the user from the initial problem (the student profile) to the final, synthesized recommendation.
Key improvements include:
- Adds a universal, robust setup cell that automatically clones the repo and uses `sys.executable` for reliable package installation.
- Proactively silences common warnings (`TqdmWarning`, `TOKENIZERS_PARALLELISM`) for clean, uncluttered output.
- Simplifies technical concepts using analogies and focuses the narrative on the value delivered to educators.
- Replaces the final interactive prompt with a direct call-to-action, linking to the live Hugging Face Spaces demo for a more polished and interactive conclusion.
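
The setup behavior described in the first two bullets can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's actual cell: the helper names `build_pip_command` and `install_requirements` are hypothetical, and the sketch assumes pip is installed for the active interpreter. Unlike a bare shell call, it checks the return code so that a failed install is reported instead of being followed by a success message.

```python
import os
import subprocess
import sys
import warnings
from pathlib import Path

# Quiet the tokenizers parallelism warning before any model code is imported.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

# Ignore tqdm's notebook-widget warning, if tqdm is installed at all.
try:
    from tqdm import TqdmWarning

    warnings.filterwarnings("ignore", category=TqdmWarning)
except ImportError:
    pass


def build_pip_command(requirements: Path) -> list[str]:
    """Build a pip command bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", "-q", "-r", str(requirements)]


def install_requirements(requirements: Path) -> bool:
    """Hypothetical helper: run pip and report whether it succeeded."""
    result = subprocess.run(
        build_pip_command(requirements), capture_output=True, text=True
    )
    if result.returncode != 0:
        print(f"Package installation failed: {result.stderr.strip()}")
        return False
    return True
```

Invoking pip through `sys.executable -m pip` targets the same environment the kernel is running in, which is the reliability benefit the commit message refers to.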
- notebooks/fot_recommender_poc.ipynb +288 -195
|
@@ -2,161 +2,149 @@
|
|
| 2 |
"cells": [
|
| 3 |
{
|
| 4 |
"cell_type": "markdown",
|
| 5 |
-
"id": "
|
| 6 |
"metadata": {},
|
| 7 |
"source": [
|
| 8 |
"# Freshman On-Track (FOT) Intervention Recommender\n",
|
| 9 |
"### A Standalone Proof-of-Concept\n",
|
| 10 |
"\n",
|
| 11 |
-
"
|
| 12 |
"\n",
|
| 13 |
-
"
|
| 14 |
]
|
| 15 |
},
|
| 16 |
{
|
| 17 |
"cell_type": "markdown",
|
| 18 |
-
"id": "
|
| 19 |
"metadata": {},
|
| 20 |
"source": [
|
| 21 |
-
"## 1
|
| 22 |
"\n",
|
| 23 |
-
"
|
| 24 |
"\n",
|
| 25 |
-
"
|
| 26 |
-
"1. **Define Project Source**: We specify the official GitHub repository for this project so it's clear where the code comes from.\n",
|
| 27 |
-
"2. **Detect Environment**: The notebook checks if it's running inside the local project folder or as a standalone file.\n",
|
| 28 |
-
"3. **Prepare Environment**: A helper script is called to do the heavy lifting:\n",
|
| 29 |
-
" - If **local**, it uses your existing project files.\n",
|
| 30 |
-
" - If **standalone**, it clones the repository and installs all dependencies for you.\n",
|
| 31 |
-
"\n",
|
| 32 |
-
"After running this one cell, the environment will be ready for the demonstration."
|
| 33 |
]
|
| 34 |
},
|
| 35 |
{
|
| 36 |
"cell_type": "code",
|
| 37 |
-
"execution_count":
|
| 38 |
-
"id": "
|
| 39 |
"metadata": {},
|
| 40 |
"outputs": [
|
| 41 |
{
|
| 42 |
"name": "stdout",
|
| 43 |
"output_type": "stream",
|
| 44 |
"text": [
|
| 45 |
-
"
|
| 46 |
-
"
|
| 47 |
-
"
|
| 48 |
-
"🎉 Local environment is ready!\n"
|
| 49 |
]
|
| 50 |
}
|
| 51 |
],
|
| 52 |
"source": [
|
| 53 |
-
"import sys\n",
|
| 54 |
"from pathlib import Path\n",
|
|
|
|
| 55 |
"\n",
|
| 56 |
-
"#
|
| 57 |
-
"
|
| 58 |
-
"
|
| 59 |
-
"\n",
|
| 60 |
-
"# print(\"🚀 Setting up the environment...\")\n",
|
| 61 |
-
"\n",
|
| 62 |
-
"# # --- Clone the Repository & Install Dependencies ---\n",
|
| 63 |
-
"# !git clone -q {REPO_URL}\n",
|
| 64 |
-
"# %pip install -q -r {PROJECT_DIR_NAME}/requirements.txt\n",
|
| 65 |
-
"\n",
|
| 66 |
-
"# # --- Configure Python Path ---\n",
|
| 67 |
-
"# project_path = Path.cwd() / PROJECT_DIR_NAME\n",
|
| 68 |
-
"# src_path = project_path / \"src\"\n",
|
| 69 |
-
"# sys.path.insert(0, str(src_path))\n",
|
| 70 |
-
"\n",
|
| 71 |
-
"# print(\"\\n🎉 Environment is ready!\")\n",
|
| 72 |
"\n",
|
| 73 |
"\n",
|
| 74 |
-
"
|
| 75 |
"\n",
|
| 76 |
-
"#
|
| 77 |
-
"project_path = Path
|
| 78 |
"\n",
|
| 79 |
-
"
|
| 80 |
-
"src_path = project_path / \"src\"\n",
|
| 81 |
-
"if str(src_path) not in sys.path:\n",
|
| 82 |
-
" sys.path.insert(0, str(src_path))\n",
|
| 83 |
-
"\n",
|
| 84 |
-
"print(f\" - Using local project root: {project_path}\")\n",
|
| 85 |
-
"print(\"\\n🎉 Local environment is ready!\")"
|
| 86 |
]
|
| 87 |
},
|
| 88 |
{
|
| 89 |
"cell_type": "markdown",
|
| 90 |
-
"id": "
|
| 91 |
"metadata": {},
|
| 92 |
"source": [
|
| 93 |
-
"## 2
|
|
|
|
|
|
|
| 94 |
"\n",
|
| 95 |
-
"
|
| 96 |
]
|
| 97 |
},
|
| 98 |
{
|
| 99 |
"cell_type": "code",
|
| 100 |
"execution_count": 2,
|
| 101 |
-
"id": "
|
| 102 |
"metadata": {},
|
| 103 |
"outputs": [
|
| 104 |
-
{
|
| 105 |
-
"name": "stderr",
|
| 106 |
-
"output_type": "stream",
|
| 107 |
-
"text": [
|
| 108 |
-
"/Users/charlesfeinn/Developer/job_applications/fot-intervention-recommender/.venv/lib/python3.12/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
|
| 109 |
-
" from .autonotebook import tqdm as notebook_tqdm\n"
|
| 110 |
-
]
|
| 111 |
-
},
|
| 112 |
-
{
|
| 113 |
-
"name": "stdout",
|
| 114 |
-
"output_type": "stream",
|
| 115 |
-
"text": [
|
| 116 |
-
"Successfully loaded 27 intervention chunks.\n"
|
| 117 |
-
]
|
| 118 |
-
},
|
| 119 |
{
|
| 120 |
"data": {
|
| 121 |
"text/plain": [
|
| 122 |
-
"
|
| 123 |
-
" 'source_document': 'NCS_OTToolkit_2ndEd_October_2017_updated.pdf',\n",
|
| 124 |
-
" 'fot_pages': 'Pages: 44',\n",
|
| 125 |
-
" 'content_for_embedding': 'Title: Strategy: Leadership Roles. Content: Principal Role:\\n• Implementation: Reviews and interrogates interim freshman success-related data in light of Success Team goals, and strategizes with team leadership around next steps',\n",
|
| 126 |
-
" 'original_content': 'Principal Role:\\n• Implementation: Reviews and interrogates interim freshman success-related data in light of Success Team goals, and strategizes with team leadership around next steps'}"
|
| 127 |
]
|
| 128 |
},
|
| 129 |
-
"execution_count": 2,
|
| 130 |
"metadata": {},
|
| 131 |
-
"output_type": "
|
| 132 |
}
|
| 133 |
],
|
| 134 |
"source": [
|
| 135 |
-
"
|
| 136 |
-
"from fot_recommender.rag_pipeline import (\n",
|
| 137 |
-
" load_knowledge_base,\n",
|
| 138 |
-
" initialize_embedding_model,\n",
|
| 139 |
-
" create_embeddings,\n",
|
| 140 |
-
" create_vector_db,\n",
|
| 141 |
-
" search_interventions,\n",
|
| 142 |
-
")\n",
|
| 143 |
"\n",
|
| 144 |
-
"
|
| 145 |
-
"
|
| 146 |
"\n",
|
| 147 |
-
"
|
| 148 |
-
"knowledge_base_chunks = load_knowledge_base(str(kb_path))\n",
|
| 149 |
"\n",
|
| 150 |
-
"
|
| 151 |
-
|
| 152 |
]
|
| 153 |
},
|
| 154 |
{
|
| 155 |
"cell_type": "code",
|
| 156 |
-
"execution_count":
|
| 157 |
-
"id": "
|
| 158 |
"metadata": {},
|
| 159 |
"outputs": [
|
| 160 |
{
|
| 161 |
"name": "stdout",
|
| 162 |
"output_type": "stream",
|
|
@@ -170,9 +158,9 @@
|
|
| 170 |
"name": "stderr",
|
| 171 |
"output_type": "stream",
|
| 172 |
"text": [
|
| 173 |
-
"Batches: 0%|
|
| 174 |
" return forward_call(*args, **kwargs)\n",
|
| 175 |
-
"Batches: 100
|
| 176 |
]
|
| 177 |
},
|
| 178 |
{
|
|
@@ -181,171 +169,276 @@
|
|
| 181 |
"text": [
|
| 182 |
"Embeddings created successfully.\n",
|
| 183 |
"Creating FAISS index with dimension 384...\n",
|
| 184 |
-
"FAISS index created with 27 vectors.\n"
|
| 185 |
]
|
| 186 |
},
|
| 187 |
{
|
| 188 |
-
"
|
| 189 |
-
|
| 190 |
-
|
| 191 |
-
|
| 192 |
-
|
| 193 |
}
|
| 194 |
],
|
| 195 |
"source": [
|
| 196 |
-
"#
|
| 197 |
-
"
|
| 198 |
-
"
|
| 199 |
-
"
|
| 200 |
-
"
|
| 201 |
-
"
|
| 202 |
-
"
|
| 203 |
-
"
|
| 204 |
-
"\n",
|
| 205 |
-
"
|
| 206 |
"embedding_model = initialize_embedding_model()\n",
|
| 207 |
"\n",
|
| 208 |
-
"#
|
| 209 |
"embeddings = create_embeddings(knowledge_base_chunks, embedding_model)\n",
|
| 210 |
"\n",
|
| 211 |
-
"#
|
| 212 |
-
"
|
|
|
|
|
|
|
| 213 |
]
|
| 214 |
},
|
| 215 |
{
|
| 216 |
"cell_type": "markdown",
|
| 217 |
-
"id": "
|
| 218 |
"metadata": {},
|
| 219 |
"source": [
|
| 220 |
-
"##
|
| 221 |
"\n",
|
| 222 |
-
"
|
| 223 |
"\n",
|
| 224 |
-
"
|
| 225 |
-
"- \"A student is missing a lot of school and their grades are suffering.\"\n",
|
| 226 |
-
"- \"This freshman has good attendance but is failing math and science and seems disengaged.\"\n",
|
| 227 |
-
"- \"A student has multiple behavior incidents and is struggling to connect with teachers.\"\n",
|
| 228 |
"\n",
|
| 229 |
-
"
|
| 230 |
]
|
| 231 |
},
|
| 232 |
{
|
| 233 |
"cell_type": "code",
|
| 234 |
-
"execution_count":
|
| 235 |
-
"id": "
|
| 236 |
"metadata": {},
|
| 237 |
"outputs": [
|
| 238 |
-
{
|
| 239 |
-
"name": "stdin",
|
| 240 |
-
"output_type": "stream",
|
| 241 |
-
"text": [
|
| 242 |
-
"Enter a description of a student's challenges: asdf\n"
|
| 243 |
-
]
|
| 244 |
-
},
|
| 245 |
{
|
| 246 |
"name": "stdout",
|
| 247 |
"output_type": "stream",
|
| 248 |
"text": [
|
| 249 |
"\n",
|
| 250 |
-
"
|
| 251 |
-
"
|
| 252 |
-
"Searching for top 3 interventions for query: 'asdf...'\n",
|
| 253 |
-
"Found 0 relevant interventions.\n",
|
| 254 |
-
"\n",
|
| 255 |
-
"No relevant interventions were found for this query.\n"
|
| 256 |
]
|
| 257 |
}
|
| 258 |
],
|
| 259 |
"source": [
|
| 260 |
-
"from
|
| 261 |
-
"\n",
|
| 262 |
-
"# Prompt the user to enter their own query\n",
|
| 263 |
-
"user_query = input(\"Enter a description of a student's challenges: \")\n",
|
| 264 |
"\n",
|
| 265 |
-
"if
|
| 266 |
-
"
|
|
|
|
| 267 |
"\n",
|
| 268 |
-
"
|
| 269 |
-
"
|
| 270 |
-
"
|
| 271 |
-
"
|
| 272 |
-
"
|
| 273 |
-
"
|
| 274 |
-
"
|
|
|
|
|
|
|
| 275 |
" )\n",
|
| 276 |
"\n",
|
| 277 |
-
"
|
| 278 |
-
"
|
| 279 |
-
"\n",
|
| 280 |
-
"else:\n",
|
| 281 |
-
" print(\"\\nNo query entered. Skipping custom search.\")"
|
| 282 |
]
|
| 283 |
},
|
| 284 |
{
|
| 285 |
"cell_type": "markdown",
|
| 286 |
-
"id": "
|
| 287 |
"metadata": {},
|
| 288 |
-
"source": [
|
| 289 |
},
|
| 290 |
{
|
| 291 |
"cell_type": "code",
|
| 292 |
-
"execution_count":
|
| 293 |
-
"id": "
|
| 294 |
"metadata": {},
|
| 295 |
-
"outputs": [
|
| 296 |
-
|
| 297 |
},
|
| 298 |
{
|
| 299 |
"cell_type": "markdown",
|
| 300 |
-
"id": "
|
| 301 |
-
"metadata": {},
|
| 302 |
-
"source": []
|
| 303 |
-
},
|
| 304 |
-
{
|
| 305 |
-
"cell_type": "code",
|
| 306 |
-
"execution_count": null,
|
| 307 |
-
"id": "665cb647-97da-441f-81b0-ae7b908fdd2f",
|
| 308 |
-
"metadata": {},
|
| 309 |
-
"outputs": [],
|
| 310 |
-
"source": []
|
| 311 |
-
},
|
| 312 |
-
{
|
| 313 |
-
"cell_type": "code",
|
| 314 |
-
"execution_count": null,
|
| 315 |
-
"id": "92729636-bd91-4b0c-ac35-5ed82797a1f2",
|
| 316 |
"metadata": {},
|
| 317 |
-
"outputs": [],
|
| 318 |
"source": [
|
| 319 |
-
"
|
| 320 |
-
"from pathlib import Path\n",
|
| 321 |
"\n",
|
| 322 |
-
"
|
| 323 |
-
"project_path_to_clean = Path.cwd() / \"fot-recommender-poc-workspace\"\n",
|
| 324 |
"\n",
|
| 325 |
-
"
|
| 326 |
-
" print(f\"The project directory '{project_path_to_clean}' was found.\")\n",
|
| 327 |
"\n",
|
| 328 |
-
" # Ask for user confirmation before deleting anything\n",
|
| 329 |
-
" response = input(\n",
|
| 330 |
-
" \"Would you like to delete the git repository folder that was downloaded during the running of this notebook? (y/n): \"\n",
|
| 331 |
-
" )\n",
|
| 332 |
"\n",
|
| 333 |
-
"
|
| 334 |
-
" try:\n",
|
| 335 |
-
" shutil.rmtree(project_path_to_clean)\n",
|
| 336 |
-
" print(f\"✅ Successfully deleted '{project_path_to_clean}'.\")\n",
|
| 337 |
-
" except OSError as e:\n",
|
| 338 |
-
" print(f\"Error: {e.strerror}. Could not delete the directory.\")\n",
|
| 339 |
-
" else:\n",
|
| 340 |
-
" print(\"Cleanup skipped.\")\n",
|
| 341 |
-
"else:\n",
|
| 342 |
-
" print(\"Project directory not found. Nothing to clean up.\")"
|
| 343 |
]
|
| 344 |
},
|
| 345 |
{
|
| 346 |
"cell_type": "code",
|
| 347 |
"execution_count": null,
|
| 348 |
-
"id": "
|
| 349 |
"metadata": {},
|
| 350 |
"outputs": [],
|
| 351 |
"source": []
|
|
|
|
| 2 |
"cells": [
|
| 3 |
{
|
| 4 |
"cell_type": "markdown",
|
| 5 |
+
"id": "6508e7df",
|
| 6 |
"metadata": {},
|
| 7 |
"source": [
|
| 8 |
"# Freshman On-Track (FOT) Intervention Recommender\n",
|
| 9 |
"### A Standalone Proof-of-Concept\n",
|
| 10 |
"\n",
|
| 11 |
+
"**Goal:** To show, in a few simple steps, how we can turn a description of a struggling student into a set of clear, actionable, and evidence-based recommendations for an educator.\n",
|
| 12 |
"\n",
|
| 13 |
+
"This notebook demonstrates the core Retrieval-Augmented Generation (RAG) pipeline that powers our recommender."
|
| 14 |
]
|
| 15 |
},
|
| 16 |
{
|
| 17 |
"cell_type": "markdown",
|
| 18 |
+
"id": "62c68815",
|
| 19 |
"metadata": {},
|
| 20 |
"source": [
|
| 21 |
+
"## Step 1: Setting Up the Environment\n",
|
| 22 |
"\n",
|
| 23 |
+
"First, we need to load the project's code and install its dependencies. This cell prepares the notebook to run our custom logic.\n",
|
| 24 |
"\n",
|
| 25 |
+
"*(This notebook is designed to run in Google Colab, but the code below will also adapt to a local environment if the project files are present.)*"
|
| 26 |
]
|
| 27 |
},
|
| 28 |
{
|
| 29 |
"cell_type": "code",
|
| 30 |
+
"execution_count": 8,
|
| 31 |
+
"id": "97f37783",
|
| 32 |
"metadata": {},
|
| 33 |
"outputs": [
|
| 34 |
{
|
| 35 |
"name": "stdout",
|
| 36 |
"output_type": "stream",
|
| 37 |
"text": [
|
| 38 |
+
"📦 Setting up the environment...\n",
|
| 39 |
+
"/Users/charlesfeinn/Developer/job_applications/fot-intervention-recommender/.venv/bin/python3: No module named pip\n",
|
| 40 |
+
"✅ Environment is ready!\n"
|
|
|
|
| 41 |
]
|
| 42 |
}
|
| 43 |
],
|
| 44 |
"source": [
|
| 45 |
+
"import sys, os, warnings\n",
|
| 46 |
"from pathlib import Path\n",
|
| 47 |
+
"from tqdm import TqdmWarning\n",
|
| 48 |
"\n",
|
| 49 |
+
"# This prevents common, harmless warnings from cluttering the output.\n",
|
| 50 |
+
"os.environ[\"TOKENIZERS_PARALLELISM\"] = \"false\"\n",
|
| 51 |
+
"warnings.filterwarnings(\"ignore\", category=TqdmWarning)\n",
|
| 52 |
"\n",
|
| 53 |
+
"# Clones the project from GitHub, but only if it doesn't already exist.\n",
|
| 54 |
+
"PROJECT_DIR = \"fot-intervention-recommender\"\n",
|
| 55 |
+
"if not Path(PROJECT_DIR).is_dir():\n",
|
| 56 |
+
" print(\"🚀 Downloading project files...\")\n",
|
| 57 |
+
" !git clone -q https://github.com/chuckfinca/fot-intervention-recommender.git\n",
|
| 58 |
"\n",
|
| 59 |
+
"# Installs packages and adds the project's code to our Python path.\n",
|
| 60 |
+
"print(\"📦 Setting up the environment...\")\n",
|
| 61 |
+
"!{sys.executable} -m pip install -q -r {PROJECT_DIR}/requirements.txt\n",
|
| 62 |
+
"sys.path.insert(0, str(Path(PROJECT_DIR) / \"src\"))\n",
|
| 63 |
"\n",
|
| 64 |
+
"# Define the project_path variable needed by the rest of the notebook\n",
|
| 65 |
+
"project_path = Path(PROJECT_DIR)\n",
|
| 66 |
"\n",
|
| 67 |
+
"print(\"✅ Environment is ready!\")"
|
| 68 |
]
|
| 69 |
},
|
| 70 |
{
|
| 71 |
"cell_type": "markdown",
|
| 72 |
+
"id": "f859223d",
|
| 73 |
"metadata": {},
|
| 74 |
"source": [
|
| 75 |
+
"## Step 2: Define the Student (The Input)\n",
|
| 76 |
+
"\n",
|
| 77 |
+
"Everything starts with a student. Our system takes a simple narrative summary that an educator might write. This summary describes the student's challenges in plain English. \n",
|
| 78 |
"\n",
|
| 79 |
+
"Let's use the sample profile from the project description."
|
| 80 |
]
|
| 81 |
},
|
| 82 |
{
|
| 83 |
"cell_type": "code",
|
| 84 |
"execution_count": 2,
|
| 85 |
+
"id": "3784865f",
|
| 86 |
"metadata": {},
|
| 87 |
"outputs": [
|
| 88 |
{
|
| 89 |
"data": {
|
| 90 |
+
"text/markdown": [
|
| 91 |
+
"**Student Query:**\n",
|
| 92 |
+
"> This student is struggling to keep up with coursework, having failed one core class and earning only 2.5 credits out of 4 credits expected for the semester. Attendance is becoming a concern at 88% for an average annual target of 90%, and they have had one behavioral incident. The student needs targeted academic and attendance support to get back on track for graduation."
|
| 93 |
+
],
|
| 94 |
"text/plain": [
|
| 95 |
+
"<IPython.core.display.Markdown object>"
|
| 96 |
]
|
| 97 |
},
|
|
|
|
| 98 |
"metadata": {},
|
| 99 |
+
"output_type": "display_data"
|
| 100 |
}
|
| 101 |
],
|
| 102 |
"source": [
|
| 103 |
+
"from IPython.display import display, Markdown\n",
|
| 104 |
"\n",
|
| 105 |
+
"student_profile = {\n",
|
| 106 |
+
" \"narrative_summary\": \"This student is struggling to keep up with coursework, \"\n",
|
| 107 |
+
" \"having failed one core class and earning only 2.5 credits out of 4 credits \"\n",
|
| 108 |
+
" \"expected for the semester. Attendance is becoming a concern at 88% for an average \"\n",
|
| 109 |
+
" \"annual target of 90%, and they have had one behavioral incident. \"\n",
|
| 110 |
+
" \"The student needs targeted academic and attendance support to get back on track for graduation.\"\n",
|
| 111 |
+
"}\n",
|
| 112 |
"\n",
|
| 113 |
+
"student_query = student_profile[\"narrative_summary\"]\n",
|
|
|
|
| 114 |
"\n",
|
| 115 |
+
"display(Markdown(f\"**Student Query:**\\n> {student_query}\"))"
|
| 116 |
+
]
|
| 117 |
+
},
|
| 118 |
+
{
|
| 119 |
+
"cell_type": "markdown",
|
| 120 |
+
"id": "530552a8",
|
| 121 |
+
"metadata": {},
|
| 122 |
+
"source": [
|
| 123 |
+
"## Step 3: Find Relevant Strategies (The \"Retrieval\" Step)\n",
|
| 124 |
+
"\n",
|
| 125 |
+
"Now, we take the student's story and find the most relevant strategies from our **Knowledge Base**—a curated library of best practices and proven interventions.\n",
|
| 126 |
+
"\n",
|
| 127 |
+
"How do we do this? \n",
|
| 128 |
+
"1. We've already converted our knowledge base documents into **vector embeddings** (unique digital fingerprints that capture meaning).\n",
|
| 129 |
+
"2. We use a **FAISS vector database**—a super-fast search index—to instantly find the documents with fingerprints most similar to the student's situation.\n",
|
| 130 |
+
"\n",
|
| 131 |
+
"Let's see which top 3 strategies our system retrieves for this student."
|
| 132 |
]
|
| 133 |
},
|
| 134 |
{
|
| 135 |
"cell_type": "code",
|
| 136 |
+
"execution_count": 3,
|
| 137 |
+
"id": "8cb679b9",
|
| 138 |
"metadata": {},
|
| 139 |
"outputs": [
|
| 140 |
+
{
|
| 141 |
+
"name": "stderr",
|
| 142 |
+
"output_type": "stream",
|
| 143 |
+
"text": [
|
| 144 |
+
"/Users/charlesfeinn/Developer/job_applications/fot-intervention-recommender/.venv/lib/python3.12/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
|
| 145 |
+
" from .autonotebook import tqdm as notebook_tqdm\n"
|
| 146 |
+
]
|
| 147 |
+
},
|
| 148 |
{
|
| 149 |
"name": "stdout",
|
| 150 |
"output_type": "stream",
|
|
|
|
| 158 |
"name": "stderr",
|
| 159 |
"output_type": "stream",
|
| 160 |
"text": [
|
| 161 |
+
"Batches: 0%| | 0/1 [00:00<?, ?it/s]/Users/charlesfeinn/Developer/job_applications/fot-intervention-recommender/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py:1520: FutureWarning: `encoder_attention_mask` is deprecated and will be removed in version 4.55.0 for `BertSdpaSelfAttention.forward`.\n",
|
| 162 |
" return forward_call(*args, **kwargs)\n",
|
| 163 |
+
"Batches: 100%|████████████████████████████████████| 1/1 [00:02<00:00, 2.08s/it]\n"
|
| 164 |
]
|
| 165 |
},
|
| 166 |
{
|
|
|
|
| 169 |
"text": [
|
| 170 |
"Embeddings created successfully.\n",
|
| 171 |
"Creating FAISS index with dimension 384...\n",
|
| 172 |
+
"FAISS index created with 27 vectors.\n",
|
| 173 |
+
"\n",
|
| 174 |
+
"Searching for top 3 interventions for query: 'This student is struggling to keep up with coursework, having failed one core cl...'\n",
|
| 175 |
+
"Found 3 relevant interventions.\n"
|
| 176 |
]
|
| 177 |
},
|
| 178 |
{
|
| 179 |
+
"data": {
|
| 180 |
+
"text/markdown": [
|
| 181 |
+
"**Top 3 Retrieved Strategies:**"
|
| 182 |
+
],
|
| 183 |
+
"text/plain": [
|
| 184 |
+
"<IPython.core.display.Markdown object>"
|
| 185 |
+
]
|
| 186 |
+
},
|
| 187 |
+
"metadata": {},
|
| 188 |
+
"output_type": "display_data"
|
| 189 |
+
},
|
| 190 |
+
{
|
| 191 |
+
"data": {
|
| 192 |
+
"text/markdown": [
|
| 193 |
+
"- **Strategy: Differentiating Intervention Tiers** (Source: *NCS_OTToolkit_2ndEd_October_2017_updated.pdf*, Relevance: 0.57)"
|
| 194 |
+
],
|
| 195 |
+
"text/plain": [
|
| 196 |
+
"<IPython.core.display.Markdown object>"
|
| 197 |
+
]
|
| 198 |
+
},
|
| 199 |
+
"metadata": {},
|
| 200 |
+
"output_type": "display_data"
|
| 201 |
+
},
|
| 202 |
+
{
|
| 203 |
+
"data": {
|
| 204 |
+
"text/markdown": [
|
| 205 |
+
"- **Tool: Intervention Tracking** (Source: *NCS_OTToolkit_2ndEd_October_2017_updated.pdf*, Relevance: 0.54)"
|
| 206 |
+
],
|
| 207 |
+
"text/plain": [
|
| 208 |
+
"<IPython.core.display.Markdown object>"
|
| 209 |
+
]
|
| 210 |
+
},
|
| 211 |
+
"metadata": {},
|
| 212 |
+
"output_type": "display_data"
|
| 213 |
+
},
|
| 214 |
+
{
|
| 215 |
+
"data": {
|
| 216 |
+
"text/markdown": [
|
| 217 |
+
"- **Tool: BAG Report (Example)** (Source: *NCS_OTToolkit_2ndEd_October_2017_updated.pdf*, Relevance: 0.53)"
|
| 218 |
+
],
|
| 219 |
+
"text/plain": [
|
| 220 |
+
"<IPython.core.display.Markdown object>"
|
| 221 |
+
]
|
| 222 |
+
},
|
| 223 |
+
"metadata": {},
|
| 224 |
+
"output_type": "display_data"
|
| 225 |
}
|
| 226 |
],
|
| 227 |
"source": [
|
| 228 |
+
"# Import the necessary functions from our project's code\n",
|
| 229 |
+
"from fot_recommender.rag_pipeline import (\n",
|
| 230 |
+
" load_knowledge_base,\n",
|
| 231 |
+
" initialize_embedding_model,\n",
|
| 232 |
+
" create_embeddings,\n",
|
| 233 |
+
" create_vector_db,\n",
|
| 234 |
+
" search_interventions,\n",
|
| 235 |
+
" generate_recommendation_summary\n",
|
| 236 |
+
")\n",
|
| 237 |
+
"from fot_recommender.utils import display_recommendations\n",
|
| 238 |
+
"\n",
|
| 239 |
+
"# --- Load all the components of our RAG system ---\n",
|
| 240 |
+
"\n",
|
| 241 |
+
"# 1. Load the chunked knowledge base\n",
|
| 242 |
+
"kb_path = project_path / \"data\" / \"processed\" / \"knowledge_base_final_chunks.json\"\n",
|
| 243 |
+
"knowledge_base_chunks = load_knowledge_base(str(kb_path))\n",
|
| 244 |
+
"\n",
|
| 245 |
+
"# 2. Initialize the embedding model\n",
|
| 246 |
"embedding_model = initialize_embedding_model()\n",
|
| 247 |
"\n",
|
| 248 |
+
"# 3. Create embeddings and the vector database\n",
|
| 249 |
"embeddings = create_embeddings(knowledge_base_chunks, embedding_model)\n",
|
| 250 |
+
"vector_db = create_vector_db(embeddings)\n",
|
| 251 |
+
"\n",
|
| 252 |
+
"# --- Perform the search! ---\n",
|
| 253 |
+
"retrieved_interventions = search_interventions(\n",
|
| 254 |
+
" query=student_query,\n",
|
| 255 |
+
" model=embedding_model,\n",
|
| 256 |
+
" index=vector_db,\n",
|
| 257 |
+
" knowledge_base=knowledge_base_chunks,\n",
|
| 258 |
+
" k=3,\n",
|
| 259 |
+
" min_similarity_score=0.4\n",
|
| 260 |
+
")\n",
|
| 261 |
"\n",
|
| 262 |
+
"# Display the titles of what we found\n",
|
| 263 |
+
"display(Markdown(\"**Top 3 Retrieved Strategies:**\"))\n",
|
| 264 |
+
"for chunk, score in retrieved_interventions:\n",
|
| 265 |
+
" display(Markdown(f\"- **{chunk['title']}** (Source: *{chunk['source_document']}*, Relevance: {score:.2f})\"))"
|
| 266 |
]
|
| 267 |
},
|
| 268 |
{
|
| 269 |
"cell_type": "markdown",
|
| 270 |
+
"id": "2202209d",
|
| 271 |
"metadata": {},
|
| 272 |
"source": [
|
| 273 |
+
"## Step 4: Create the Recommendation (The \"Generation\" Step)\n",
|
| 274 |
"\n",
|
| 275 |
+
"Finding the right documents is only half the battle. Raw research isn't very helpful to a busy teacher. \n",
|
| 276 |
"\n",
|
| 277 |
+
"In this final step, we use a powerful Large Language Model (Google's Gemini API) to act as an expert instructional coach. We give it the student's story and the relevant strategies we just retrieved. The AI's job is to **synthesize** this information into a concise, practical recommendation tailored specifically for a teacher.\n",
|
| 278 |
"\n",
|
| 279 |
+
"This is the final output of our system."
|
| 280 |
]
|
| 281 |
},
|
| 282 |
{
|
| 283 |
"cell_type": "code",
|
| 284 |
+
"execution_count": 4,
|
| 285 |
+
"id": "62ee35bc",
|
| 286 |
"metadata": {},
|
| 287 |
"outputs": [
|
| 288 |
{
|
| 289 |
"name": "stdout",
|
| 290 |
"output_type": "stream",
|
| 291 |
"text": [
|
| 292 |
"\n",
|
| 293 |
+
"Synthesizing recommendation for persona: 'teacher' using Gemini...\n",
|
| 294 |
+
"Synthesis complete.\n"
|
| 295 |
]
|
| 296 |
+
},
|
| 297 |
+
{
|
| 298 |
+
"data": {
|
| 299 |
+
"text/markdown": [
|
| 300 |
+
"### Final Synthesized Recommendation for the Teacher"
|
| 301 |
+
],
|
| 302 |
+
"text/plain": [
|
| 303 |
+
"<IPython.core.display.Markdown object>"
|
| 304 |
+
]
|
| 305 |
+
},
|
| 306 |
+
"metadata": {},
|
| 307 |
+
"output_type": "display_data"
|
| 308 |
+
},
|
| 309 |
+
{
|
| 310 |
+
"data": {
|
| 311 |
+
"text/markdown": [
|
| 312 |
+
"This student is experiencing academic difficulty, reflected in a 2.5 GPA and a failing grade in one core class, coupled with attendance concerns (88% attendance versus a 90% target) and one behavioral incident. To address these challenges and support the student's path to graduation, the following interventions are recommended:\n",
|
| 313 |
+
"\n",
|
| 314 |
+
"\n",
|
| 315 |
+
"**1. Implement a Tiered Intervention Strategy:** Determine the extent to which attendance is contributing to the student's academic struggles. (\"Strategy: Differentiating Intervention Tiers\"). If attendance is a significant factor, refer the student to the appropriate support services (Success Team or Attendance Dean, as indicated by the BAG Report format) to address these issues directly. This allows more focused support from the teaching staff for academic interventions.\n",
|
| 316 |
+
"\n",
|
| 317 |
+
"**2. Utilize a Robust Intervention Tracking System:** Implement a system to monitor the student's progress, focusing on attendance, GPA, and behavior. (\"Tool: Intervention Tracking\"). This system should clearly document interventions (e.g., tutoring sessions, mentorship meetings), and track the student’s progress in each core course (GPA and attendance rates) at two checkpoints within a ten-week period. The \"BAG Report\" format provides a useful template to track behavior, attendance and grades. This data will inform adjustments to the support plan.\n",
|
| 318 |
+
"\n",
|
| 319 |
+
"**3. Regularly Review the Student's \"BAG Report\" (or Equivalent):** Use a reporting mechanism (such as the BAG report example) to regularly review the student's performance across all three key areas: Behavior, Attendance, and Grades. This visual representation highlights areas of strength and areas requiring immediate intervention, allowing for proactive adjustments to support strategies. This aligns with the recommendation to monitor multiple key performance indicators to improve student outcomes effectively.\n"
|
| 320 |
+
],
|
| 321 |
+
"text/plain": [
|
| 322 |
+
"<IPython.core.display.Markdown object>"
|
| 323 |
+
]
|
| 324 |
+
},
|
| 325 |
+
"metadata": {},
|
| 326 |
+
"output_type": "display_data"
|
| 327 |
}
|
| 328 |
],
|
| 329 |
"source": [
|
| 330 |
+
"from dotenv import load_dotenv\n",
|
| 331 |
"\n",
|
| 332 |
+
"# Load the API key from a .env file (if it exists)\n",
|
| 333 |
+
"load_dotenv(project_path / '.env') \n",
|
| 334 |
+
"api_key = os.getenv(\"FOT_GOOGLE_API_KEY\")\n",
|
| 335 |
"\n",
|
| 336 |
+
"if not api_key:\n",
|
| 337 |
+
" print(\"✋ FOT_GOOGLE_API_KEY not found. Please provide your Google API key to generate the summary.\")\n",
|
| 338 |
+
" final_recommendation = \"(API Key not provided - could not generate summary)\"\n",
|
| 339 |
+
"else:\n",
|
| 340 |
+
" final_recommendation = generate_recommendation_summary(\n",
|
| 341 |
+
" retrieved_chunks=retrieved_interventions,\n",
|
| 342 |
+
" student_narrative=student_query,\n",
|
| 343 |
+
" api_key=api_key,\n",
|
| 344 |
+
" persona=\"teacher\"\n",
|
| 345 |
" )\n",
|
| 346 |
"\n",
|
| 347 |
+
"display(Markdown(\"### Final Synthesized Recommendation for the Teacher\"))\n",
|
| 348 |
+
"display(Markdown(final_recommendation))"
|
| 349 |
]
|
| 350 |
},
|
| 351 |
{
|
| 352 |
"cell_type": "markdown",
|
| 353 |
+
"id": "d3718297",
|
| 354 |
"metadata": {},
|
| 355 |
+
"source": [
|
| 356 |
+
"## Bonus: See the Evidence\n",
|
| 357 |
+
"\n",
|
| 358 |
+
"The recommendation above isn't just made up—it's directly grounded in the documents we retrieved. Here are the specific text snippets that the AI used to create its summary. This ensures our recommendations are always transparent and evidence-based."
|
| 359 |
+
]
|
| 360 |
},
|
| 361 |
{
|
| 362 |
"cell_type": "code",
|
| 363 |
+
"execution_count": 5,
|
| 364 |
+
"id": "1b0cb720",
|
| 365 |
"metadata": {},
|
| 366 |
+
"outputs": [
|
| 367 |
+
{
|
| 368 |
+
"name": "stdout",
|
| 369 |
+
"output_type": "stream",
|
| 370 |
+
"text": [
|
| 371 |
+
"\n",
|
| 372 |
+
"--- Top Recommended Interventions ---\n",
|
| 373 |
+
"\n",
|
| 374 |
+
"--- Recommendation 1 (Similarity Score: 0.5735) ---\n",
|
| 375 |
+
" Title: Strategy: Differentiating Intervention Tiers\n",
|
| 376 |
+
" Source: NCS_OTToolkit_2ndEd_October_2017_updated.pdf (Pages: 46)\n",
|
| 377 |
+
" \n",
|
| 378 |
+
" Content Snippet:\n",
|
| 379 |
+
" \"To what degree is attendance playing a role in student performance? To whom do you refer Tier 3 students who have serious attendance issues (inside and outside of the school) so that the Success Team can really concentrate on supporting Tier 2 students?...\"\n",
|
| 380 |
+
"--------------------------------------------------\n",
|
| 381 |
+
"\n",
|
| 382 |
+
"--- Recommendation 2 (Similarity Score: 0.5416) ---\n",
|
| 383 |
+
" Title: Tool: Intervention Tracking\n",
|
| 384 |
+
" Source: NCS_OTToolkit_2ndEd_October_2017_updated.pdf (Pages: 49)\n",
|
| 385 |
+
" \n",
|
| 386 |
+
" Content Snippet:\n",
|
| 387 |
+
" \"Features of Good Intervention Tracking Tools:\n",
|
| 388 |
+
" • Name of the intervention and what key performance indicator it addresses (attendance, point-in-time On-Track rates, GPA, behavior metric, etc.)\n",
|
| 389 |
+
" • Names of the targeted students\n",
|
| 390 |
+
" ° If tracking grades, include each core course's average expressed as a percentage\n",
|
| 391 |
+
" • Intervention contacts/implementation evidence\n",
|
| 392 |
+
" ° Tutoring attendance\n",
|
| 393 |
+
" ° Mentorship contact dates\n",
|
| 394 |
+
" ° \"Office hours\" visits\n",
|
| 395 |
+
" • Point-in-time progress on the key performance...\"\n",
|
| 396 |
+
"--------------------------------------------------\n",
|
| 397 |
+
"\n",
|
| 398 |
+
"--- Recommendation 3 (Similarity Score: 0.5328) ---\n",
|
| 399 |
+
" Title: Tool: BAG Report (Example)\n",
|
| 400 |
+
" Source: NCS_OTToolkit_2ndEd_October_2017_updated.pdf (Pages: 61)\n",
|
| 401 |
+
" \n",
|
| 402 |
+
" Content Snippet:\n",
|
| 403 |
+
" \"Student: Keith\n",
|
| 404 |
+
" Grade Level: 9\n",
|
| 405 |
+
" 8th Period Teacher: Donson\n",
|
| 406 |
+
" The numbers below reflect totals through Semester 1\n",
|
| 407 |
+
" \n",
|
| 408 |
+
" BEHAVIOR - In what ways do I contribute to a Safe and Respectful school climate?\n",
|
| 409 |
+
" • # of Infractions (# of Major Infractions): 5 (1)\n",
|
| 410 |
+
" • # of Days of In-School-Suspension (ISS): 10\n",
|
| 411 |
+
" • # of Days of Out-of-School-Suspension (OSS): 0\n",
|
| 412 |
+
" If I have any questions regarding my misconducts, I should schedule an appointment with the Dean of Discipline.\n",
|
| 413 |
+
" \n",
|
| 414 |
+
" ATTENDANCE - Do my actions r...\"\n",
|
| 415 |
+
"--------------------------------------------------\n"
|
| 416 |
+
]
|
| 417 |
+
}
|
| 418 |
+
],
|
| 419 |
+
"source": [
|
| 420 |
+
"display_recommendations(retrieved_interventions)"
|
| 421 |
+
]
|
| 422 |
},
|
| 423 |
{
|
| 424 |
"cell_type": "markdown",
|
| 425 |
+
"id": "254d4cdf",
|
| 426 |
"metadata": {},
|
|
|
|
| 427 |
"source": [
|
| 428 |
+
"## Explore the Live Demo!\n",
|
|
|
|
| 429 |
"\n",
|
| 430 |
+
"You've seen the step-by-step process of how our RAG system turns a student's story into an actionable, evidence-based plan. Now, it's time to try it yourself with any student scenario you can imagine!\n",
|
|
|
|
| 431 |
"\n",
|
| 432 |
+
"We have deployed this entire system as an interactive web application on Hugging Face Spaces. Click the link below to access the live demo—no setup or API key required.\n",
|
|
|
|
| 433 |
"\n",
|
|
|
|
|
|
|
|
|
|
|
|
|
| 434 |
"\n",
|
| 435 |
+
"#### [👉 Click Here to Launch the Live FOT Recommender API](https://huggingface.co/spaces/chuckfinca/fot-recommender-api)\n"
|
| 436 |
]
|
| 437 |
},
|
| 438 |
{
|
| 439 |
"cell_type": "code",
|
| 440 |
"execution_count": null,
|
| 441 |
+
"id": "64867bc5-2762-4c69-aa72-e4e7cf911019",
|
| 442 |
"metadata": {},
|
| 443 |
"outputs": [],
|
| 444 |
"source": []
|
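
Step 3 of the new notebook passes `k=3` and `min_similarity_score=0.4` to `search_interventions`. The implementation of that function is not part of this diff, so the following is only a sketch of the general idea: a plain-Python cosine-similarity search stands in for the FAISS index, and the function name `top_k_above_threshold` and the toy two-dimensional vectors are hypothetical. Results are filtered by a minimum score first, then the best `k` are kept.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_above_threshold(query_vec, chunk_vecs, k=3, min_similarity_score=0.4):
    """Return up to k (index, score) pairs whose score meets the threshold."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    kept = [(i, s) for i, s in scored if s >= min_similarity_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]


# Toy "embeddings": two chunks point roughly the same way as the query,
# one is orthogonal to it, and one points in the opposite direction.
chunks = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-1.0, 0.0]]
hits = top_k_above_threshold([1.0, 0.0], chunks, k=3)
for index, score in hits:
    print(f"chunk {index}: relevance {score:.2f}")
```

With a threshold in place, fewer than `k` results may come back. That would be consistent with the earlier notebook run in which the nonsense query `asdf` found 0 relevant interventions, although the diff does not show how the project applies the cutoff.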