chuckfinca committed on
Commit 10a77fa · 1 Parent(s): 8bfb8e4

feat: Rework PoC notebook for clarity and user experience

Refactors the PoC notebook to create a clear, step-by-step narrative for a non-technical audience. The new structure guides the user from the initial problem (the student profile) to the final, synthesized recommendation.

Key improvements:
- Adds a universal setup cell that clones the repo only when it is missing and installs packages via `sys.executable`, so `pip` targets the interpreter that is running the notebook.
- Silences common warnings (`TqdmWarning`, `TOKENIZERS_PARALLELISM`) to keep the cell output clean.
- Explains technical concepts through analogies and focuses the narrative on the value delivered to educators.
- Replaces the final interactive prompt with a direct call to action that links to the live Hugging Face Spaces demo, giving the notebook a more polished conclusion.
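
The setup approach in the first two bullets can be sketched roughly as follows. This is a minimal sketch, not the notebook's exact cell: it assumes a plain Python script rather than Jupyter `!` shell magics, and `build_pip_command` and `quiet_notebook_noise` are hypothetical helper names introduced only for illustration.

```python
import os
import subprocess
import sys
import warnings
from pathlib import Path

PROJECT_DIR = "fot-intervention-recommender"  # directory name used by the notebook


def build_pip_command(requirements: Path) -> list[str]:
    """Hypothetical helper: build a pip command bound to the running interpreter.

    `sys.executable -m pip` installs into the active environment instead of
    whichever `pip` happens to come first on PATH.
    """
    return [sys.executable, "-m", "pip", "install", "-q", "-r", str(requirements)]


def quiet_notebook_noise() -> None:
    """Hypothetical helper: suppress the two warnings named in the commit."""
    # Hugging Face tokenizers read this variable; "false" disables their
    # parallelism and the warning that goes with it.
    os.environ["TOKENIZERS_PARALLELISM"] = "false"
    try:
        from tqdm import TqdmWarning  # only importable when tqdm is installed
    except ImportError:
        return
    warnings.filterwarnings("ignore", category=TqdmWarning)


if __name__ == "__main__":
    quiet_notebook_noise()
    requirements = Path(PROJECT_DIR) / "requirements.txt"
    if requirements.is_file():
        # check=False: report a failed install through the exit code
        # instead of raising an exception.
        result = subprocess.run(build_pip_command(requirements), check=False)
        print("pip exit code:", result.returncode)
    else:
        print(f"{requirements} not found; clone the repository first.")
```

Checking the exit code explicitly may make a failed install (for example, an environment created without the `pip` module) easier to notice than a message printed in the middle of cell output.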

Files changed (1)
  1. notebooks/fot_recommender_poc.ipynb +288 -195
notebooks/fot_recommender_poc.ipynb CHANGED
@@ -2,161 +2,149 @@
2
  "cells": [
3
  {
4
  "cell_type": "markdown",
5
- "id": "944d2724-5cbb-4f2d-80f1-4deec31e4058",
6
  "metadata": {},
7
  "source": [
8
  "# Freshman On-Track (FOT) Intervention Recommender\n",
9
  "### A Standalone Proof-of-Concept\n",
10
  "\n",
11
- "This notebook demonstrates a working PoC for an AI-powered intervention recommender.\n",
12
  "\n",
13
- "**This notebook is designed to run in Google Colab.** It contains all the code needed to set up its environment, download the project from GitHub, and run the demonstration."
14
  ]
15
  },
16
  {
17
  "cell_type": "markdown",
18
- "id": "e4be1b92-95cc-421f-9820-9ccfc261aaeb",
19
  "metadata": {},
20
  "source": [
21
- "## 1. Universal Setup\n",
22
  "\n",
23
- "This cell is the \"magic\" that prepares the entire environment. It intelligently detects where it's running and performs the correct setup automatically.\n",
24
  "\n",
25
- "Here's what happens when you run the next cell:\n",
26
- "1. **Define Project Source**: We specify the official GitHub repository for this project so it's clear where the code comes from.\n",
27
- "2. **Detect Environment**: The notebook checks if it's running inside the local project folder or as a standalone file.\n",
28
- "3. **Prepare Environment**: A helper script is called to do the heavy lifting:\n",
29
- " - If **local**, it uses your existing project files.\n",
30
- " - If **standalone**, it clones the repository and installs all dependencies for you.\n",
31
- "\n",
32
- "After running this one cell, the environment will be ready for the demonstration."
33
  ]
34
  },
35
  {
36
  "cell_type": "code",
37
- "execution_count": 1,
38
- "id": "1f286cf0-3355-48ff-ade7-43a035db38ea",
39
  "metadata": {},
40
  "outputs": [
41
  {
42
  "name": "stdout",
43
  "output_type": "stream",
44
  "text": [
45
- "🚀 Setting up LOCAL development environment...\n",
46
- " - Using local project root: /Users/charlesfeinn/Developer/job_applications/fot-intervention-recommender\n",
47
- "\n",
48
- "🎉 Local environment is ready!\n"
49
  ]
50
  }
51
  ],
52
  "source": [
53
- "import sys\n",
54
  "from pathlib import Path\n",
 
55
  "\n",
56
- "# --- Define Project Source ---\n",
57
- "REPO_URL = \"https://github.com/chuckfinca/fot-intervention-recommender.git\"\n",
58
- "PROJECT_DIR_NAME = \"fot-intervention-recommender\"\n",
59
- "\n",
60
- "# print(\"🚀 Setting up the environment...\")\n",
61
- "\n",
62
- "# # --- Clone the Repository & Install Dependencies ---\n",
63
- "# !git clone -q {REPO_URL}\n",
64
- "# %pip install -q -r {PROJECT_DIR_NAME}/requirements.txt\n",
65
- "\n",
66
- "# # --- Configure Python Path ---\n",
67
- "# project_path = Path.cwd() / PROJECT_DIR_NAME\n",
68
- "# src_path = project_path / \"src\"\n",
69
- "# sys.path.insert(0, str(src_path))\n",
70
- "\n",
71
- "# print(\"\\n🎉 Environment is ready!\")\n",
72
  "\n",
 
 
 
 
 
73
  "\n",
74
- "print(\"🚀 Setting up LOCAL development environment...\")\n",
 
 
 
75
  "\n",
76
- "# We assume the notebook is in 'notebooks/'. The project root is one level up.\n",
77
- "project_path = Path.cwd().parent\n",
78
  "\n",
79
- "# Configure Python Path to use the local 'src' directory\n",
80
- "src_path = project_path / \"src\"\n",
81
- "if str(src_path) not in sys.path:\n",
82
- " sys.path.insert(0, str(src_path))\n",
83
- "\n",
84
- "print(f\" - Using local project root: {project_path}\")\n",
85
- "print(\"\\n🎉 Local environment is ready!\")"
86
  ]
87
  },
88
  {
89
  "cell_type": "markdown",
90
- "id": "c9b1ad1b-1c20-4eca-b98f-179ad80dc942",
91
  "metadata": {},
92
  "source": [
93
- "## 2. Load the Knowledge Base\n",
 
 
94
  "\n",
95
- "With the environment bootstrapped, we can now import our project's modules and load the data. The `project_path` variable ensures we find the file correctly."
96
  ]
97
  },
98
  {
99
  "cell_type": "code",
100
  "execution_count": 2,
101
- "id": "4143ee4b-c9f3-4d18-9d5b-0ee247937961",
102
  "metadata": {},
103
  "outputs": [
104
- {
105
- "name": "stderr",
106
- "output_type": "stream",
107
- "text": [
108
- "/Users/charlesfeinn/Developer/job_applications/fot-intervention-recommender/.venv/lib/python3.12/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
109
- " from .autonotebook import tqdm as notebook_tqdm\n"
110
- ]
111
- },
112
- {
113
- "name": "stdout",
114
- "output_type": "stream",
115
- "text": [
116
- "Successfully loaded 27 intervention chunks.\n"
117
- ]
118
- },
119
  {
120
  "data": {
 
 
 
 
121
  "text/plain": [
122
- "{'title': 'Strategy: Leadership Roles',\n",
123
- " 'source_document': 'NCS_OTToolkit_2ndEd_October_2017_updated.pdf',\n",
124
- " 'fot_pages': 'Pages: 44',\n",
125
- " 'content_for_embedding': 'Title: Strategy: Leadership Roles. Content: Principal Role:\\n• Implementation: Reviews and interrogates interim freshman success-related data in light of Success Team goals, and strategizes with team leadership around next steps',\n",
126
- " 'original_content': 'Principal Role:\\n• Implementation: Reviews and interrogates interim freshman success-related data in light of Success Team goals, and strategizes with team leadership around next steps'}"
127
  ]
128
  },
129
- "execution_count": 2,
130
  "metadata": {},
131
- "output_type": "execute_result"
132
  }
133
  ],
134
  "source": [
135
- "# Import the functions from our custom Python package (now in the path)\n",
136
- "from fot_recommender.rag_pipeline import (\n",
137
- " load_knowledge_base,\n",
138
- " initialize_embedding_model,\n",
139
- " create_embeddings,\n",
140
- " create_vector_db,\n",
141
- " search_interventions,\n",
142
- ")\n",
143
  "\n",
144
- "# Build the path to the knowledge base using the universal project_path variable\n",
145
- "kb_path = project_path / \"data\" / \"processed\" / \"knowledge_base_final_chunks.json\"\n",
 
 
 
 
 
146
  "\n",
147
- "# Load the knowledge base\n",
148
- "knowledge_base_chunks = load_knowledge_base(str(kb_path))\n",
149
  "\n",
150
- "print(f\"Successfully loaded {len(knowledge_base_chunks)} intervention chunks.\")\n",
151
- "knowledge_base_chunks[0]"
152
  ]
153
  },
154
  {
155
  "cell_type": "code",
156
- "execution_count": 4,
157
- "id": "0d3e673f-17db-4308-991a-5f5b12ffb104",
158
  "metadata": {},
159
  "outputs": [
160
  {
161
  "name": "stdout",
162
  "output_type": "stream",
@@ -170,9 +158,9 @@
170
  "name": "stderr",
171
  "output_type": "stream",
172
  "text": [
173
- "Batches: 0%| | 0/1 [00:00<?, ?it/s]/Users/charlesfeinn/Developer/job_applications/fot-intervention-recommender/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py:1520: FutureWarning: `encoder_attention_mask` is deprecated and will be removed in version 4.55.0 for `BertSdpaSelfAttention.forward`.\n",
174
  " return forward_call(*args, **kwargs)\n",
175
- "Batches: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:02<00:00, 2.03s/it]"
176
  ]
177
  },
178
  {
@@ -181,171 +169,276 @@
181
  "text": [
182
  "Embeddings created successfully.\n",
183
  "Creating FAISS index with dimension 384...\n",
184
- "FAISS index created with 27 vectors.\n"
 
 
 
185
  ]
186
  },
187
  {
188
- "name": "stderr",
189
- "output_type": "stream",
190
- "text": [
191
- "\n"
192
- ]
193
  }
194
  ],
195
  "source": [
196
- "# --- Build the RAG Pipeline Components ---\n",
197
- "#\n",
198
- "# Now, we will initialize the core components of our RAG pipeline.\n",
199
- "# 1. Embedding Model: We'll load the model that converts text into vectors.\n",
200
- "# 2. Vector Embeddings: We'll encode all our knowledge base chunks.\n",
201
- "# 3. Vector Database: We'll create a FAISS index for fast searching.\n",
202
- "#\n",
203
- "# These components will be stored in variables for the rest of the notebook to use.\n",
204
- "\n",
205
- "# 1. Initialize the embedding model\n",
206
  "embedding_model = initialize_embedding_model()\n",
207
  "\n",
208
- "# 2. Create vector embeddings for the knowledge base\n",
209
  "embeddings = create_embeddings(knowledge_base_chunks, embedding_model)\n",
210
  "\n",
211
- "# 3. Set up the FAISS vector database\n",
212
- "vector_db = create_vector_db(embeddings)"
 
 
213
  ]
214
  },
215
  {
216
  "cell_type": "markdown",
217
- "id": "c906f11b-9363-4181-91f4-9cb899630caa",
218
  "metadata": {},
219
  "source": [
220
- "## 5. Try It Yourself: Enter Your Own Query\n",
221
  "\n",
222
- "Now it's your turn. The system is ready to accept a new query.\n",
223
  "\n",
224
- "Describe the challenges of a hypothetical student in the text box below. For example, you could try:\n",
225
- "- \"A student is missing a lot of school and their grades are suffering.\"\n",
226
- "- \"This freshman has good attendance but is failing math and science and seems disengaged.\"\n",
227
- "- \"A student has multiple behavior incidents and is struggling to connect with teachers.\"\n",
228
  "\n",
229
- "The RAG system will perform a new semantic search and return the top 3 interventions from the knowledge base that best match your description."
230
  ]
231
  },
232
  {
233
  "cell_type": "code",
234
- "execution_count": 5,
235
- "id": "997c358c-3c9c-486e-88f8-1c032a2ed146",
236
  "metadata": {},
237
  "outputs": [
238
- {
239
- "name": "stdin",
240
- "output_type": "stream",
241
- "text": [
242
- "Enter a description of a student's challenges: asdf\n"
243
- ]
244
- },
245
  {
246
  "name": "stdout",
247
  "output_type": "stream",
248
  "text": [
249
  "\n",
250
- "🔍 Searching for interventions based on your query...\n",
251
- "\n",
252
- "Searching for top 3 interventions for query: 'asdf...'\n",
253
- "Found 0 relevant interventions.\n",
254
- "\n",
255
- "No relevant interventions were found for this query.\n"
256
  ]
257
  }
258
  ],
259
  "source": [
260
- "from fot_recommender.utils import display_recommendations\n",
261
- "\n",
262
- "# Prompt the user to enter their own query\n",
263
- "user_query = input(\"Enter a description of a student's challenges: \")\n",
264
  "\n",
265
- "if user_query:\n",
266
- " print(\"\\n🔍 Searching for interventions based on your query...\")\n",
 
267
  "\n",
268
- " # Perform a new search using the user's input\n",
269
- " custom_recommendations = search_interventions(\n",
270
- " query=user_query,\n",
271
- " model=embedding_model,\n",
272
- " index=vector_db,\n",
273
- " knowledge_base=knowledge_base_chunks,\n",
274
- " k=3,\n",
 
 
275
  " )\n",
276
  "\n",
277
- " # Display the new results using our helper function\n",
278
- " display_recommendations(custom_recommendations)\n",
279
- "\n",
280
- "else:\n",
281
- " print(\"\\nNo query entered. Skipping custom search.\")"
282
  ]
283
  },
284
  {
285
  "cell_type": "markdown",
286
- "id": "142c44e7-b75b-46c7-9267-996e44054529",
287
  "metadata": {},
288
- "source": []
 
 
 
 
289
  },
290
  {
291
  "cell_type": "code",
292
- "execution_count": null,
293
- "id": "f7977b72-30a8-4146-b420-d0adb824ab99",
294
  "metadata": {},
295
- "outputs": [],
296
- "source": []
297
  },
298
  {
299
  "cell_type": "markdown",
300
- "id": "7b6c8adb-df2f-4b7b-b09a-88058d0cd785",
301
- "metadata": {},
302
- "source": []
303
- },
304
- {
305
- "cell_type": "code",
306
- "execution_count": null,
307
- "id": "665cb647-97da-441f-81b0-ae7b908fdd2f",
308
- "metadata": {},
309
- "outputs": [],
310
- "source": []
311
- },
312
- {
313
- "cell_type": "code",
314
- "execution_count": null,
315
- "id": "92729636-bd91-4b0c-ac35-5ed82797a1f2",
316
  "metadata": {},
317
- "outputs": [],
318
  "source": [
319
- "import shutil\n",
320
- "from pathlib import Path\n",
321
  "\n",
322
- "# The path to the project directory we created at the start\n",
323
- "project_path_to_clean = Path.cwd() / \"fot-recommender-poc-workspace\"\n",
324
  "\n",
325
- "if project_path_to_clean.exists():\n",
326
- " print(f\"The project directory '{project_path_to_clean}' was found.\")\n",
327
  "\n",
328
- " # Ask for user confirmation before deleting anything\n",
329
- " response = input(\n",
330
- " \"Would you like to delete the git repository folder that was downloaded during the running of this notebook? (y/n): \"\n",
331
- " )\n",
332
  "\n",
333
- " if response.lower().strip() == \"y\":\n",
334
- " try:\n",
335
- " shutil.rmtree(project_path_to_clean)\n",
336
- " print(f\"✅ Successfully deleted '{project_path_to_clean}'.\")\n",
337
- " except OSError as e:\n",
338
- " print(f\"Error: {e.strerror}. Could not delete the directory.\")\n",
339
- " else:\n",
340
- " print(\"Cleanup skipped.\")\n",
341
- "else:\n",
342
- " print(\"Project directory not found. Nothing to clean up.\")"
343
  ]
344
  },
345
  {
346
  "cell_type": "code",
347
  "execution_count": null,
348
- "id": "085de4e9-e7d4-4c87-892b-711765a7d8a1",
349
  "metadata": {},
350
  "outputs": [],
351
  "source": []
 
2
  "cells": [
3
  {
4
  "cell_type": "markdown",
5
+ "id": "6508e7df",
6
  "metadata": {},
7
  "source": [
8
  "# Freshman On-Track (FOT) Intervention Recommender\n",
9
  "### A Standalone Proof-of-Concept\n",
10
  "\n",
11
+ "**Goal:** To show, in a few simple steps, how we can turn a description of a struggling student into a set of clear, actionable, and evidence-based recommendations for an educator.\n",
12
  "\n",
13
+ "This notebook demonstrates the core Retrieval-Augmented Generation (RAG) pipeline that powers our recommender."
14
  ]
15
  },
16
  {
17
  "cell_type": "markdown",
18
+ "id": "62c68815",
19
  "metadata": {},
20
  "source": [
21
+ "## Step 1: Setting Up the Environment\n",
22
  "\n",
23
+ "First, we need to load the project's code and install its dependencies. This cell prepares the notebook to run our custom logic.\n",
24
  "\n",
25
+ "*(This notebook is designed to run in Google Colab, but the code below will also adapt to a local environment if the project files are present.)*"
 
 
 
 
 
 
 
26
  ]
27
  },
28
  {
29
  "cell_type": "code",
30
+ "execution_count": 8,
31
+ "id": "97f37783",
32
  "metadata": {},
33
  "outputs": [
34
  {
35
  "name": "stdout",
36
  "output_type": "stream",
37
  "text": [
38
+ "📦 Setting up the environment...\n",
39
+ "/Users/charlesfeinn/Developer/job_applications/fot-intervention-recommender/.venv/bin/python3: No module named pip\n",
40
+ "✅ Environment is ready!\n"
 
41
  ]
42
  }
43
  ],
44
  "source": [
45
+ "import sys, os, warnings\n",
46
  "from pathlib import Path\n",
47
+ "from tqdm import TqdmWarning\n",
48
  "\n",
49
+ "# This prevents common, harmless warnings from cluttering the output.\n",
50
+ "os.environ[\"TOKENIZERS_PARALLELISM\"] = \"false\"\n",
51
+ "warnings.filterwarnings(\"ignore\", category=TqdmWarning)\n",
52
  "\n",
53
+ "# Clones the project from GitHub, but only if it doesn't already exist.\n",
54
+ "PROJECT_DIR = \"fot-intervention-recommender\"\n",
55
+ "if not Path(PROJECT_DIR).is_dir():\n",
56
+ " print(\"🚀 Downloading project files...\")\n",
57
+ " !git clone -q https://github.com/chuckfinca/fot-intervention-recommender.git\n",
58
  "\n",
59
+ "# Installs packages and adds the project's code to our Python path.\n",
60
+ "print(\"📦 Setting up the environment...\")\n",
61
+ "!{sys.executable} -m pip install -q -r {PROJECT_DIR}/requirements.txt\n",
62
+ "sys.path.insert(0, str(Path(PROJECT_DIR) / \"src\"))\n",
63
  "\n",
64
+ "# Define the project_path variable needed by the rest of the notebook\n",
65
+ "project_path = Path(PROJECT_DIR)\n",
66
  "\n",
67
+ "print(\"✅ Environment is ready!\")"
 
 
 
 
 
 
68
  ]
69
  },
70
  {
71
  "cell_type": "markdown",
72
+ "id": "f859223d",
73
  "metadata": {},
74
  "source": [
75
+ "## Step 2: Define the Student (The Input)\n",
76
+ "\n",
77
+ "Everything starts with a student. Our system takes a simple narrative summary that an educator might write. This summary describes the student's challenges in plain English. \n",
78
  "\n",
79
+ "Let's use the sample profile from the project description."
80
  ]
81
  },
82
  {
83
  "cell_type": "code",
84
  "execution_count": 2,
85
+ "id": "3784865f",
86
  "metadata": {},
87
  "outputs": [
88
  {
89
  "data": {
90
+ "text/markdown": [
91
+ "**Student Query:**\n",
92
+ "> This student is struggling to keep up with coursework, having failed one core class and earning only 2.5 credits out of 4 credits expected for the semester. Attendance is becoming a concern at 88% for an average annual target of 90%, and they have had one behavioral incident. The student needs targeted academic and attendance support to get back on track for graduation."
93
+ ],
94
  "text/plain": [
95
+ "<IPython.core.display.Markdown object>"
 
 
 
 
96
  ]
97
  },
 
98
  "metadata": {},
99
+ "output_type": "display_data"
100
  }
101
  ],
102
  "source": [
103
+ "from IPython.display import display, Markdown\n",
 
 
 
 
 
 
 
104
  "\n",
105
+ "student_profile = {\n",
106
+ " \"narrative_summary\": \"This student is struggling to keep up with coursework, \"\n",
107
+ " \"having failed one core class and earning only 2.5 credits out of 4 credits \"\n",
108
+ " \"expected for the semester. Attendance is becoming a concern at 88% for an average \"\n",
109
+ " \"annual target of 90%, and they have had one behavioral incident. \"\n",
110
+ " \"The student needs targeted academic and attendance support to get back on track for graduation.\"\n",
111
+ "}\n",
112
  "\n",
113
+ "student_query = student_profile[\"narrative_summary\"]\n",
 
114
  "\n",
115
+ "display(Markdown(f\"**Student Query:**\\n> {student_query}\"))"
116
+ ]
117
+ },
118
+ {
119
+ "cell_type": "markdown",
120
+ "id": "530552a8",
121
+ "metadata": {},
122
+ "source": [
123
+ "## Step 3: Find Relevant Strategies (The \"Retrieval\" Step)\n",
124
+ "\n",
125
+ "Now, we take the student's story and find the most relevant strategies from our **Knowledge Base**—a curated library of best practices and proven interventions.\n",
126
+ "\n",
127
+ "How do we do this? \n",
128
+ "1. We've already converted our knowledge base documents into **vector embeddings** (unique digital fingerprints that capture meaning).\n",
129
+ "2. We use a **FAISS vector database**—a super-fast search index—to instantly find the documents with fingerprints most similar to the student's situation.\n",
130
+ "\n",
131
+ "Let's see which top 3 strategies our system retrieves for this student."
132
  ]
133
  },
134
  {
135
  "cell_type": "code",
136
+ "execution_count": 3,
137
+ "id": "8cb679b9",
138
  "metadata": {},
139
  "outputs": [
140
+ {
141
+ "name": "stderr",
142
+ "output_type": "stream",
143
+ "text": [
144
+ "/Users/charlesfeinn/Developer/job_applications/fot-intervention-recommender/.venv/lib/python3.12/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
145
+ " from .autonotebook import tqdm as notebook_tqdm\n"
146
+ ]
147
+ },
148
  {
149
  "name": "stdout",
150
  "output_type": "stream",
 
158
  "name": "stderr",
159
  "output_type": "stream",
160
  "text": [
161
+ "Batches: 0%| | 0/1 [00:00<?, ?it/s]/Users/charlesfeinn/Developer/job_applications/fot-intervention-recommender/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py:1520: FutureWarning: `encoder_attention_mask` is deprecated and will be removed in version 4.55.0 for `BertSdpaSelfAttention.forward`.\n",
162
  " return forward_call(*args, **kwargs)\n",
163
+ "Batches: 100%|████████████████████████████████████| 1/1 [00:02<00:00, 2.08s/it]\n"
164
  ]
165
  },
166
  {
 
169
  "text": [
170
  "Embeddings created successfully.\n",
171
  "Creating FAISS index with dimension 384...\n",
172
+ "FAISS index created with 27 vectors.\n",
173
+ "\n",
174
+ "Searching for top 3 interventions for query: 'This student is struggling to keep up with coursework, having failed one core cl...'\n",
175
+ "Found 3 relevant interventions.\n"
176
  ]
177
  },
178
  {
179
+ "data": {
180
+ "text/markdown": [
181
+ "**Top 3 Retrieved Strategies:**"
182
+ ],
183
+ "text/plain": [
184
+ "<IPython.core.display.Markdown object>"
185
+ ]
186
+ },
187
+ "metadata": {},
188
+ "output_type": "display_data"
189
+ },
190
+ {
191
+ "data": {
192
+ "text/markdown": [
193
+ "- **Strategy: Differentiating Intervention Tiers** (Source: *NCS_OTToolkit_2ndEd_October_2017_updated.pdf*, Relevance: 0.57)"
194
+ ],
195
+ "text/plain": [
196
+ "<IPython.core.display.Markdown object>"
197
+ ]
198
+ },
199
+ "metadata": {},
200
+ "output_type": "display_data"
201
+ },
202
+ {
203
+ "data": {
204
+ "text/markdown": [
205
+ "- **Tool: Intervention Tracking** (Source: *NCS_OTToolkit_2ndEd_October_2017_updated.pdf*, Relevance: 0.54)"
206
+ ],
207
+ "text/plain": [
208
+ "<IPython.core.display.Markdown object>"
209
+ ]
210
+ },
211
+ "metadata": {},
212
+ "output_type": "display_data"
213
+ },
214
+ {
215
+ "data": {
216
+ "text/markdown": [
217
+ "- **Tool: BAG Report (Example)** (Source: *NCS_OTToolkit_2ndEd_October_2017_updated.pdf*, Relevance: 0.53)"
218
+ ],
219
+ "text/plain": [
220
+ "<IPython.core.display.Markdown object>"
221
+ ]
222
+ },
223
+ "metadata": {},
224
+ "output_type": "display_data"
225
  }
226
  ],
227
  "source": [
228
+ "# Import the necessary functions from our project's code\n",
229
+ "from fot_recommender.rag_pipeline import (\n",
230
+ " load_knowledge_base,\n",
231
+ " initialize_embedding_model,\n",
232
+ " create_embeddings,\n",
233
+ " create_vector_db,\n",
234
+ " search_interventions,\n",
235
+ " generate_recommendation_summary\n",
236
+ ")\n",
237
+ "from fot_recommender.utils import display_recommendations\n",
238
+ "\n",
239
+ "# --- Load all the components of our RAG system ---\n",
240
+ "\n",
241
+ "# 1. Load the chunked knowledge base\n",
242
+ "kb_path = project_path / \"data\" / \"processed\" / \"knowledge_base_final_chunks.json\"\n",
243
+ "knowledge_base_chunks = load_knowledge_base(str(kb_path))\n",
244
+ "\n",
245
+ "# 2. Initialize the embedding model\n",
246
  "embedding_model = initialize_embedding_model()\n",
247
  "\n",
248
+ "# 3. Create embeddings and the vector database\n",
249
  "embeddings = create_embeddings(knowledge_base_chunks, embedding_model)\n",
250
+ "vector_db = create_vector_db(embeddings)\n",
251
+ "\n",
252
+ "# --- Perform the search! ---\n",
253
+ "retrieved_interventions = search_interventions(\n",
254
+ " query=student_query,\n",
255
+ " model=embedding_model,\n",
256
+ " index=vector_db,\n",
257
+ " knowledge_base=knowledge_base_chunks,\n",
258
+ " k=3,\n",
259
+ " min_similarity_score=0.4\n",
260
+ ")\n",
261
  "\n",
262
+ "# Display the titles of what we found\n",
263
+ "display(Markdown(\"**Top 3 Retrieved Strategies:**\"))\n",
264
+ "for chunk, score in retrieved_interventions:\n",
265
+ " display(Markdown(f\"- **{chunk['title']}** (Source: *{chunk['source_document']}*, Relevance: {score:.2f})\"))"
266
  ]
267
  },
268
  {
269
  "cell_type": "markdown",
270
+ "id": "2202209d",
271
  "metadata": {},
272
  "source": [
273
+ "## Step 4: Create the Recommendation (The \"Generation\" Step)\n",
274
  "\n",
275
+ "Finding the right documents is only half the battle. Raw research isn't very helpful to a busy teacher. \n",
276
  "\n",
277
+ "In this final step, we use a powerful Large Language Model (Google's Gemini API) to act as an expert instructional coach. We give it the student's story and the relevant strategies we just retrieved. The AI's job is to **synthesize** this information into a concise, practical recommendation tailored specifically for a teacher.\n",
 
 
 
278
  "\n",
279
+ "This is the final output of our system."
280
  ]
281
  },
282
  {
283
  "cell_type": "code",
284
+ "execution_count": 4,
285
+ "id": "62ee35bc",
286
  "metadata": {},
287
  "outputs": [
 
 
 
 
 
 
 
288
  {
289
  "name": "stdout",
290
  "output_type": "stream",
291
  "text": [
292
  "\n",
293
+ "Synthesizing recommendation for persona: 'teacher' using Gemini...\n",
294
+ "Synthesis complete.\n"
 
 
 
 
295
  ]
296
+ },
297
+ {
298
+ "data": {
299
+ "text/markdown": [
300
+ "### Final Synthesized Recommendation for the Teacher"
301
+ ],
302
+ "text/plain": [
303
+ "<IPython.core.display.Markdown object>"
304
+ ]
305
+ },
306
+ "metadata": {},
307
+ "output_type": "display_data"
308
+ },
309
+ {
310
+ "data": {
311
+ "text/markdown": [
312
+ "This student is experiencing academic difficulty, reflected in a 2.5 GPA and a failing grade in one core class, coupled with attendance concerns (88% attendance versus a 90% target) and one behavioral incident. To address these challenges and support the student's path to graduation, the following interventions are recommended:\n",
313
+ "\n",
314
+ "\n",
315
+ "**1. Implement a Tiered Intervention Strategy:** Determine the extent to which attendance is contributing to the student's academic struggles. (\"Strategy: Differentiating Intervention Tiers\"). If attendance is a significant factor, refer the student to the appropriate support services (Success Team or Attendance Dean, as indicated by the BAG Report format) to address these issues directly. This allows more focused support from the teaching staff for academic interventions.\n",
316
+ "\n",
317
+ "**2. Utilize a Robust Intervention Tracking System:** Implement a system to monitor the student's progress, focusing on attendance, GPA, and behavior. (\"Tool: Intervention Tracking\"). This system should clearly document interventions (e.g., tutoring sessions, mentorship meetings), and track the student’s progress in each core course (GPA and attendance rates) at two checkpoints within a ten-week period. The \"BAG Report\" format provides a useful template to track behavior, attendance and grades. This data will inform adjustments to the support plan.\n",
318
+ "\n",
319
+ "**3. Regularly Review the Student's \"BAG Report\" (or Equivalent):** Use a reporting mechanism (such as the BAG report example) to regularly review the student's performance across all three key areas: Behavior, Attendance, and Grades. This visual representation highlights areas of strength and areas requiring immediate intervention, allowing for proactive adjustments to support strategies. This aligns with the recommendation to monitor multiple key performance indicators to improve student outcomes effectively.\n"
320
+ ],
321
+ "text/plain": [
322
+ "<IPython.core.display.Markdown object>"
323
+ ]
324
+ },
325
+ "metadata": {},
326
+ "output_type": "display_data"
327
  }
328
  ],
329
  "source": [
330
+ "from dotenv import load_dotenv\n",
 
 
 
331
  "\n",
332
+ "# Load the API key from a .env file (if it exists)\n",
333
+ "load_dotenv(project_path / '.env') \n",
334
+ "api_key = os.getenv(\"FOT_GOOGLE_API_KEY\")\n",
335
  "\n",
336
+ "if not api_key:\n",
337
+ " print(\"✋ FOT_GOOGLE_API_KEY not found. Please provide your Google API key to generate the summary.\")\n",
338
+ " final_recommendation = \"(API Key not provided - could not generate summary)\"\n",
339
+ "else:\n",
340
+ " final_recommendation = generate_recommendation_summary(\n",
341
+ " retrieved_chunks=retrieved_interventions,\n",
342
+ " student_narrative=student_query,\n",
343
+ " api_key=api_key,\n",
344
+ " persona=\"teacher\"\n",
345
  " )\n",
346
  "\n",
347
+ "display(Markdown(\"### Final Synthesized Recommendation for the Teacher\"))\n",
348
+ "display(Markdown(final_recommendation))"
 
 
 
349
  ]
350
  },
351
  {
352
  "cell_type": "markdown",
353
+ "id": "d3718297",
354
  "metadata": {},
355
+ "source": [
356
+ "## Bonus: See the Evidence\n",
357
+ "\n",
358
+ "The recommendation above isn't just made up—it's directly grounded in the documents we retrieved. Here are the specific text snippets that the AI used to create its summary. This ensures our recommendations are always transparent and evidence-based."
359
+ ]
360
  },
361
  {
362
  "cell_type": "code",
363
+ "execution_count": 5,
364
+ "id": "1b0cb720",
365
  "metadata": {},
366
+ "outputs": [
367
+ {
368
+ "name": "stdout",
369
+ "output_type": "stream",
370
+ "text": [
371
+ "\n",
372
+ "--- Top Recommended Interventions ---\n",
373
+ "\n",
374
+ "--- Recommendation 1 (Similarity Score: 0.5735) ---\n",
375
+ " Title: Strategy: Differentiating Intervention Tiers\n",
376
+ " Source: NCS_OTToolkit_2ndEd_October_2017_updated.pdf (Pages: 46)\n",
377
+ " \n",
378
+ " Content Snippet:\n",
379
+ " \"To what degree is attendance playing a role in student performance? To whom do you refer Tier 3 students who have serious attendance issues (inside and outside of the school) so that the Success Team can really concentrate on supporting Tier 2 students?...\"\n",
380
+ "--------------------------------------------------\n",
381
+ "\n",
382
+ "--- Recommendation 2 (Similarity Score: 0.5416) ---\n",
383
+ " Title: Tool: Intervention Tracking\n",
384
+ " Source: NCS_OTToolkit_2ndEd_October_2017_updated.pdf (Pages: 49)\n",
385
+ " \n",
386
+ " Content Snippet:\n",
387
+ " \"Features of Good Intervention Tracking Tools:\n",
388
+ " • Name of the intervention and what key performance indicator it addresses (attendance, point-in-time On-Track rates, GPA, behavior metric, etc.)\n",
389
+ " • Names of the targeted students\n",
390
+ " ° If tracking grades, include each core course's average expressed as a percentage\n",
391
+ " • Intervention contacts/implementation evidence\n",
392
+ " ° Tutoring attendance\n",
393
+ " ° Mentorship contact dates\n",
394
+ " ° \"Office hours\" visits\n",
395
+ " • Point-in-time progress on the key performance...\"\n",
396
+ "--------------------------------------------------\n",
397
+ "\n",
398
+ "--- Recommendation 3 (Similarity Score: 0.5328) ---\n",
399
+ " Title: Tool: BAG Report (Example)\n",
400
+ " Source: NCS_OTToolkit_2ndEd_October_2017_updated.pdf (Pages: 61)\n",
401
+ " \n",
402
+ " Content Snippet:\n",
403
+ " \"Student: Keith\n",
404
+ " Grade Level: 9\n",
405
+ " 8th Period Teacher: Donson\n",
406
+ " The numbers below reflect totals through Semester 1\n",
407
+ " \n",
408
+ " BEHAVIOR - In what ways do I contribute to a Safe and Respectful school climate?\n",
409
+ " • # of Infractions (# of Major Infractions): 5 (1)\n",
410
+ " • # of Days of In-School-Suspension (ISS): 10\n",
411
+ " • # of Days of Out-of-School-Suspension (OSS): 0\n",
412
+ " If I have any questions regarding my misconducts, I should schedule an appointment with the Dean of Discipline.\n",
413
+ " \n",
414
+ " ATTENDANCE - Do my actions r...\"\n",
415
+ "--------------------------------------------------\n"
416
+ ]
417
+ }
418
+ ],
419
+ "source": [
420
+ "display_recommendations(retrieved_interventions)"
421
+ ]
422
  },
423
  {
424
  "cell_type": "markdown",
425
+ "id": "254d4cdf",
 
426
  "metadata": {},
 
427
  "source": [
428
+ "## Explore the Live Demo!\n",
 
429
  "\n",
430
+ "You've seen the step-by-step process of how our RAG system turns a student's story into an actionable, evidence-based plan. Now, it's time to try it yourself with any student scenario you can imagine!\n",
 
431
  "\n",
432
+ "We have deployed this entire system as an interactive web application on Hugging Face Spaces. Click the link below to access the live demo—no setup or API key required.\n",
 
433
  "\n",
 
 
 
 
434
  "\n",
435
+ "#### [👉 Click Here to Launch the Live FOT Recommender API](https://huggingface.co/spaces/chuckfinca/fot-recommender-api)\n"
436
  ]
437
  },
438
  {
439
  "cell_type": "code",
440
  "execution_count": null,
441
+ "id": "64867bc5-2762-4c69-aa72-e4e7cf911019",
442
  "metadata": {},
443
  "outputs": [],
444
  "source": []