	# BPS Eligible School Tool

	## Overview

	This tool lets an LLM or backend service find eligible Boston Public Schools for a student based on:

	- grade level
	- home address
	- ZIP code
	- city/state
	- home language

	It uses the Boston Explore / Avela eligibility API to get the list of ineligible schools, then joins that result with our local school catalog and returns the remaining eligible BPS schools.

	In other words:

	```text
	eligible schools = candidate BPS schools for the grade - ineligible schools returned by Avela
	```

	---

	## Why this tool exists

	The original `check_eligibility.py` script successfully called the Avela API, but it returned only the API's `ineligibleSchools` list.

	That was not enough for the chatbot pipeline, because the chatbot needs a list of schools the student can actually apply to.

	The new tool solves that by:

	1. calling the eligibility API
	2. loading the local school data
	3. filtering to Boston Public Schools only
	4. filtering to schools that serve the requested grade
	5. subtracting ineligible schools
	6. returning the final eligible school list

	---

	## Main file

	The current tool implementation lives in:

	```text
	check_eligibility_tool.py
	```

	---

	## Main function

	```python
from typing import Any


def find_eligible_schools(
    grade_level: str,
    street_address: str,
    zip_code: str,
    city: str = "Boston",
    state: str = "MA",
    street_address_line2: str = "",
    home_language: str = "English",
    dataset_path: str = "raw_data/choice_tool_raw.json",
    include_ineligible: bool = False,
) -> dict[str, Any]:
    ...  # signature only; the implementation lives in check_eligibility_tool.py
	```

	### What it returns

	A dictionary with fields like:

	```json
{
  "enrollment_period_name": "2026-2027 School Year",
  "eligible_schools": [...],
  "eligible_count": 16,
  "candidate_school_count": 73,
  "ineligible_count": 913,
  "matched_ineligible_count": 57,
  "error": null
}
	```

	### Important return fields

	- `eligible_schools`: the final list the chatbot should use
	- `eligible_count`: number of eligible schools returned
	- `candidate_school_count`: number of BPS schools in the local dataset that serve the requested grade
	- `ineligible_count`: total number of ineligible providers returned by Avela
	- `matched_ineligible_count`: number of those ineligible providers that matched candidate BPS schools in our local dataset
	- `error`: error message if something failed

	---

	## Tool definition for LLM function calling

	The file exposes:

	```python
	TOOL_DEFINITION
	```

	This is the function schema the LLM can use to call the tool.

	Tool name:

	```text
	find_eligible_schools
	```

	Expected user-facing parameters:

	- `grade_level`
	- `street_address`
	- `zip_code`
	- `city`
	- `state`
	- `street_address_line2`
	- `home_language`

	Note: `dataset_path` should stay internal and should not be exposed to the model in production.
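
One way to keep `dataset_path` internal is to drop it from model-supplied arguments before dispatch. The following is a minimal sketch, not code from `check_eligibility_tool.py`: the `sanitize_llm_args` helper and the `LLM_VISIBLE_PARAMS` constant are hypothetical names, and the allow-list simply mirrors the user-facing parameters listed above.

```python
from typing import Any

# Parameters the model is allowed to set (mirrors the user-facing list above).
LLM_VISIBLE_PARAMS = frozenset({
    "grade_level",
    "street_address",
    "zip_code",
    "city",
    "state",
    "street_address_line2",
    "home_language",
})


def sanitize_llm_args(args: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of args containing only LLM-visible parameters.

    Hypothetical helper: anything else (such as dataset_path) is dropped,
    so the backend default is used instead of a model-supplied value.
    """
    return {key: value for key, value in args.items() if key in LLM_VISIBLE_PARAMS}


raw_args = {
    "grade_level": "K2",
    "street_address": "2300 Washington St",
    "zip_code": "02119",
    "dataset_path": "/tmp/other.json",  # should never come from the model
}
clean_args = sanitize_llm_args(raw_args)
```

An allow-list is used here rather than a deny-list so that any internal parameter added later stays hidden by default.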

	---

	## Minimal dispatcher

	The file also exposes:

	```python
	handle_tool_call(function_name: str, args: dict[str, Any])
	```

	This makes it easy to plug into a future chatbot tool handler.

	Example:

	```python
args = {
    "grade_level": "K2",
    "street_address": "2300 Washington St",
    "zip_code": "02119",
    "city": "Boston",
    "state": "MA",
    "home_language": "English",
}

	result = handle_tool_call("find_eligible_schools", args)
	```

	---

	## Data sources used

	### 1. Avela API

	Used to determine which schools are ineligible for the student.

	Endpoints used:

	- `GET /enrollmentPeriods`
	- `POST /formTemplates/{FORM_TEMPLATE_ID}/findEligibility`

	### 2. Local dataset

	Used to map API references back to school metadata and produce the final eligible list.

	Current catalog path:

	```text
	raw_data/choice_tool_raw.json
	```

	This dataset includes more than just BPS schools, so the tool filters to:

	```python
	provider_type == "Boston Public School"
	```
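
The candidate filter could look roughly like the sketch below. Only the `provider_type` field and its `"Boston Public School"` value come from this document; the `grades` field name, the record shape, and the `filter_candidates` helper are assumptions made for illustration.

```python
from typing import Any

BPS_PROVIDER_TYPE = "Boston Public School"


def filter_candidates(
    catalog: list[dict[str, Any]], grade_level: str
) -> list[dict[str, Any]]:
    """Keep BPS schools that serve grade_level.

    Assumes each record has a "provider_type" string and a "grades" list;
    the "grades" field name is hypothetical.
    """
    return [
        school
        for school in catalog
        if school.get("provider_type") == BPS_PROVIDER_TYPE
        and grade_level in school.get("grades", [])
    ]


# Made-up records; "Other Provider" is a placeholder label, not a real catalog value.
sample_catalog = [
    {"id": "a1", "provider_type": "Boston Public School", "grades": ["K2", "1"]},
    {"id": "b2", "provider_type": "Boston Public School", "grades": ["9", "10"]},
    {"id": "c3", "provider_type": "Other Provider", "grades": ["K2"]},
]
candidates = filter_candidates(sample_catalog, "K2")
```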

	---

	## Eligibility logic

	The tool works in this order:

	1. validate grade and language inputs
	2. fetch enrollment periods
	3. build the eligibility API payload
	4. call Avela's `findEligibility`
	5. collect `referenceId` values from `ineligibleSchools`
	6. load the local catalog
	7. filter the catalog to BPS schools that serve the requested grade
	8. subtract matching ineligible school IDs
	9. return the remaining schools as eligible

	### Set logic

	```text
	candidate_ids = all BPS school IDs for the requested grade
	matched_ineligible_ids = candidate_ids ∩ Avela_ineligible_ids
	eligible_ids = candidate_ids - matched_ineligible_ids
	```
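
The same set logic in Python, as a small sketch. The `split_candidates` helper and the sample IDs are illustrative only and are not taken from the tool's source.

```python
def split_candidates(
    candidate_ids: set[str], avela_ineligible_ids: set[str]
) -> tuple[set[str], set[str]]:
    """Return (matched_ineligible_ids, eligible_ids) for one student."""
    # Only ineligible IDs that are also candidates can remove a school.
    matched_ineligible_ids = candidate_ids & avela_ineligible_ids
    eligible_ids = candidate_ids - matched_ineligible_ids
    return matched_ineligible_ids, eligible_ids


candidate_ids = {"s1", "s2", "s3", "s4"}
# "x9" stands in for an ineligible provider that is not a candidate BPS school.
avela_ineligible_ids = {"s2", "s4", "x9"}
matched, eligible = split_candidates(candidate_ids, avela_ineligible_ids)
```

Ineligible IDs that match no candidate (like `x9`) affect neither set, which is why `ineligible_count` can be much larger than `matched_ineligible_count`.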

	---

	## Testing

	### Direct script test

	Run:

	```bash
	python3 check_eligibility_tool.py
	```

	This runs an example address through the tool and prints the result.

	### Manual tool-call test

	Use `test_tool_call.py`:

	```python
	import json
	import check_eligibility_tool
	from check_eligibility_tool import TOOL_DEFINITION, handle_tool_call, get_enrollment_periods

	print("Imported from:", check_eligibility_tool.__file__)
	print("Tool name:", TOOL_DEFINITION["name"])

	periods, periods_error = get_enrollment_periods()
	print("Enrollment periods error:", periods_error)
	print("Enrollment periods count:", len(periods))

args = {
    "grade_level": "K2",
    "street_address": "2300 Washington St",
    "zip_code": "02119",
    "city": "Boston",
    "state": "MA",
    "home_language": "English",
    "dataset_path": "raw_data/choice_tool_raw.json",
}

	result = handle_tool_call("find_eligible_schools", args)
	print(json.dumps(result, indent=2))
	```

	Run:

	```bash
	python3 test_tool_call.py
	```

	---

	## Validation that the tool is correct

The tool was validated in three ways:

	### 1. Count math

	For the tested input:

	- `candidate_school_count = 73`
	- `matched_ineligible_count = 57`
	- `eligible_count = 16`

	Check:

	```text
	73 - 57 = 16
	```
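
This check can be automated against any result dictionary. The sketch below assumes the count fields relate as described under "Important return fields" and that `eligible_count` equals the length of `eligible_schools`; the `counts_are_consistent` helper is hypothetical and not part of the tool.

```python
from typing import Any


def counts_are_consistent(result: dict[str, Any]) -> bool:
    """Check that the count fields of a result agree with each other."""
    return (
        result["eligible_count"] == len(result["eligible_schools"])
        and result["eligible_count"]
        == result["candidate_school_count"] - result["matched_ineligible_count"]
        # Matched IDs are a subset of everything Avela reported as ineligible.
        and result["matched_ineligible_count"] <= result["ineligible_count"]
    )


sample_result = {
    # Placeholder records standing in for the 16 real schools.
    "eligible_schools": [{"id": f"school-{n}"} for n in range(16)],
    "eligible_count": 16,
    "candidate_school_count": 73,
    "ineligible_count": 913,
    "matched_ineligible_count": 57,
    "error": None,
}
is_consistent = counts_are_consistent(sample_result)
```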

	### 2. Website comparison

	The eligible school results matched the Boston Explore website for the tested address.

	### 3. Join logic sanity

	The tool only subtracts schools whose local `id` matches an Avela `referenceId` after normalization.
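
This document does not spell out the normalization rule. As an illustration only, the sketch below assumes that normalization means converting to a string, trimming whitespace, and lowercasing; the `normalize_id` helper and the sample IDs are hypothetical.

```python
def normalize_id(raw_id: object) -> str:
    """Normalize an ID for comparison (assumed rule: str, strip, lowercase)."""
    return str(raw_id).strip().lower()


# Made-up IDs that differ only in case and surrounding whitespace.
local_ids = {normalize_id(i) for i in ["ABC-123 ", "def-456"]}
avela_reference_ids = {normalize_id(i) for i in ["abc-123", "ZZZ-999"]}

# Only IDs that are equal after normalization count as a match.
matched_ids = local_ids & avela_reference_ids
```

Applying the same function to both sides of the join means a formatting difference between the catalog and the API should not silently leave an ineligible school in the eligible list.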

	---

	## Example tested input

	```python
result = find_eligible_schools(
    grade_level="K2",
    street_address="2300 Washington St",
    zip_code="02119",
    city="Boston",
    state="MA",
    home_language="English",
    dataset_path="raw_data/choice_tool_raw.json",
    include_ineligible=False,
)
	```

	Observed result summary:

	- enrollment period fetched successfully
	- 73 candidate BPS schools for K2
	- 57 matched as ineligible
	- 16 eligible schools returned

	---

	## Known implementation details

	### Enrollment periods response format

	The Avela `enrollmentPeriods` endpoint may return a dictionary instead of a raw list.

	The tool handles both of these cases:

	- list response
	- dict response containing `enrollment_period`
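
Handling both shapes might look like the sketch below. The `enrollment_period` key comes from this document; the `extract_enrollment_periods` helper, the wrapping of a single object into a list, and the `name` field in the sample data are assumptions.

```python
from typing import Any


def extract_enrollment_periods(payload: Any) -> list[dict[str, Any]]:
    """Return enrollment periods as a list, whatever shape the API returned."""
    if isinstance(payload, list):
        return payload
    if isinstance(payload, dict):
        periods = payload.get("enrollment_period", [])
        # Wrap a single period object so callers always get a list.
        return periods if isinstance(periods, list) else [periods]
    # Unknown shape: report no periods rather than raising.
    return []


period = {"name": "2026-2027 School Year"}  # "name" is an assumed field
from_list = extract_enrollment_periods([period])
from_dict = extract_enrollment_periods({"enrollment_period": [period]})
from_single = extract_enrollment_periods({"enrollment_period": period})
```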

	### UUID mappings

	The grade and language option UUIDs must match the values from the original working `check_eligibility.py` implementation.

	These mappings were preserved because the original script successfully returned working eligibility results.

	---

	## Recommended production usage

	In production, the chatbot pipeline should work like this:

	```text
	User question
	-> LLM extracts structured args
	-> find_eligible_schools tool call
	-> return eligible_schools
	-> ranking/filtering/retrieval over eligible_schools
	-> final chatbot answer
	```

	The chatbot should use:

	```python
	result["eligible_schools"]
	```

	as the input to ranking and school recommendation logic.

	---

	## Suggested follow-up improvements

1. Hide `dataset_path` from the LLM-facing tool interface.
2. Add an optional debug mode that returns the matched ineligible IDs.
3. Add unit tests for:
   - grade filtering
   - ID normalization
   - candidate minus ineligible set subtraction
4. Add a ranking layer on top of `eligible_schools`.
5. Wrap the tool in the actual chatbot tool handler.

	---

	## Summary

	This new tool converts the old “return ineligible schools” flow into a chatbot-ready “return eligible Boston Public Schools” flow.

For the tested input, it has been verified to:

	- call the API correctly
	- join against local school data correctly
	- return eligible BPS schools correctly
	- support LLM tool-calling via `TOOL_DEFINITION`
	- support backend execution via `handle_tool_call`