# School Data Model

## Field Reference

| Field | Status | Notes |
| --- | --- | --- |
| id | ✅ Included | |
| school | ✅ Included | |
| DBA | ✅ Included | |
| address | ✅ Included | |
| slim_address | ❌ Excluded | Unnecessary given address |
| latitude | ✅ Included | |
| longitude | ✅ Included | |
| grade_span | 🔀 Merged | Merged with grade_range and grades_filter into a single grade_min to grade_max field |
| grade_range | 🔀 Merged | See grade_span |
| grades_filter | 🔀 Merged | See grade_span |
| provider_type | ✅ Included | |
| surround_care | ✅ Included | Filter applicable only to nonBPS schools; indicates before/after-school programs |
| aftercare | ❌ Excluded | Subset of surround_care |
| extended_day | ❌ Excluded | Empty field |
| curriculum | ✅ Included | Categorical enumeration; applicable only to nonBPS |
| hours_of_operation | ✅ Included | |
| phone_number | ✅ Included | |
| email | ✅ Included | |
| website | ✅ Included | |
| bps_quality_tier | ❌ Excluded | Better to refer to state_report_card and school_quality_framework |
| programs | ❌ Excluded | Only relevant for nonBPS; contains semantic data, but RAG is being avoided for nonBPS for now |
| facility_features | ❌ Excluded | Empty for all schools |
| school_size | ❌ Excluded | Empty for all schools |
| sports | 📚 RAG | Added to RAG for BPS schools; 106 unique values make it impractical as a filter, but useful for semantic queries |
| tuition | ✅ Included | 92% fill rate for nonBPS (BPS schools are free); usable as a filter with a caveat for empty values |
| free_upk_hours | ❌ Excluded | Strict subset of the UPK boolean |
| eligibility | ❌ Excluded | LLM should not make eligibility decisions; directs user to choice tool |
| separate_application | ❌ Excluded | Links are defunct |
| headstart | ✅ Included | Filter for nonBPS; directs to Boston ABCD Head Start for eligibility |
| accepts_ccfa | ✅ Included | Useful for nonBPS financial assistance; directs to CCFA info |
| language_programming_text | 📚 RAG and 🔀 Merged | For BPS, used as RAG; for nonBPS, unioned with dual_language and language_programming_filter into a new boolean, has_language_program |
| dual_language | 🔀 Merged | See language_programming_text |
| language_programming_filter | 🔀 Merged | See language_programming_text |
| stem_steam | ❌ Excluded | Unreliable as a filter; better to rely on RAG from school mission/academic statements |
| vocational_technology | ❌ Excluded | Same reasoning as stem_steam |
| early_college_dual_enrollment | 📚 RAG | Semantic description of eligible colleges included in RAG |
| international_baccalaureate | ✅ Included | Very specific; only 3 schools offer it, so the model reports it directly |
| advanced_placement | ✅ Included | Acts as a filter; similar reasoning to international_baccalaureate; retrieval includes a text description of available classes |
| arts | ❌ Excluded | Same reasoning as stem_steam |
| clubs | ❌ Excluded | Empty for all schools |
| specialized_education_programs | 📚 RAG | Semantic descriptions of SPED programs |
| uniform | ✅ Included | High fill rate for BPS (good filter); lower for nonBPS (include a confidence caveat in the LLM response) |
| after_school_program | 📚 RAG | Semantic descriptions |
| overview_mission_statement | 📚 RAG | Core identity of each school |
| unique_features | 📚 RAG | What sets the school apart |
| arts_rooms | ❌ Excluded | Empty across all schools |
| athletic_field | ❌ Excluded | Empty across all schools |
| cafeteria | ❌ Excluded | Empty across all schools |
| gymnasium | ❌ Excluded | Empty across all schools |
| library | ❌ Excluded | Empty across all schools |
| music_room | ❌ Excluded | Empty across all schools |
| outdoor_classrooms | ❌ Excluded | Empty across all schools |
| playground | ❌ Excluded | Empty across all schools |
| pool | ❌ Excluded | Empty across all schools |
| science_lab | ❌ Excluded | Empty across all schools |
| partners | 📚 RAG | Lists partner programs; good for semantic search |
| partners_link | ❌ Excluded | Informational for schools, not users |
| UPK | ✅ Included | Good filter for BPS and nonBPS; additional UPK offerings may exist, so the LLM should concede that public data is scarce in this regard |
| BPS_eligibility | ❌ Excluded | Directs users to the online choice tool |
| ADA | ✅ Included | Clean filter for BPS; fewer nonBPS schools list it, so retrieval should note this caveat to the LLM |
| tiers_text | ❌ Excluded | Never used in raw data |
| transportation | ❌ Excluded | Directs user to public school transportation system link |
| prek_bps_connector | ❌ Excluded | Never used in raw data |
| special_admission_school | 🔀 Merged | Merged with special_admission_filter and special_admission_link; includes link for more info |
| special_admission_filter | 🔀 Merged | See special_admission_school |
| special_admission_link | 🔀 Merged | See special_admission_school |
| school_quality_framework | ✅ Included | With state_report_card, directs users to compare school performance |
| school_quality_tier | ❌ Excluded | Unnecessary given school_quality_framework and state_report_card |
| preview_session_1 | ❌ Excluded | Outdated |
| preview_session_2 | ❌ Excluded | Outdated |
| preview_session_3 | ❌ Excluded | Outdated |
| point_of_contact | ✅ Included | Important for retrieval |
| school_leader | ✅ Included | Important for retrieval |
| before_school_program | 📚 RAG | Semantic descriptions |
| ada_description | 📚 RAG | Semantic descriptions |
| uniform_policy | ❌ Excluded | Unnecessary given uniform filter |
| extra_curriculars_text | 📚 RAG | Common user question; good semantic content |
| innovation_pathways | ❌ Excluded | Unused in raw data |
| specialized_education_filter | ❌ Excluded | Redundant with specialized_education_programs |
| announcement_text | ❌ Excluded | Unused |
| other_academic_programs | ❌ Excluded | Unused |
| show_eligibility_check | ❌ Excluded | Unused |
| BuildCare | ✅ Included | Filter for nonBPS schools only |
| family_engagement_opportunities | 📚 RAG | Text descriptions; important for users curious about family engagement |
| CTE_Pathways_TXT | 📚 RAG | Semantic descriptions of career options |
| state_report_card | ✅ Included | See school_quality_framework |
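
Two of the notes above describe derived behavior rather than a plain copy of a raw column: the nonBPS `has_language_program` boolean (a union of three language columns) and the `tuition` filter, which has to account for empty values. The following is a minimal sketch of how both might look, not the project's actual code. It assumes each school is a dict keyed by the raw field names, that `tuition` is numeric or `None`, and that "empty" cells may be spelled in several ways; `is_set`, `filter_by_max_tuition`, and the `_EMPTY_MARKERS` set are hypothetical names introduced for illustration.

```python
from typing import Any, Iterable, Mapping

# Source columns named in the table; how their raw values are encoded is an assumption.
LANGUAGE_FIELDS = (
    "language_programming_text",
    "dual_language",
    "language_programming_filter",
)

# Hypothetical spellings of "no value" in the raw data.
_EMPTY_MARKERS = {"", "no", "none", "n/a", "nan", "false", "0"}


def is_set(value: Any) -> bool:
    """Return True when a raw cell carries a positive, non-empty value."""
    if value is None:
        return False
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() not in _EMPTY_MARKERS


def has_language_program(record: Mapping[str, Any]) -> bool:
    """Union of the three language columns: True if any one of them is set."""
    return any(is_set(record.get(field)) for field in LANGUAGE_FIELDS)


def filter_by_max_tuition(
    records: Iterable[Mapping[str, Any]], max_tuition: float
) -> tuple[list, list]:
    """Split records into (within budget, tuition not listed).

    Records with no tuition are returned separately so the caller can
    surface them with a caveat instead of silently dropping them.
    """
    within, unknown = [], []
    for record in records:
        tuition = record.get("tuition")
        if tuition is None:
            unknown.append(record)
        elif tuition <= max_tuition:
            within.append(record)
    return within, unknown
```

Returning the "unknown" group alongside the matches is one way to honor the empty-value caveat: the retrieval layer can mention those schools with a note that tuition is not listed, rather than treating a missing value as a non-match.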

## Status Key

| Symbol | Meaning |
| --- | --- |
| ✅ Included | Used as a structured filter or direct retrieval field |
| ❌ Excluded | Not included in the data model |
| 🔀 Merged | Combined with other fields |
| 📚 RAG | Included in semantic RAG embeddings |
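
The status key lends itself to a small lookup that downstream code can use to decide which raw columns become structured filters and which go into the embedding text. The sketch below is illustrative only: `FieldStatus`, `FIELD_STATUS`, and `fields_with_status` are hypothetical names, and the mapping covers just a handful of rows from the field reference.

```python
from enum import Enum


class FieldStatus(Enum):
    """The four statuses from the status key."""
    INCLUDED = "included"   # structured filter or direct retrieval field
    EXCLUDED = "excluded"   # not part of the data model
    MERGED = "merged"       # combined with other fields
    RAG = "rag"             # included in semantic RAG embeddings


# Excerpt of the field reference; a full mapping would list every field.
FIELD_STATUS = {
    "address": FieldStatus.INCLUDED,
    "slim_address": FieldStatus.EXCLUDED,
    "grade_span": FieldStatus.MERGED,
    "sports": FieldStatus.RAG,
    "tuition": FieldStatus.INCLUDED,
    "overview_mission_statement": FieldStatus.RAG,
}


def fields_with_status(status: FieldStatus) -> list[str]:
    """Return the field names carrying the given status, sorted by name."""
    return sorted(name for name, s in FIELD_STATUS.items() if s is status)


filter_fields = fields_with_status(FieldStatus.INCLUDED)
rag_fields = fields_with_status(FieldStatus.RAG)
```

One limitation of this single-status mapping: language_programming_text is listed as both RAG and Merged, so a complete version would need to store a set of statuses per field rather than a single value.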