
open-navigator / docs /SCALE_AND_SEARCH_PATTERNS.md

jcbowyer

Deploy: Consolidated gold tables, fixed nginx docs routing

896453f verified 28 days ago


	# Scale and Search Patterns: End-to-End Civic Tech Projects

This guide analyzes six civic tech projects (five new, plus the already-integrated Council Data Project) that focus on full-stack deployments, large-scale data aggregation, and public search portals. They complement our existing integration (Civic Scraper, City Scrapers, CDP, Engagic, Councilmatic) with new patterns for:

	- 🤖 AI summarization (OpenTowns, MeetingBank)
	- 🔍 Multi-jurisdiction search (CivicBand, LocalView)
	- 🔔 Keyword alerting (OpenTowns)
	- 📊 Research-grade pipelines (LocalView, MeetingBank)
	- 🌍 International adaptability (OpenCouncil)

	---

	## 🎯 What's NEW vs. Our Existing Integration

| Pattern | Already Have | NEW from These Projects |
|---------|--------------|-------------------------|
| Platform detection | ✅ Civic Scraper | - |
| Event schema | ✅ City Scrapers | - |
| Video ingestion | ✅ CDP | ✅ LocalView scale patterns |
| Matter tracking | ✅ Engagic | - |
| Search UX | ✅ Councilmatic | ✅ CivicBand cross-jurisdiction |
| AI Summarization | ❌ | ✅ OpenTowns, MeetingBank |
| Keyword Alerts | ❌ | ✅ OpenTowns |
| Scale (1,000+ jurisdictions) | ⚠️ Partial | ✅ CivicBand, LocalView |
| International patterns | ❌ | ✅ OpenCouncil |

	---

	## 📚 Project Analysis

	### 1. Council Data Project (CDP) ⭐ Already Integrated

	Status: Already documented in `INTEGRATION_GUIDE.md`

	Key patterns we already use:
	- Video transcript ingestion
	- Searchable transcript storage
	- Event indexing pipeline

	See: `docs/INTEGRATION_GUIDE.md` Section 4

	---

	### 2. OpenTowns 🆕 AI Summarization Pioneer

Website: https://opentowns.org
	License: Open civic-tech (check specific repo)
	Focus: Small towns, AI-generated summaries, keyword alerts

	#### 🔥 What to Adopt

	A. AI Summarization Pattern
```python
# They generate readable summaries from raw transcripts/PDFs
# Pattern: transcript → summary → key decisions

from openai import AsyncOpenAI
from models.meeting_event import MeetingEvent

async def generate_meeting_summary(event: MeetingEvent, transcript: str) -> dict:
    """
    OpenTowns pattern: Generate human-readable meeting summaries.

    Returns:
        {
            'executive_summary': str,          # 2-3 sentences
            'key_decisions': list[str],        # Bullet points
            'health_policy_items': list[str],  # Filtered for oral health
            'next_actions': list[str],         # Follow-up items
            'raw_summary': str                 # Unparsed model output
        }
    """
    # Async client, so the request below can be awaited inside this coroutine
    client = AsyncOpenAI()

    # Truncate outside the f-string so no comment text is sent to the model
    excerpt = transcript[:10000]  # First 10k chars

    prompt = f"""
    Summarize this local government meeting for public understanding.

    Meeting: {event.title}
    Date: {event.start.strftime('%B %d, %Y')}
    Transcript: {excerpt}

    Provide:
    1. Executive summary (2-3 sentences)
    2. Key decisions made (bullet points)
    3. Health policy items (if any)
    4. Next actions/follow-ups

    Focus on: What decisions were made? What happens next?
    """

    response = await client.chat.completions.create(
        model="gpt-4o-mini",  # Cost-effective for summaries
        messages=[
            {"role": "system", "content": "You are a civic engagement assistant helping residents understand local government."},
            {"role": "user", "content": prompt}
        ],
        temperature=0.3  # Lower for factual accuracy
    )

    # Parse response into structured format
    summary_text = response.choices[0].message.content

    # extract_section / extract_bullets are parsing helpers that this guide
    # does not define; they are assumed to exist elsewhere in the project.
    return {
        'executive_summary': extract_section(summary_text, 'Executive summary'),
        'key_decisions': extract_bullets(summary_text, 'Key decisions'),
        'health_policy_items': extract_bullets(summary_text, 'Health policy'),
        'next_actions': extract_bullets(summary_text, 'Next actions'),
        'raw_summary': summary_text
    }
```
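
The summary function above calls `extract_section` and `extract_bullets`, which this guide does not define. The following is a minimal sketch of what they might look like, assuming the model echoes the numbered headings from the prompt (for example `1. Executive summary`) and uses `-`, `*`, or `•` for bullets. The helper `_section_lines` and both function bodies are hypothetical; real model output is often less regular, so a JSON response format may be more robust.

```python
import re
from typing import List

# A line such as "2. Key decisions" marks the start of the next section
_NEXT_HEADING = re.compile(r"^\s*\d+[.)]\s+\S")


def _section_lines(text: str, heading: str) -> List[str]:
    """Collect the non-empty lines that follow `heading`, up to the next numbered heading."""
    heading_re = re.compile(
        r"^\s*(?:\d+[.)]\s*)?\**\s*" + re.escape(heading), re.IGNORECASE
    )
    collected: List[str] = []
    inside = False
    for line in text.splitlines():
        if not inside:
            if heading_re.match(line):
                inside = True
                # Text after a colon on the heading line belongs to the section
                _, sep, rest = line.partition(":")
                if sep and rest.strip(" *"):
                    collected.append(rest.strip(" *"))
            continue
        if _NEXT_HEADING.match(line):
            break
        if line.strip():
            collected.append(line.strip())
    return collected


def extract_section(text: str, heading: str) -> str:
    """Return the section body as a single string."""
    return " ".join(_section_lines(text, heading))


def extract_bullets(text: str, heading: str) -> List[str]:
    """Return the section body as a list, with bullet markers removed."""
    bullets = []
    for line in _section_lines(text, heading):
        item = re.sub(r"^[-*•]\s*", "", line).strip()
        if item:
            bullets.append(item)
    return bullets
```

Because numbered lines are treated as section boundaries, this sketch would cut a section short if the model numbered its bullets; that limitation is part of the assumption stated above.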

	B. Keyword Alert System
```python
# OpenTowns sends alerts when keywords appear in meetings
# Pattern: Watch list → match detection → user notification

from typing import List, Dict
import re

from models.meeting_event import MeetingEvent

class KeywordAlertSystem:
    """
    OpenTowns pattern: Alert users when keywords appear in meetings.
    """

    # Oral health keyword categories
    KEYWORD_CATEGORIES = {
        'fluoridation': [
            'fluoride', 'fluoridation', 'water treatment',
            'community water fluoridation', 'CWF'
        ],
        'dental_access': [
            'dental', 'dentist', 'oral health', 'teeth',
            'medicaid dental', 'dental clinic'
        ],
        'public_health': [
            'health department', 'public health', 'CDC',
            'preventive care', 'health equity'
        ]
    }

    def detect_keywords(self, text: str) -> Dict[str, List[str]]:
        """
        Find all matching keywords in text.

        Returns: {'fluoridation': ['fluoride', 'CWF'], ...}
        """
        text_lower = text.lower()
        matches = {}

        for category, keywords in self.KEYWORD_CATEGORIES.items():
            found = []
            for keyword in keywords:
                # Word boundary matching, so 'dental' does not match 'accidental'
                pattern = r'\b' + re.escape(keyword.lower()) + r'\b'
                if re.search(pattern, text_lower):
                    found.append(keyword)

            if found:
                matches[category] = found

        return matches

    def generate_alert(self, event: MeetingEvent, matches: Dict[str, List[str]]) -> dict:
        """
        Create alert notification for users.
        """
        return {
            'alert_type': 'keyword_match',
            'jurisdiction': f"{event.jurisdiction_name}, {event.state_code}",
            'meeting_title': event.title,
            'meeting_date': event.start.isoformat(),
            'categories_matched': list(matches.keys()),
            'keywords_found': [kw for kws in matches.values() for kw in kws],
            'meeting_url': event.source,
            'priority': 'high' if 'fluoridation' in matches else 'medium'
        }
```
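
When the same watch list is run over many documents, the word-boundary patterns can be compiled once instead of on every call. The sketch below shows one way to do that; the function names `compile_keyword_patterns` and `detect_with_patterns` are illustrative and not part of OpenTowns, and the matching rule is intended to behave the same as `detect_keywords` above.

```python
import re
from typing import Dict, List


def compile_keyword_patterns(
    categories: Dict[str, List[str]]
) -> Dict[str, Dict[str, re.Pattern]]:
    """Compile one case-insensitive, word-bounded pattern per keyword."""
    return {
        category: {
            keyword: re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
            for keyword in keywords
        }
        for category, keywords in categories.items()
    }


def detect_with_patterns(
    text: str, patterns: Dict[str, Dict[str, re.Pattern]]
) -> Dict[str, List[str]]:
    """Return the keywords found per category, in watch-list order."""
    matches: Dict[str, List[str]] = {}
    for category, by_keyword in patterns.items():
        found = [kw for kw, pattern in by_keyword.items() if pattern.search(text)]
        if found:
            matches[category] = found
    return matches


# Example watch list (a subset of the categories above)
patterns = compile_keyword_patterns({
    "fluoridation": ["fluoride", "fluoridation", "CWF"],
    "dental_access": ["dental", "oral health"],
})
text = "The board discussed community water fluoridation after an accidental delay."
print(detect_with_patterns(text, patterns))  # 'accidental' does not trigger 'dental'
```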

	Implementation Priority: 🔥 HIGH - Summaries make data usable for advocates

	---

	### 3. LocalView 🆕 Research-Grade Scale

	Website: https://www.localview.net
Project page: https://mellonurbanism.harvard.edu/localview
	License: Open-source data pipeline
	Scale: Nationwide coverage, largest public dataset

	#### 🔥 What to Adopt

	A. Scale Architecture Patterns

	LocalView handles thousands of jurisdictions with:
	1. Batch processing (not real-time)
	2. Distributed storage (videos + transcripts)
	3. Quality metrics (completeness scoring)

```python
# LocalView pattern: Process jurisdictions in batches with quality tracking

from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class JurisdictionQuality:
    """
    LocalView pattern: Track data quality per jurisdiction.
    """
    jurisdiction_name: str
    state_code: str

    # Completeness metrics
    total_meetings_expected: int  # Based on calendar
    total_meetings_found: int
    meetings_with_agendas: int
    meetings_with_minutes: int
    meetings_with_videos: int
    meetings_with_transcripts: int

    # Freshness
    last_scraped: datetime
    last_meeting_found: Optional[datetime]
    scraping_frequency: str  # 'daily', 'weekly', 'monthly'

    # Health metrics
    consecutive_failures: int
    last_success: Optional[datetime]

    @property
    def completeness_score(self) -> float:
        """
        Overall data quality score (0-100).
        """
        if self.total_meetings_expected == 0:
            return 0.0

        # Cap at 1.0 in case more meetings are found than the calendar predicted
        found_rate = min(self.total_meetings_found / self.total_meetings_expected, 1.0)
        agenda_rate = self.meetings_with_agendas / max(self.total_meetings_found, 1)
        minutes_rate = self.meetings_with_minutes / max(self.total_meetings_found, 1)

        # Weighted sum; the weights add up to 100, so the result is already 0-100
        score = (
            found_rate * 40 +    # 40%: Finding meetings
            agenda_rate * 30 +   # 30%: Having agendas
            minutes_rate * 30    # 30%: Having minutes
        )

        # No extra "* 100" here: that would saturate nearly every score at 100
        return min(score, 100.0)

    @property
    def health_status(self) -> str:
        """
        Scraper health: healthy, degraded, failed
        """
        if self.consecutive_failures >= 5:
            return 'failed'
        elif self.consecutive_failures >= 2:
            return 'degraded'
        else:
            return 'healthy'
```
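
As a worked example of the weighting described in `completeness_score` (40% meetings found, 30% agendas, 30% minutes), the standalone function below reproduces the arithmetic on a 0-100 scale. The function name `weighted_completeness` and the sample counts are illustrative only and are not taken from LocalView.

```python
def weighted_completeness(
    expected: int, found: int, with_agendas: int, with_minutes: int
) -> float:
    """Weighted completeness on a 0-100 scale (weights 40 / 30 / 30)."""
    if expected <= 0:
        return 0.0
    found_rate = min(found / expected, 1.0)    # share of expected meetings found
    agenda_rate = with_agendas / max(found, 1)  # share of found meetings with agendas
    minutes_rate = with_minutes / max(found, 1)  # share of found meetings with minutes
    return round(found_rate * 40 + agenda_rate * 30 + minutes_rate * 30, 1)


# 20 meetings expected, 18 found, 15 with agendas, 9 with minutes:
#   found_rate   = 18/20 = 0.90  -> 0.90 * 40 = 36
#   agenda_rate  = 15/18 = 0.83  -> 0.83 * 30 = 25
#   minutes_rate =  9/18 = 0.50  -> 0.50 * 30 = 15
print(weighted_completeness(20, 18, 15, 9))  # 76.0
```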

	B. Batch Processing Strategy
```python
# LocalView processes in batches, not all-at-once

from pyspark.sql import SparkSession, functions as F
from typing import Iterator

def process_jurisdictions_in_batches(
    spark: SparkSession,
    batch_size: int = 100,
    priority_filter: str = 'high'
) -> Iterator[dict]:
    """
    LocalView pattern: Process large numbers of jurisdictions efficiently.

    Strategy:
    1. Load high-priority jurisdictions first
    2. Process in batches to manage memory
    3. Track quality metrics per batch
    4. Resume from failures (errors are recorded per batch; resuming is not shown here)
    """
    # Load targets from Gold layer
    targets_df = spark.read.format("delta").load("data/delta/gold/scraping_targets")

    # Filter and sort (a column expression avoids building SQL from a string)
    priority_targets = targets_df \
        .filter(F.col("priority_tier") == priority_filter) \
        .orderBy("priority_score", ascending=False)

    total_targets = priority_targets.count()

    # Process in batches
    for offset in range(0, total_targets, batch_size):
        # Skip first, then limit. The reverse order would return an empty
        # batch after the first one. DataFrame.offset requires Spark 3.4+.
        batch_df = priority_targets.offset(offset).limit(batch_size)

        batch_results = {
            'batch_number': offset // batch_size + 1,
            'batch_size': batch_size,
            'jurisdictions_processed': 0,
            'meetings_found': 0,
            'errors': []
        }

        for row in batch_df.collect():
            try:
                # Scrape jurisdiction; scrape_jurisdiction is assumed to be
                # provided by the platform scrapers and is not defined here
                meetings = scrape_jurisdiction(row['url'], row['platform'])
                batch_results['jurisdictions_processed'] += 1
                batch_results['meetings_found'] += len(meetings)

            except Exception as e:
                batch_results['errors'].append({
                    'jurisdiction': row['jurisdiction_name'],
                    'error': str(e)
                })

        yield batch_results
```
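
The strategy list above names "resume from failures", but the batch loop itself only records errors. One simple way to make a run resumable is a checkpoint file of completed jurisdiction IDs, sketched below with the standard library only. The names `load_completed` and `run_with_checkpoint`, the JSON checkpoint format, and the use of string IDs are all assumptions for illustration, not LocalView's actual mechanism.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List, Set


def load_completed(checkpoint: Path) -> Set[str]:
    """Read the set of already-processed jurisdiction IDs, if any."""
    if not checkpoint.exists():
        return set()
    return set(json.loads(checkpoint.read_text()))


def run_with_checkpoint(
    jurisdiction_ids: Iterable[str],
    process: Callable[[str], None],
    checkpoint: Path,
) -> List[str]:
    """Process each ID once across runs; return the IDs that failed this run."""
    completed = load_completed(checkpoint)
    failed: List[str] = []
    for jurisdiction_id in jurisdiction_ids:
        if jurisdiction_id in completed:
            continue  # finished in an earlier run
        try:
            process(jurisdiction_id)
        except Exception:
            failed.append(jurisdiction_id)  # left out of the checkpoint, so it is retried
            continue
        completed.add(jurisdiction_id)
        # Persist after every success so a crash loses at most one item
        checkpoint.write_text(json.dumps(sorted(completed)))
    return failed
```

Calling `run_with_checkpoint` again with the same checkpoint path skips everything that already succeeded and retries only the failures. Writing the file after every item is simple but slow at scale; batching the writes is a reasonable trade-off.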

	Implementation Priority: 🔥 HIGH - Essential for scaling to 32,333 municipalities

	---

	### 4. MeetingBank 🆕 Summarization Research

	Website: https://meetingbank.github.io
	GitHub: Linked from site
	License: Open dataset
	Focus: 6 cities, high-quality summarization benchmark

	#### 🔥 What to Adopt

	A. Summarization Quality Benchmarks

	MeetingBank is used in academic research for summarization. They have:
	- Gold-standard human summaries (for validation)
	- Multiple summary lengths (short, medium, long)
	- Evaluation metrics (ROUGE, BERTScore)

```python
# MeetingBank pattern: Validate AI summaries against quality benchmarks

from typing import Any, Dict, List

class SummaryQualityValidator:
    """
    MeetingBank pattern: Ensure AI summaries meet quality standards.
    """

    # Quality thresholds from academic research
    MIN_ROUGE_L = 0.25  # ROUGE-L F1 score (needs a reference summary; not checked below)
    MIN_LENGTH_RATIO = 0.05  # Summary should be 5-20% of original
    MAX_LENGTH_RATIO = 0.20

    def validate_summary(self, original: str, summary: str) -> Dict[str, Any]:
        """
        Check if summary meets quality standards.
        """
        # Length checks
        orig_words = len(original.split())
        summ_words = len(summary.split())
        length_ratio = summ_words / orig_words if orig_words > 0 else 0

        # Basic quality checks
        checks = {
            'length_appropriate': self.MIN_LENGTH_RATIO <= length_ratio <= self.MAX_LENGTH_RATIO,
            'has_key_terms': self._check_key_terms(original, summary),
            'no_repetition': self._check_repetition(summary),
            'proper_structure': self._check_structure(summary),
        }

        return {
            'passes_validation': all(checks.values()),
            'checks': checks,
            'length_ratio': length_ratio,
            'word_count': summ_words,
            'quality_score': sum(checks.values()) / len(checks)
        }

    @staticmethod
    def _split_sentences(text: str) -> List[str]:
        """Split on periods, dropping empty fragments and surrounding whitespace."""
        return [s.strip() for s in text.split('.') if s.strip()]

    def _check_key_terms(self, original: str, summary: str) -> bool:
        """
        Ensure summary includes key terms from original.
        """
        # Extract important terms (simplified - use TF-IDF in production)
        orig_words = set(original.lower().split())
        summ_words = set(summary.lower().split())

        if not orig_words:
            return False  # Nothing to compare against; also avoids division by zero

        # At least 30% overlap of unique terms
        overlap = len(orig_words & summ_words) / len(orig_words)
        return overlap >= 0.30

    def _check_repetition(self, summary: str) -> bool:
        """
        Check for excessive repetition (indicates poor quality).
        """
        sentences = self._split_sentences(summary)
        unique_ratio = len(set(sentences)) / len(sentences) if sentences else 0
        return unique_ratio >= 0.80  # At least 80% unique sentences

    def _check_structure(self, summary: str) -> bool:
        """
        Check for proper summary structure.
        """
        # Should have multiple sentences
        sentences = self._split_sentences(summary)
        return 2 <= len(sentences) <= 10
```

	Implementation Priority: 🟡 MEDIUM - Important for quality, but MVP can use basic summaries

	---

	### 5. CivicBand 🆕 Multi-Jurisdiction Search

	Website: https://civic.band
	GitHub: Linked from site (Raft Foundation)
	Scale: 1,000+ municipalities
	Focus: Google-like search across jurisdictions

	#### 🔥 What to Adopt

	A. Cross-Jurisdiction Search Architecture

	CivicBand lets users search "fluoridation" and get results from all municipalities at once.

```python
# CivicBand pattern: Federated search across jurisdictions

from elasticsearch import Elasticsearch  # Or Meilisearch for open-source
from typing import List, Dict, Optional
from models.meeting_event import MeetingEvent

class CrossJurisdictionSearch:
    """
    CivicBand pattern: Search meetings across all jurisdictions.
    """

    def __init__(self):
        # Use Meilisearch (open-source) or Elasticsearch
        self.es = Elasticsearch(['http://localhost:9200'])
        self.index_name = 'meeting_events'

    def index_meeting(self, event: MeetingEvent):
        """
        Add meeting to search index.
        """
        doc = {
            'id': event.id,
            'title': event.title,
            'description': event.description,
            'jurisdiction': event.jurisdiction_name,
            'state': event.state_code,
            'date': event.start.isoformat(),
            'full_text': self._build_searchable_text(event),
            # First link whose title mentions "agenda"; tolerate links without a title
            'agenda_url': next(
                (link.href for link in event.links if 'agenda' in (link.title or '').lower()),
                None
            ),
            'oral_health_relevant': event.oral_health_relevant,
            'keywords': event.keywords_found
        }

        self.es.index(index=self.index_name, id=event.id, document=doc)

    def search(
        self,
        query: str,
        states: Optional[List[str]] = None,
        date_range: Optional[tuple] = None,
        oral_health_only: bool = False
    ) -> List[Dict]:
        """
        Search across all jurisdictions.

        Example:
            search("fluoridation", states=['AL', 'GA'], oral_health_only=True)
        """
        # Scored full-text clause
        must_clauses = [
            {"multi_match": {
                "query": query,
                "fields": ["title^3", "description^2", "full_text"],  # Boost title matches
                "type": "best_fields"
            }}
        ]

        # Exact-match restrictions go in filter context: they do not affect scoring
        filter_clauses = []

        # Filter by state (the .keyword sub-field assumes the default dynamic
        # mapping, as the faceting query does; an analyzed text field would
        # not match upper-case codes such as 'AL')
        if states:
            filter_clauses.append({"terms": {"state.keyword": states}})

        # Filter by date range
        if date_range:
            filter_clauses.append({
                "range": {"date": {"gte": date_range[0], "lte": date_range[1]}}
            })

        # Filter oral health only
        if oral_health_only:
            filter_clauses.append({"term": {"oral_health_relevant": True}})

        search_query = {
            "query": {"bool": {"must": must_clauses, "filter": filter_clauses}},
            "size": 100,
            "highlight": {
                "fields": {
                    "title": {},
                    "description": {},
                    "full_text": {"fragment_size": 150}
                }
            },
            "sort": [
                {"_score": "desc"},
                {"date": "desc"}
            ]
        }

        results = self.es.search(index=self.index_name, body=search_query)

        return [{
            'jurisdiction': hit['_source']['jurisdiction'],
            'state': hit['_source']['state'],
            'title': hit['_source']['title'],
            'date': hit['_source']['date'],
            'snippet': hit.get('highlight', {}).get('full_text', [''])[0],
            'url': hit['_source']['agenda_url'],
            'relevance_score': hit['_score']
        } for hit in results['hits']['hits']]

    def _build_searchable_text(self, event: MeetingEvent) -> str:
        """
        Combine all text fields for indexing.
        """
        parts = [
            event.title or '',
            event.description or '',
            ' '.join(event.keywords_found),
            ' '.join(link.title or '' for link in event.links)
        ]
        return ' '.join(parts)
```
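
The faceting query in part B aggregates on `jurisdiction.keyword` and `state.keyword`. Those sub-fields exist only if the fields are mapped as text with a keyword sub-field, which is what Elasticsearch's dynamic mapping does for strings by default. Declaring the mapping explicitly removes that dependence. The dictionary below is a sketch of such a mapping for the document shape built in `index_meeting`; the constant name `MEETING_EVENTS_MAPPING` and the helper are hypothetical, and the field types are assumptions inferred from how the fields are used in this guide.

```python
from typing import Any, Dict


def text_with_keyword() -> Dict[str, Any]:
    """Full-text field with an exact-match `.keyword` sub-field for filters and aggregations."""
    return {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    }


MEETING_EVENTS_MAPPING: Dict[str, Any] = {
    "properties": {
        "title": {"type": "text"},
        "description": {"type": "text"},
        "full_text": {"type": "text"},
        "jurisdiction": text_with_keyword(),   # faceted via jurisdiction.keyword
        "state": text_with_keyword(),          # faceted via state.keyword
        "date": {"type": "date"},              # enables range filters and date sorting
        "agenda_url": {"type": "keyword"},
        "oral_health_relevant": {"type": "boolean"},
        "keywords": {"type": "keyword"},
    }
}

# With a recent Python client the mapping would typically be applied once,
# before any documents are indexed, for example:
#   es.indices.create(index="meeting_events", mappings=MEETING_EVENTS_MAPPING)
```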

	B. Jurisdiction Faceting
```python
# CivicBand shows result counts by jurisdiction

from typing import Dict
from elasticsearch import Elasticsearch

def get_search_facets(es: Elasticsearch, index_name: str, query: str) -> Dict[str, int]:
    """
    Show how many results per jurisdiction.

    The client and index name are passed in explicitly because this is a
    module-level function and has no `self`.

    Example output:
        {
            'Birmingham, AL': 12,
            'Atlanta, GA': 8,
            'Montgomery, AL': 5
        }
    """
    search_query = {
        "query": {"multi_match": {"query": query, "fields": ["title", "full_text"]}},
        "size": 0,  # We only want aggregations
        "aggs": {
            "by_jurisdiction": {
                "terms": {
                    "field": "jurisdiction.keyword",
                    "size": 50  # Top 50 jurisdictions
                },
                "aggs": {
                    "by_state": {
                        "terms": {"field": "state.keyword"}
                    }
                }
            }
        }
    }

    results = es.search(index=index_name, body=search_query)

    facets = {}
    for bucket in results['aggregations']['by_jurisdiction']['buckets']:
        jurisdiction = bucket['key']
        count = bucket['doc_count']
        # Guard against documents indexed without a state
        state_buckets = bucket['by_state']['buckets']
        state = state_buckets[0]['key'] if state_buckets else 'unknown'
        facets[f"{jurisdiction}, {state}"] = count

    return facets
```

	Implementation Priority: 🟡 MEDIUM - Valuable for end-users, but scraping comes first

	---

	### 6. OpenCouncil 🆕 International Adaptability

	Website: https://opencouncil.gr
	GitHub: https://github.com/schemalabz/opencouncil
	License: Open-source
	Focus: Greek councils, but adaptable to U.S.

	#### 🔥 What to Adopt

	A. Internationalization Patterns

	OpenCouncil works in Greece (different government structure). This teaches us:
	- Flexible schema (not hardcoded to U.S. structures)
	- Configurable jurisdiction types (councils, boards, commissions)
	- Multi-language support (not needed now, but good architecture)

```python
# OpenCouncil pattern: Flexible jurisdiction configuration

from enum import Enum
from dataclasses import dataclass, field
from typing import List, Optional

class GovernmentLevel(Enum):
    """
    OpenCouncil pattern: Support multiple government structures.
    """
    MUNICIPAL = "municipal"       # City/town councils
    COUNTY = "county"             # County boards
    TOWNSHIP = "township"         # Township boards
    SCHOOL_DISTRICT = "school"    # School boards
    SPECIAL_DISTRICT = "special"  # Water, fire, etc.
    STATE = "state"               # State agencies (future)

@dataclass
class JurisdictionConfig:
    """
    OpenCouncil pattern: Configure each jurisdiction's unique structure.
    """
    jurisdiction_name: str
    government_level: GovernmentLevel

    # Meeting schedule
    typical_meeting_frequency: str   # 'weekly', 'biweekly', 'monthly'
    typical_meeting_days: List[str]  # ['Monday', 'Thursday']
    typical_meeting_time: str        # '18:00'

    # Website structure (optional, so they default to None)
    calendar_url: Optional[str] = None
    agenda_url_pattern: Optional[str] = None  # Template: "https://example.gov/agenda-{date}"
    minutes_url_pattern: Optional[str] = None

    # Legislative bodies
    bodies: List[str] = field(default_factory=list)  # ['City Council', 'Planning Commission', 'Board of Health']

    # Custom fields
    metadata: dict = field(default_factory=dict)  # For jurisdiction-specific data

# Example: Configure Birmingham, AL
# (the URL patterns are omitted here, which works because they default to None)
BIRMINGHAM_CONFIG = JurisdictionConfig(
    jurisdiction_name="Birmingham",
    government_level=GovernmentLevel.MUNICIPAL,
    typical_meeting_frequency='biweekly',
    typical_meeting_days=['Tuesday'],
    typical_meeting_time='18:00',
    calendar_url="https://birminghamal.gov/council/meetings",
    bodies=['City Council', 'Board of Health', 'Planning Commission'],
    metadata={'population': 200733, 'oral_health_priority': 'high'}
)
```
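
The schedule fields and the `agenda_url_pattern` template become useful once they drive the scraper. The sketch below shows one way to turn `typical_meeting_days` into candidate dates and to fill the `{date}` placeholder. Both function names are hypothetical, the ISO date format inside the URL is an assumption (real sites vary), and the cadence field (`'biweekly'`, `'monthly'`) is deliberately ignored here, so every matching weekday is returned.

```python
from datetime import date, timedelta
from typing import List, Optional

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def upcoming_meeting_dates(start: date, meeting_days: List[str], count: int) -> List[date]:
    """Return the next `count` dates on or after `start` that fall on a meeting day."""
    wanted = {WEEKDAYS.index(day) for day in meeting_days}
    if not wanted:
        return []  # no configured days, so nothing to predict
    results: List[date] = []
    current = start
    while len(results) < count:
        if current.weekday() in wanted:  # Monday is 0, matching WEEKDAYS
            results.append(current)
        current += timedelta(days=1)
    return results


def build_agenda_url(pattern: Optional[str], meeting_date: date) -> Optional[str]:
    """Fill the {date} placeholder of a URL template, or return None if no template is set."""
    if pattern is None:
        return None
    return pattern.format(date=meeting_date.isoformat())


# 2024-01-01 was a Monday, so the first two Tuesdays are Jan 2 and Jan 9
for d in upcoming_meeting_dates(date(2024, 1, 1), ["Tuesday"], 2):
    print(build_agenda_url("https://example.gov/agenda-{date}", d))
```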

	Implementation Priority: 🟢 LOW - Good architecture, but not urgent

	---

	## 🎯 Implementation Roadmap

	### Phase 1: AI Summarization (OpenTowns pattern) 🔥
	Priority: HIGH
	Timeline: 1-2 weeks
	Depends on: Existing OpenAI integration

TODO: Implement in `extraction/summarizer.py`

- [ ] Generate executive summaries from meeting transcripts
- [ ] Extract key decisions as bullet points
- [ ] Identify health policy items
- [ ] Add quality validation (MeetingBank patterns)

	### Phase 2: Keyword Alerts (OpenTowns pattern) 🔥
	Priority: HIGH
	Timeline: 1 week
	Depends on: Meeting data ingestion

TODO: Implement in `alerts/keyword_monitor.py`

- [ ] Define oral health keyword categories
- [ ] Pattern matching with word boundaries
- [ ] Generate alerts for users
- [ ] Email/webhook notification system

	### Phase 3: Scale Architecture (LocalView pattern) 🔥
	Priority: HIGH
	Timeline: 2 weeks
	Depends on: Platform scrapers

TODO: Implement in `discovery/batch_processor.py`

- [ ] Quality metrics per jurisdiction
- [ ] Batch processing (100 at a time)
- [ ] Failure tracking and retry
- [ ] Completeness scoring

	### Phase 4: Multi-Jurisdiction Search (CivicBand pattern) 🟡
	Priority: MEDIUM
	Timeline: 2-3 weeks
	Depends on: Significant meeting data

TODO: Implement in `search/federated_search.py`

- [ ] Set up Elasticsearch or Meilisearch
- [ ] Index all meetings
- [ ] Cross-jurisdiction search API
- [ ] Jurisdiction faceting

	### Phase 5: Quality Validation (MeetingBank pattern) 🟡
	Priority: MEDIUM
	Timeline: 1 week
	Depends on: AI summarization

TODO: Implement in `extraction/quality_validator.py`

- [ ] Summary length validation
- [ ] Key term extraction
- [ ] Repetition detection
- [ ] Structure checking

	### Phase 6: Flexible Config (OpenCouncil pattern) 🟢
	Priority: LOW
	Timeline: 1 week
	Depends on: None

TODO: Implement in `config/jurisdiction_configs.py`

- [ ] Per-jurisdiction configuration
- [ ] Meeting schedule patterns
- [ ] Legislative body tracking

	---

	## 📊 Comparison with Existing Integration

| Capability | Original 5 Projects | New 6 Projects | Status |
|------------|-------------------|---------------|--------|
| Platform detection | ✅ Civic Scraper | - | Complete |
| Event schema | ✅ City Scrapers | - | Complete |
| Video ingestion | ✅ CDP | ✅ LocalView (scale) | Need scale patterns |
| Matter tracking | ✅ Engagic | - | Complete |
| Person/vote tracking | ✅ Councilmatic | - | Roadmapped |
| AI Summarization | ❌ | ✅ OpenTowns, MeetingBank | TODO: High priority |
| Keyword Alerts | ❌ | ✅ OpenTowns | TODO: High priority |
| Cross-jurisdiction search | ⚠️ Basic | ✅ CivicBand | TODO: Medium priority |
| Quality metrics | ❌ | ✅ LocalView, MeetingBank | TODO: Medium priority |
| Batch processing | ⚠️ Basic | ✅ LocalView | TODO: High priority |

	---

	## 💻 Quick Start: Integrate Summarization

	Here's how to add OpenTowns-style summarization right now:

```python
# File: extraction/summarizer.py

from openai import OpenAI
from models.meeting_event import MeetingEvent
from config.settings import settings

client = OpenAI(api_key=settings.openai_api_key)

def summarize_meeting(event: MeetingEvent, full_text: str) -> dict:
    """
    Generate OpenTowns-style summary with oral health focus.
    """
    prompt = f"""
    You are summarizing a local government meeting for public health advocates.

    Meeting: {event.title}
    Jurisdiction: {event.jurisdiction_name}, {event.state_code}
    Date: {event.start.strftime('%B %d, %Y')}

    Full text (first 8000 chars):
    {full_text[:8000]}

    Provide:
    1. Executive Summary (2-3 sentences)
    2. Key Decisions (bullet list)
    3. Oral Health Items (if any - fluoridation, dental access, etc.)
    4. Next Actions (follow-ups, future meetings)

    Focus on: What was decided? What's happening next?
    """

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You summarize local government meetings for public understanding."},
            {"role": "user", "content": prompt}
        ],
        temperature=0.3
    )

    return {
        'summary': response.choices[0].message.content,
        'model': 'gpt-4o-mini',
        'tokens_used': response.usage.total_tokens
    }

# Usage:
# summary = summarize_meeting(event, full_transcript)
# event.description = summary['summary']
```
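
Both summarizers in this guide truncate the input (the first 10,000 or 8,000 characters), so anything discussed later in a long meeting never reaches the model. One common workaround is to split the text into chunks, summarize each chunk, and then summarize the combined results. The sketch below covers only the splitting step; `chunk_text` is a hypothetical helper, and splitting on blank lines assumes the transcript has paragraph breaks.

```python
from typing import List


def chunk_text(text: str, max_chars: int = 8000) -> List[str]:
    """Split text into chunks of at most max_chars, preferring paragraph boundaries."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # A single paragraph longer than the limit is cut at fixed positions
        while len(paragraph) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate  # paragraph still fits in the current chunk
        else:
            chunks.append(current)  # close the current chunk and start a new one
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to `summarize_meeting` in place of the full text, with a final call that summarizes the concatenated chunk summaries. That multiplies the number of API calls, so it is worth applying only to transcripts that exceed the limit.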

	---

	## 🎬 Next Steps

	1. Implement AI summarization (OpenTowns pattern) → Makes data usable
	2. Add keyword alerts (OpenTowns pattern) → Engage advocates
	3. Add batch processing (LocalView pattern) → Scale to 1,000+ jurisdictions
	4. Build search interface (CivicBand pattern) → User discovery
	5. Add quality metrics (LocalView + MeetingBank) → Monitor data health

	---

	## 📖 References

	- OpenTowns: https://opentowns.org
	- LocalView: https://www.localview.net
	- MeetingBank: https://meetingbank.github.io
	- CivicBand: https://civic.band
	- OpenCouncil: https://github.com/schemalabz/opencouncil
	- Council Data Project: https://councildataproject.org (see INTEGRATION_GUIDE.md)

	---

	## 📝 License & Attribution

	All patterns documented here are derived from open-source projects:
	- OpenTowns: Open civic-tech project
	- LocalView: Open-source (Harvard Mellon Urbanism)
	- MeetingBank: Open dataset
	- CivicBand: Open-source (Raft Foundation)
	- OpenCouncil: Open-source (MIT)
	- CDP: MIT License

	When using code patterns, maintain attribution per each project's license.