---
sidebar_position: 1
---
# Databricks Apps Deployment Guide
## Overview
The Oral Health Policy Pulse application has been refactored as a **React + FastAPI Databricks App**, providing:
- **Modern React UI** with TypeScript, Tailwind CSS, and interactive visualizations
- **FastAPI Backend** serving both the API and the static frontend
- **Databricks Apps Deployment** with Unity Catalog integration
- **Enterprise Security** via Databricks secrets and authentication
- **Real-time Analytics** powered by Delta Lake and Model Serving
---
## Architecture
```
┌─────────────────────────────────────────────────────────┐
│                  Databricks Workspace                   │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  ┌───────────────────────────────────────────────────┐  │
│  │         Databricks App (This Application)         │  │
│  │                                                   │  │
│  │  ┌──────────────┐      ┌──────────────────┐       │  │
│  │  │    React     │ ───▶ │ FastAPI Backend  │       │  │
│  │  │   Frontend   │      │                  │       │  │
│  │  │ (TypeScript) │      │ • REST API       │       │  │
│  │  │              │      │ • Static Serving │       │  │
│  │  └──────────────┘      └──────────────────┘       │  │
│  │                                 │                 │  │
│  └─────────────────────────────────┼─────────────────┘  │
│                                    │                    │
│  ┌─────────────────────────────────▼─────────────────┐  │
│  │            Unity Catalog & Delta Lake             │  │
│  │  • raw_documents                                  │  │
│  │  • classified_documents                           │  │
│  │  • advocacy_opportunities                         │  │
│  └───────────────────────────────────────────────────┘  │
│                                                         │
│  ┌───────────────────────────────────────────────────┐  │
│  │              Model Serving Endpoints              │  │
│  │  • policy-classifier-prod                         │  │
│  │  • sentiment-analyzer-prod                        │  │
│  └───────────────────────────────────────────────────┘  │
│                                                         │
└─────────────────────────────────────────────────────────┘
```
---
## Prerequisites
### 1. Databricks Workspace
- Databricks Runtime 14.3 LTS ML or higher
- Unity Catalog enabled
- Model Serving enabled
### 2. Local Development Tools
- Python 3.11+
- Node.js 20+
- Databricks CLI
### 3. API Keys
- OpenAI API key (for LLM-based classification)
- Anthropic API key (optional)
---
## Local Development
### Setup
```bash
# Clone and setup
cd open-navigator
./scripts/setup-local.sh
```
### Run Development Server
**Option 1: Separate Frontend + Backend (Hot Reload)**
```bash
# Terminal 1 - Backend
source venv/bin/activate
uvicorn api.app:app --reload
# Terminal 2 - Frontend
cd frontend
npm run dev
```
Visit: http://localhost:3000
**Option 2: Production Mode (Serves Built Frontend)**
```bash
# Build frontend first
cd frontend
npm run build
cd ..
# Run app
source venv/bin/activate
python scripts/test-app.py
```
Visit: http://localhost:8000
---
## Deploying to Databricks Apps
### Step 1: Configure Environment
```bash
# Set Databricks credentials
export DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
export DATABRICKS_TOKEN=dapi1234567890abcdef
# Set API keys
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
```
### Step 2: Deploy
```bash
# One-command deployment
./scripts/deploy-databricks-app.sh
```
This script will:
1. Build the React frontend (optimized production build)
2. Create Databricks secrets for credentials
3. Deploy the app to Databricks Apps
4. Configure Model Serving endpoints
5. Provide the access URL
### Step 3: Access Your App
Once deployed, access at:
```
https://your-workspace.cloud.databricks.com/apps/open-navigator
```
---
## Configuration
### app.yaml
The main configuration file for Databricks Apps:
```yaml
command:
  - "uvicorn"
  - "api.app:app"
  - "--host"
  - "0.0.0.0"
  - "--port"
  - "8000"
env:
  - name: DATABRICKS_HOST
    valueFrom:
      databricksSecret:
        key: host
        scope: oral-health-app
  - name: OPENAI_API_KEY
    valueFrom:
      databricksSecret:
        key: openai_key
        scope: oral-health-app
resources:
  - name: policy-classifier-endpoint
    modelServing:
      endpoint: policy-classifier-prod
port: 8000
```
### Environment Variables
| Variable | Description | Required |
|----------|-------------|----------|
| `DATABRICKS_HOST` | Workspace URL | Yes |
| `DATABRICKS_TOKEN` | Access token | Yes |
| `DATABRICKS_WAREHOUSE_ID` | SQL warehouse ID | Yes |
| `OPENAI_API_KEY` | OpenAI API key | Yes |
| `ANTHROPIC_API_KEY` | Anthropic API key | No |
| `CATALOG_NAME` | Unity Catalog name | Yes |
| `SCHEMA_NAME` | Schema name | Yes |
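
A pre-flight check can catch a missing variable before a deployment is attempted. The following is a minimal sketch, not part of the project: the helper name `missing_required` is hypothetical, and the variable names and required flags are copied from the table above.

```python
from typing import Mapping

# Variable names and "Required" flags copied from the table above.
ENV_VARS = {
    "DATABRICKS_HOST": True,
    "DATABRICKS_TOKEN": True,
    "DATABRICKS_WAREHOUSE_ID": True,
    "OPENAI_API_KEY": True,
    "ANTHROPIC_API_KEY": False,
    "CATALOG_NAME": True,
    "SCHEMA_NAME": True,
}


def missing_required(env: Mapping[str, str]) -> list[str]:
    """Return the required variable names that are unset or blank in env."""
    return [
        name
        for name, required in ENV_VARS.items()
        if required and not env.get(name, "").strip()
    ]
```

Calling `missing_required(os.environ)` (after `import os`) before running the deploy script returns an empty list when everything required is set.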
---
## Frontend Features
### Dashboard
- Real-time statistics
- Topic distribution charts
- Recent opportunities list
### Interactive Heatmap
- Geographic visualization
- Filterable by state, topic, urgency
- Click markers for details
### Documents Browser
- Full-text search
- Paginated results
- Topic tags
### Opportunities Manager
- Urgency-based filtering
- One-click email generation
- Meeting calendar integration
### Settings Panel
- Configure target states
- Select policy topics
- Email notifications
- Agent status monitoring
---
## API Endpoints
### Core Endpoints
```
GET /api/health - Health check
GET /api/dashboard - Dashboard statistics
GET /api/opportunities - List opportunities (filterable)
GET /api/documents - List documents (searchable)
POST /api/workflow/start - Start analysis workflow
GET /api/workflow/{id}/status - Check workflow status
POST /api/advocacy/email/{id} - Generate advocacy email
GET /api/settings - Get settings
PUT /api/settings - Update settings
GET /api/agents/status - Agent health status
```
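
The endpoints above can be exercised from a short script. The sketch below uses only the Python standard library and rests on several assumptions: the query parameter names passed to `build_url` are hypothetical, the status payload is assumed to look like `{"status": "..."}` with `pending` and `running` as in-progress values, and no authentication headers are sent, so it suits a local server rather than a deployed app.

```python
import json
import time
from typing import Callable, Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base_url: str, path: str, params: Optional[dict] = None) -> str:
    """Join the base URL and an endpoint path, appending any non-empty filters."""
    url = base_url.rstrip("/") + path
    filters = {k: v for k, v in (params or {}).items() if v is not None}
    return url + ("?" + urlencode(filters) if filters else "")


def get_json(url: str) -> dict:
    """GET a URL and decode the JSON body (no auth headers are sent)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def wait_for_workflow(
    base_url: str,
    workflow_id: str,
    fetch: Callable[[str], dict] = get_json,
    interval_s: float = 2.0,
    max_polls: int = 30,
) -> dict:
    """Poll the status endpoint until the workflow leaves an in-progress state."""
    url = build_url(base_url, f"/api/workflow/{workflow_id}/status")
    for _ in range(max_polls):
        status = fetch(url)
        # Assumed payload shape: {"status": "..."}; adjust to the real response.
        if status.get("status") not in ("pending", "running"):
            return status
        time.sleep(interval_s)
    raise TimeoutError(f"workflow {workflow_id} still running after {max_polls} polls")
```

For example, `get_json(build_url("http://localhost:8000", "/api/health"))` calls the health check on a locally running backend. The `fetch` parameter exists so the polling loop can be tested without a live server.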
### API Documentation
Once deployed, access interactive API docs at:
- Swagger UI: `https://your-app-url/api/docs`
- ReDoc: `https://your-app-url/api/redoc`
---
## Monitoring
### View App Logs
```bash
databricks apps logs open-navigator
```
### Check App Status
```bash
databricks apps get open-navigator
```
### Monitor Model Serving
```bash
databricks serving-endpoints get policy-classifier-prod
databricks serving-endpoints get sentiment-analyzer-prod
```
---
## Troubleshooting
### Issue: "Frontend not built"
**Solution:**
```bash
cd frontend
npm install
npm run build
```
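
To confirm that the build succeeded before starting the app in production mode, a small check along these lines may help. It is only a sketch: the helper name is hypothetical, and `frontend/dist` is an assumed output directory, so pass a different `build_dir` if the build tool writes elsewhere.

```python
from pathlib import Path


def frontend_is_built(project_root: Path, build_dir: str = "frontend/dist") -> bool:
    """Return True if the build directory contains an index.html entry point."""
    # "frontend/dist" is an assumption; some toolchains write to "build" instead.
    return (project_root / build_dir / "index.html").is_file()
```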
### Issue: "Databricks CLI not found"
**Solution:**
```bash
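# The `databricks apps` commands require the current Databricks CLI.
# `pip install databricks-cli` installs the legacy CLI, which does not include them.
curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
databricks configure --token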
```
### Issue: "Secrets not accessible"
**Solution:**
```bash
# Recreate the secret scope and its keys (current Databricks CLI syntax)
databricks secrets create-scope oral-health-app
databricks secrets put-secret oral-health-app host --string-value "$DATABRICKS_HOST"
databricks secrets put-secret oral-health-app openai_key --string-value "$OPENAI_API_KEY"
```
### Issue: "App deployment failed"
**Check logs:**
```bash
databricks apps logs open-navigator --follow
```
---
## Cost Optimization
### Databricks Apps Pricing
| Component | Cost Estimate |
|-----------|---------------|
| App hosting | $0.10-0.30/hour |
| Model Serving (Small) | $0.10-0.50/hour |
| Delta Lake storage | $0.023/GB/month |
| SQL Warehouse | Pay per query |
**Total estimated cost:** ~$50-150/month for moderate usage
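
The table does not say how many hours per month the hourly components run, and the monthly total depends heavily on that figure. The sketch below scales the two hourly ranges from the table to a monthly range. Both schedules are hypothetical, and storage and SQL warehouse charges are left out.

```python
# Hourly price ranges (USD) copied from the table above.
HOURLY_RATES = {
    "app_hosting": (0.10, 0.30),
    "model_serving_small": (0.10, 0.50),
}


def monthly_range(hours: float) -> tuple[float, float]:
    """Return the (low, high) monthly compute cost for the given running hours."""
    low = sum(rate[0] for rate in HOURLY_RATES.values()) * hours
    high = sum(rate[1] for rate in HOURLY_RATES.values()) * hours
    return round(low, 2), round(high, 2)


# Hypothetical schedules, chosen only to show how the total scales.
business_hours = monthly_range(8 * 22)  # 176 hours per month
always_on = monthly_range(24 * 30)      # 720 hours per month
```

Under these assumptions the compute range is about $35 to $141 per month for the business-hours schedule and $144 to $576 per month when both components run continuously, so the estimate above fits part-time usage better than an always-on deployment.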
### Cost Savings Tips
1. **Scale-to-zero**: Model Serving endpoints with scale-to-zero enabled release their compute when idle
2. **Batch processing**: Process documents in batches rather than real-time
3. **Hybrid classification**: Use keyword matching before LLM calls (saves ~70% LLM costs)
4. **Delta Lake optimization**: Enable auto-compaction and Z-ordering
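
The hybrid classification tip can be sketched as a keyword pre-filter in front of the LLM call. This is one reading of the tip, not the project's actual implementation: the keyword list and function names are hypothetical, and `llm_classify` stands in for whatever function calls the model.

```python
from typing import Callable, Optional

# Hypothetical keyword list; the application's real terms may differ.
ORAL_HEALTH_KEYWORDS = ("dental", "oral health", "fluorid")


def mentions_oral_health(text: str) -> bool:
    """Cheap pre-filter: True if any keyword appears in the text."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in ORAL_HEALTH_KEYWORDS)


def classify(text: str, llm_classify: Callable[[str], str]) -> Optional[str]:
    """Call the LLM only for documents that pass the keyword pre-filter."""
    if not mentions_oral_health(text):
        return None  # skipped, so no LLM cost is incurred
    return llm_classify(text)
```

The saving comes from the share of documents that never reach the model, so it depends on how many ingested documents are unrelated to oral health.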
---
## Security
### Authentication
Databricks Apps automatically integrate with:
- Workspace SSO
- OAuth 2.0
- SCIM user provisioning
### Data Access
All data access is governed by:
- Unity Catalog permissions
- Row-level security
- Column-level masking
### Secrets Management
Sensitive credentials are handled as follows:
- Stored in Databricks Secrets
- Never written to code or logs
- Automatic rotation supported
---
## Next Steps
1. **Deploy Model Serving Endpoints**
   ```bash
   python -m databricks.deployment
   ```
2. **Initialize Delta Lake Tables**
   ```sql
   -- Run in Databricks SQL
   CREATE SCHEMA IF NOT EXISTS main.agents;
   ```
3. **Start Data Ingestion**
   - Configure target municipalities
   - Run initial scraping workflow
   - Monitor agent status
4. **Customize UI**
   - Edit frontend components in `frontend/src/`
   - Rebuild: `npm run build`
   - Redeploy: `./scripts/deploy-databricks-app.sh`
---
## Support
- **Documentation**: See README.md and DATABRICKS_MIGRATION.md
- **Issues**: Report via GitHub Issues
- **Community**: Join discussions
---
## License
MIT License - See LICENSE file for details