open-navigator / website /docs /deployment /databricks-apps.md
jcbowyer's picture
Clean HuggingFace deployment without binary files
61d29fc
---
sidebar_position: 1
---
# Databricks Apps Deployment Guide
## Overview
The Oral Health Policy Pulse application has been refactored as a **React + FastAPI Databricks App**, providing:
- 🎨 **Modern React UI** with TypeScript, Tailwind CSS, and interactive visualizations
- ⚑ **FastAPI Backend** serving both API and static frontend
- ☁️ **Databricks Apps Deployment** with Unity Catalog integration
- πŸ”’ **Enterprise Security** via Databricks secrets and authentication
- πŸ“Š **Real-time Analytics** powered by Delta Lake and Model Serving
---
## Architecture
```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Databricks Workspace β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ Databricks App (This Application) β”‚ β”‚
β”‚ β”‚ β”‚ β”‚
β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚
β”‚ β”‚ β”‚ React β”‚ ───▢ β”‚ FastAPI Backend β”‚ β”‚ β”‚
β”‚ β”‚ β”‚ Frontend β”‚ β”‚ β”‚ β”‚ β”‚
β”‚ β”‚ β”‚ (TypeScript) β”‚ β”‚ β€’ REST API β”‚ β”‚ β”‚
β”‚ β”‚ β”‚ β”‚ β”‚ β€’ Static Serving β”‚ β”‚ β”‚
β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚
β”‚ β”‚ β”‚ β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚ β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ Unity Catalog & Delta Lake β”‚ β”‚
β”‚ β”‚ β€’ raw_documents β”‚ β”‚
β”‚ β”‚ β€’ classified_documents β”‚ β”‚
β”‚ β”‚ β€’ advocacy_opportunities β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ Model Serving Endpoints β”‚ β”‚
β”‚ β”‚ β€’ policy-classifier-prod β”‚ β”‚
β”‚ β”‚ β€’ sentiment-analyzer-prod β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```
---
## Prerequisites
### 1. Databricks Workspace
- Databricks Runtime 14.3 LTS ML or higher
- Unity Catalog enabled
- Model Serving enabled
### 2. Local Development Tools
- Python 3.11+
- Node.js 20+
- Databricks CLI
### 3. API Keys
- OpenAI API key (for LLM-based classification)
- Anthropic API key (optional)
---
## Local Development
### Setup
```bash
# Clone and setup
cd open-navigator
./scripts/setup-local.sh
```
### Run Development Server
**Option 1: Separate Frontend + Backend (Hot Reload)**
```bash
# Terminal 1 - Backend
source venv/bin/activate
uvicorn api.app:app --reload
# Terminal 2 - Frontend
cd frontend
npm run dev
```
Visit: http://localhost:3000
**Option 2: Production Mode (Serves Built Frontend)**
```bash
# Build frontend first
cd frontend
npm run build
cd ..
# Run app
source venv/bin/activate
python scripts/test-app.py
```
Visit: http://localhost:8000
---
## Deploying to Databricks Apps
### Step 1: Configure Environment
```bash
# Set Databricks credentials
export DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
export DATABRICKS_TOKEN=dapi1234567890abcdef
# Set API keys
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
```
### Step 2: Deploy
```bash
# One-command deployment
./scripts/deploy-databricks-app.sh
```
This script will:
1. βœ… Build React frontend (optimized production build)
2. βœ… Create Databricks secrets for credentials
3. βœ… Deploy app to Databricks Apps
4. βœ… Configure Model Serving endpoints
5. βœ… Provide access URL
### Step 3: Access Your App
Once deployed, access at:
```
https://your-workspace.cloud.databricks.com/apps/open-navigator
```
---
## Configuration
### app.yaml
The main configuration file for Databricks Apps:
```yaml
command:
- "uvicorn"
- "api.app:app"
- "--host"
- "0.0.0.0"
- "--port"
- "8000"
env:
- name: DATABRICKS_HOST
valueFrom:
databricksSecret:
key: host
scope: oral-health-app
- name: OPENAI_API_KEY
valueFrom:
databricksSecret:
key: openai_key
scope: oral-health-app
resources:
- name: policy-classifier-endpoint
modelServing:
endpoint: policy-classifier-prod
port: 8000
```
### Environment Variables
| Variable | Description | Required |
|----------|-------------|----------|
| `DATABRICKS_HOST` | Workspace URL | Yes |
| `DATABRICKS_TOKEN` | Access token | Yes |
| `DATABRICKS_WAREHOUSE_ID` | SQL warehouse ID | Yes |
| `OPENAI_API_KEY` | OpenAI API key | Yes |
| `ANTHROPIC_API_KEY` | Anthropic API key | No |
| `CATALOG_NAME` | Unity Catalog name | Yes |
| `SCHEMA_NAME` | Schema name | Yes |
---
## Frontend Features
### Dashboard
- πŸ“Š Real-time statistics
- πŸ“ˆ Topic distribution charts
- πŸ“‹ Recent opportunities list
### Interactive Heatmap
- πŸ—ΊοΈ Geographic visualization
- 🎯 Filterable by state, topic, urgency
- πŸ’‘ Click markers for details
### Documents Browser
- πŸ” Full-text search
- πŸ“„ Paginated results
- 🏷️ Topic tags
### Opportunities Manager
- 🚨 Urgency-based filtering
- βœ‰οΈ One-click email generation
- πŸ“… Meeting calendar integration
### Settings Panel
- βš™οΈ Configure target states
- πŸ“ Select policy topics
- πŸ”” Email notifications
- πŸ“Š Agent status monitoring
---
## API Endpoints
### Core Endpoints
```
GET /api/health - Health check
GET /api/dashboard - Dashboard statistics
GET /api/opportunities - List opportunities (filterable)
GET /api/documents - List documents (searchable)
POST /api/workflow/start - Start analysis workflow
GET /api/workflow/{id}/status - Check workflow status
POST /api/advocacy/email/{id} - Generate advocacy email
GET /api/settings - Get settings
PUT /api/settings - Update settings
GET /api/agents/status - Agent health status
```
### API Documentation
Once deployed, access interactive API docs at:
- Swagger UI: `https://your-app-url/api/docs`
- ReDoc: `https://your-app-url/api/redoc`
---
## Monitoring
### View App Logs
```bash
databricks apps logs open-navigator
```
### Check App Status
```bash
databricks apps get open-navigator
```
### Monitor Model Serving
```bash
databricks serving-endpoints get policy-classifier-prod
databricks serving-endpoints get sentiment-analyzer-prod
```
---
## Troubleshooting
### Issue: "Frontend not built"
**Solution:**
```bash
cd frontend
npm install
npm run build
```
### Issue: "Databricks CLI not found"
**Solution:**
```bash
pip install databricks-cli
databricks configure --token
```
### Issue: "Secrets not accessible"
**Solution:**
```bash
# Recreate secrets scope
databricks secrets create-scope --scope oral-health-app
databricks secrets put --scope oral-health-app --key host --string-value "$DATABRICKS_HOST"
databricks secrets put --scope oral-health-app --key openai_key --string-value "$OPENAI_API_KEY"
```
### Issue: "App deployment failed"
**Check logs:**
```bash
databricks apps logs open-navigator --follow
```
---
## Cost Optimization
### Databricks Apps Pricing
| Component | Cost Estimate |
|-----------|---------------|
| App hosting | $0.10-0.30/hour |
| Model Serving (Small) | $0.10-0.50/hour |
| Delta Lake storage | $0.023/GB/month |
| SQL Warehouse | Pay per query |
**Total estimated cost:** ~$50-150/month for moderate usage
### Cost Savings Tips
1. **Scale-to-zero**: Model serving endpoints automatically scale down when idle
2. **Batch processing**: Process documents in batches rather than real-time
3. **Hybrid classification**: Use keyword matching before LLM calls (saves ~70% LLM costs)
4. **Delta Lake optimization**: Enable auto-compaction and Z-ordering
---
## Security
### Authentication
Databricks Apps automatically integrate with:
- βœ… Workspace SSO
- βœ… OAuth 2.0
- βœ… SCIM user provisioning
### Data Access
All data access is governed by:
- βœ… Unity Catalog permissions
- βœ… Row-level security
- βœ… Column-level masking
### Secrets Management
Sensitive credentials stored in:
- βœ… Databricks Secrets
- βœ… Never in code or logs
- βœ… Automatic rotation support
---
## Next Steps
1. **Deploy Model Serving Endpoints**
```bash
python -m databricks.deployment
```
2. **Initialize Delta Lake Tables**
```sql
-- Run in Databricks SQL
CREATE SCHEMA IF NOT EXISTS main.agents;
```
3. **Start Data Ingestion**
- Configure target municipalities
- Run initial scraping workflow
- Monitor agent status
4. **Customize UI**
- Edit frontend components in `frontend/src/`
- Rebuild: `npm run build`
- Redeploy: `./scripts/deploy-databricks-app.sh`
---
## Support
- πŸ“– **Documentation**: See README.md and DATABRICKS_MIGRATION.md
- πŸ› **Issues**: Report via GitHub Issues
- πŸ’¬ **Community**: Join discussions
---
## License
MIT License - See LICENSE file for details