open-navigator / website /docs /deployment /databricks-migration.md

jcbowyer

Clean HuggingFace deployment without binary files

61d29fc 28 days ago


	---
	sidebar_position: 2
	---

	# Databricks Agent Bricks Refactoring - Summary

	## What Was Done

	This system has been refactored to support Databricks Agent Bricks (Mosaic AI Agent Framework), enabling production-ready deployment on Databricks with full governance, monitoring, and scalability.

	## New Files Created

	### 1. Core Agent Infrastructure
	- `agents/mlflow_base.py` - MLflow Pyfunc base classes for agents
	  - `MLflowAgentBase`: Base class with tracing, Model Serving compatibility
	  - `MLflowChainAgent`: LangChain integration with automatic logging
	  - Automatic signature inference
	  - Unity Catalog registration methods
	  - Model Serving deployment helpers

	- `agents/mlflow_classifier.py` - Production classifier agent
	  - Hybrid keyword + LLM classification
	  - MLflow tracing for all calls
	  - Unity Catalog ready
	  - Can be deployed to Model Serving
	  - Includes registration script
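
The hybrid keyword + LLM approach listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `agents/mlflow_classifier.py`: the keyword list, the `llm_classify` callable, and the `Classification` result shape are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical keyword list; the real classifier's vocabulary is not shown in this document.
ORAL_HEALTH_KEYWORDS = ("fluoride", "dental", "oral health")


@dataclass
class Classification:
    relevant: bool
    used_llm: bool


def classify(text: str, llm_classify: Callable[[str], bool]) -> Classification:
    """Run the cheap keyword filter first; call the LLM only on keyword hits."""
    lowered = text.lower()
    if not any(keyword in lowered for keyword in ORAL_HEALTH_KEYWORDS):
        # No keyword hit: skip the LLM call entirely and save its cost.
        return Classification(relevant=False, used_llm=False)
    # Keyword hit: let the (stubbed) LLM make the final decision.
    return Classification(relevant=llm_classify(text), used_llm=True)


def fake_llm(text: str) -> bool:
    """Stand-in for a real LLM call, for demonstration only."""
    return "discussion" in text.lower()


hit = classify("Fluoride discussion at the council meeting", fake_llm)
miss = classify("Road repair budget", fake_llm)
```

Documents without any keyword never reach the LLM, which is where the cost saving comes from; the trade-off is that a too-narrow keyword list can silently drop relevant documents.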

	### 2. Deployment & Operations
	- `databricks/deployment.py` - Deployment automation
	  - `AgentDeploymentManager` class
	  - Register agents to Unity Catalog
	  - Deploy to Model Serving endpoints
	  - Multi-agent endpoints with traffic splitting
	  - Endpoint testing and monitoring
	  - Auto-scaling configuration
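
To make the traffic-splitting idea concrete, here is a sketch of an endpoint configuration that routes requests across two model versions. `build_ab_config` is a hypothetical helper, not part of `AgentDeploymentManager`. The field names are modeled on the Databricks serving-endpoints REST API and should be checked against the current API reference before use.

```python
def build_ab_config(model_name: str, split: dict[str, int]) -> dict:
    """Build an endpoint config routing traffic across model versions.

    `split` maps a model version (as a string) to its traffic percentage.
    """
    if sum(split.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    entities, routes = [], []
    for version, percentage in split.items():
        # Assumed naming convention: "<model>-<version>" for each served entity.
        served_name = f"{model_name.split('.')[-1]}-{version}"
        entities.append({
            "name": served_name,
            "entity_name": model_name,
            "entity_version": version,
            "workload_size": "Small",
            "scale_to_zero_enabled": True,
        })
        routes.append({"served_model_name": served_name, "traffic_percentage": percentage})
    return {"served_entities": entities, "traffic_config": {"routes": routes}}


# Send 90% of traffic to version 1 and 10% to the candidate version 2.
config = build_ab_config("main.agents.policy_classifier", {"1": 90, "2": 10})
```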

	- `databricks/evaluation.py` - Quality assurance
	  - `AgentEvaluator` class
	  - Automated evaluation pipelines
	  - A/B testing between versions
	  - Metrics: accuracy, precision, recall, F1, latency
	  - Confusion matrix generation
	  - Feedback loop integration with Delta Lake
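
For reference, the classification metrics named above can be computed from predictions and ground truth as in the sketch below. It is a plain-Python illustration for the binary case, not the implementation in `databricks/evaluation.py`; `binary_metrics` and its return shape are assumptions.

```python
from collections import Counter


def binary_metrics(y_true: list[bool], y_pred: list[bool]) -> dict:
    """Compute accuracy, precision, recall, F1, and a 2x2 confusion matrix."""
    counts = Counter(zip(y_true, y_pred))
    tp, fp = counts[(True, True)], counts[(False, True)]
    fn, tn = counts[(True, False)], counts[(False, False)]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true) if y_true else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Rows are actual (negative, positive); columns are predicted (negative, positive).
        "confusion_matrix": [[tn, fp], [fn, tp]],
    }


labels = [True, True, False, False, True]
predictions = [True, False, False, True, True]
metrics = binary_metrics(labels, predictions)
```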

	### 3. Interactive Development
	- `databricks/notebooks/01_agent_bricks_quickstart.py` - Databricks notebook
	  - Step-by-step deployment guide
	  - Local testing examples
	  - Unity Catalog registration
	  - Model Serving deployment
	  - Evaluation examples
	  - Delta Lake queries
	  - Monitoring and observability

	- `databricks/README.md` - Comprehensive documentation
	  - Architecture diagrams
	  - Deployment workflows
	  - API usage examples
	  - Cost considerations
	  - Troubleshooting guide
	  - Best practices

	### 4. Dependencies
	- Updated `requirements-cpu.txt` with:
	  - `mlflow>=2.10.0` - MLflow tracking and serving
	  - `databricks-agents>=0.1.0` - Agent Framework
	  - `databricks-vectorsearch>=0.22.0` - Vector search
	  - `langgraph>=0.0.20` - Stateful agent graphs
	  - `databricks-sdk>=0.18.0` - Databricks API client

	### 5. Updated Existing Files
	- `README.md` - Added Databricks Agent Bricks section
	- `install.sh` - Detects and uses `requirements-cpu.txt`

	## Key Features Added

	### 1. MLflow Integration
	- ✅ Automatic tracing of all agent calls
	- ✅ LLM request/response logging
	- ✅ Metrics tracking (latency, tokens, cost)
	- ✅ Experiment tracking and versioning
	- ✅ Model registry integration
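
To illustrate what per-call tracing captures, here is a standard-library stand-in. It is not MLflow's tracing API: `traced` and `TRACE_LOG` are hypothetical names, and the sketch records only the function name, latency, and status of each call in memory.

```python
import functools
import time

# In-memory stand-in for a trace store; a real deployment would persist traces elsewhere.
TRACE_LOG: list[dict] = []


def traced(func):
    """Record the name, latency, and success status of every call to `func`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "error"
        try:
            result = func(*args, **kwargs)
            status = "ok"
            return result
        finally:
            # Runs on success and on exceptions, so failed calls are traced too.
            TRACE_LOG.append({
                "name": func.__name__,
                "latency_s": time.perf_counter() - start,
                "status": status,
            })
    return wrapper


@traced
def classify_document(text: str) -> bool:
    return "fluoride" in text.lower()


classify_document("Fluoride discussion...")
```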

	### 2. Unity Catalog Governance
	- ✅ Centralized model registration
	- ✅ Permissions and access control
	- ✅ Data lineage tracking
	- ✅ Version management
	- ✅ Tag-based organization

	### 3. Model Serving
	- ✅ REST API endpoints
	- ✅ Auto-scaling (scale-to-zero capable)
	- ✅ A/B testing with traffic splitting
	- ✅ Multi-agent pipelines
	- ✅ Monitoring and alerting

	### 4. Evaluation Framework
	- ✅ Automated quality metrics
	- ✅ Regression detection
	- ✅ Version comparison
	- ✅ Confusion matrices
	- ✅ Feedback loop from production

	### 5. Production Ready
	- ✅ CPU-only compatibility (no GPU needed)
	- ✅ Enterprise monitoring
	- ✅ Cost optimization (keyword filtering before LLM)
	- ✅ Error handling and retries
	- ✅ Comprehensive logging
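
The retry behavior mentioned above is not documented in detail here. One common pattern is retrying with exponential backoff, sketched below under that assumption; `call_with_retries` and `flaky_llm_call` are hypothetical names used only for illustration.

```python
import time


def call_with_retries(func, attempts: int = 3, base_delay: float = 0.5, sleep=time.sleep):
    """Call `func`, retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)


calls = {"count": 0}


def flaky_llm_call() -> str:
    """Hypothetical call that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "relevant"


# Inject a fake sleep so the example records the delays instead of waiting.
delays: list[float] = []
result = call_with_retries(flaky_llm_call, sleep=delays.append)
```

In production code it is usually better to retry only on errors known to be transient (timeouts, rate limits) rather than on every exception.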

	## Deployment Architecture

	```
	┌──────────────────────────────────────────────────────────┐
	│                   Databricks Workspace                   │
	│                                                          │
	│  ┌──────────────────┐      ┌────────────────────┐        │
	│  │ Unity Catalog    │◄─────┤ MLflow Tracking    │        │
	│  │ - Policy Class.  │      │ - Experiments      │        │
	│  │ - Sentiment An.  │      │ - Traces           │        │
	│  │ - Advocacy Gen.  │      │ - Metrics          │        │
	│  └────────┬─────────┘      └────────────────────┘        │
	│           │                                              │
	│           ▼                                              │
	│  ┌──────────────────────────────────────────┐            │
	│  │         Model Serving Endpoints          │            │
	│  │ ┌────────────┐  ┌─────────────────────┐  │            │
	│  │ │ Classifier │  │ Sentiment Analyzer  │  │            │
	│  │ │ (Small)    │  │ (Small)             │  │            │
	│  │ │ Scale-to-0 │  │ Scale-to-0          │  │            │
	│  │ └────────────┘  └─────────────────────┘  │            │
	│  └─────────────┬────────────────────────────┘            │
	│                │                                         │
	└────────────────┼─────────────────────────────────────────┘
	                 │
	                 ▼
	         ┌────────────────┐
	         │  External API  │
	         │  FastAPI App   │
	         └────────────────┘
	```

	## Usage Examples

	### Register Agent to Unity Catalog
	```python
	from agents.mlflow_classifier import PolicyClassifierAgent
	from databricks.deployment import AgentDeploymentManager

	manager = AgentDeploymentManager()

	version = manager.register_agent(
	    agent_class=PolicyClassifierAgent,
	    agent_name="policy_classifier",
	    description="Classifies documents for oral health topics",
	    tags={"team": "advocacy"},
	)
	```

	### Deploy to Model Serving
	```python
	# Reuses the `manager` created in the registration example above
	endpoint_url = manager.deploy_agent(
	    agent_name="policy_classifier",
	    endpoint_name="policy-classifier-prod",
	    workload_size="Small",
	    scale_to_zero=True,
	)
	```

	### Evaluate Agent
	```python
	from databricks.evaluation import AgentEvaluator

	evaluator = AgentEvaluator("policy_classifier")

	# `test_docs` and `labels` are your held-out documents and their expected labels
	metrics = evaluator.evaluate_classifier(
	    model_uri="models:/main.agents.policy_classifier/1",
	    test_documents=test_docs,
	    ground_truth=labels,
	)

	print(f"Accuracy: {metrics.accuracy:.2%}")
	```
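
The `model_uri` above combines MLflow's `models:/<name>/<version>` scheme with a three-level Unity Catalog name (`catalog.schema.model`). A small helper can build and sanity-check such URIs; `uc_model_uri` is hypothetical and not part of this project.

```python
def uc_model_uri(catalog: str, schema: str, model: str, version: int) -> str:
    """Build an MLflow model URI for a Unity Catalog registered model version."""
    for part in (catalog, schema, model):
        # Each level must be a single non-empty name without separators.
        if not part or "." in part or "/" in part:
            raise ValueError(f"invalid name component: {part!r}")
    if version < 1:
        raise ValueError("model versions start at 1")
    return f"models:/{catalog}.{schema}.{model}/{version}"


uri = uc_model_uri("main", "agents", "policy_classifier", 1)
```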

	### Invoke via API
	```bash
	curl -X POST https://workspace.cloud.databricks.com/serving-endpoints/policy-classifier-prod/invocations \
	  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
	  -H "Content-Type: application/json" \
	  -d '{
	    "dataframe_records": [{
	      "document_id": "doc_001",
	      "title": "Meeting",
	      "content": "Fluoride discussion..."
	    }]
	  }'
	```
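
The same request can be prepared from Python using only the standard library. The sketch below builds the request without sending it; `build_invocation_request` is a hypothetical helper, and `WORKSPACE_URL` is the placeholder host from the curl example, not a real workspace.

```python
import json
import os
import urllib.request

# Placeholder host from the curl example; replace with your workspace URL.
WORKSPACE_URL = "https://workspace.cloud.databricks.com"


def build_invocation_request(endpoint_name: str, records: list[dict], token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Model Serving endpoint."""
    body = json.dumps({"dataframe_records": records}).encode("utf-8")
    return urllib.request.Request(
        url=f"{WORKSPACE_URL}/serving-endpoints/{endpoint_name}/invocations",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_invocation_request(
    "policy-classifier-prod",
    [{"document_id": "doc_001", "title": "Meeting", "content": "Fluoride discussion..."}],
    token=os.environ.get("DATABRICKS_TOKEN", "<token>"),
)
# To send it: pass `request` to urllib.request.urlopen() and parse the JSON response.
```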

	## Benefits

	### Before (Custom Implementation)
	- ❌ Manual deployment and versioning
	- ❌ No built-in observability
	- ❌ Limited scalability
	- ❌ No governance or lineage
	- ❌ Manual evaluation pipelines
	- ❌ Complex monitoring setup

	### After (Databricks Agent Bricks)
	- ✅ One-command deployment
	- ✅ Automatic tracing and logging
	- ✅ Auto-scaling Model Serving
	- ✅ Unity Catalog governance
	- ✅ Built-in evaluation framework
	- ✅ Enterprise monitoring included

	## Cost Optimization

	The refactored system includes several cost optimizations:

	1. Hybrid Classification: Uses keyword matching before expensive LLM calls
	2. Scale-to-Zero: Endpoints scale down when idle
	3. Batch Processing: Supports bulk document classification
	4. Caching: Frequently requested results can be cached
	5. Small Workloads: Starts with small endpoints, scales on demand

	Estimated cost: roughly $0.10-0.50 per hour for each active endpoint, and much less when scale-to-zero keeps idle endpoints shut down.
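
As a rough worked example of that estimate, the sketch below converts the hourly range into a monthly range. It assumes a 30-day month and that an endpoint scaled to zero incurs no compute charge while idle; both are simplifications, and `monthly_cost_range` is a hypothetical helper.

```python
def monthly_cost_range(
    active_hours_per_day: float,
    low_rate: float = 0.10,
    high_rate: float = 0.50,
) -> tuple[float, float]:
    """Return the (low, high) monthly cost in dollars for one endpoint."""
    active_hours = active_hours_per_day * 30  # Assumes a 30-day month.
    return (active_hours * low_rate, active_hours * high_rate)


always_on = monthly_cost_range(24)      # Endpoint never scales down.
business_hours = monthly_cost_range(8)  # Scale-to-zero outside 8 busy hours per day.
```

Under these assumptions an always-on endpoint costs about $72-360 per month, while one that is active 8 hours a day costs about $24-120.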

	## Next Steps

	1. Deploy to Databricks:
	   ```bash
	   python -m databricks.deployment
	   ```

	2. Run Evaluation:
	   ```bash
	   python -m databricks.evaluation
	   ```

	3. Test in Notebook: Open `databricks/notebooks/01_agent_bricks_quickstart.py`

	4. Monitor Production: Set up alerts in Databricks UI

	5. Add Feedback Loop: Collect corrections and retrain

	## Migration Path

	For existing users:

	1. ✅ Standalone mode still works - No breaking changes to existing code
	2. 🔄 Gradual migration - Can use both modes simultaneously
	3. ☁️ Databricks optional - Only needed for production scale
	4. 🎯 Choose your path:
	   - Small projects: Use standalone mode
	   - Production/Enterprise: Use Databricks Agent Bricks
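
This document does not say how a deployment chooses between the two modes. One possible approach, shown purely as an assumption, is to switch on whether Databricks credentials are configured. `select_mode` is a hypothetical function; `DATABRICKS_HOST` and `DATABRICKS_TOKEN` are the environment variables the Databricks SDK reads for authentication.

```python
import os
from typing import Mapping


def select_mode(env: Mapping[str, str] = os.environ) -> str:
    """Return "databricks" when workspace credentials are configured, else "standalone"."""
    has_credentials = bool(env.get("DATABRICKS_HOST")) and bool(env.get("DATABRICKS_TOKEN"))
    return "databricks" if has_credentials else "standalone"


# Passing explicit mappings keeps the example independent of the real environment.
local_mode = select_mode({})
cloud_mode = select_mode({
    "DATABRICKS_HOST": "https://workspace.cloud.databricks.com",
    "DATABRICKS_TOKEN": "<token>",
})
```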

	## Questions?

	- See [`databricks/README.md`](databricks/README.md) for detailed docs
	- Run `databricks/notebooks/01_agent_bricks_quickstart.py` for hands-on tutorial
	- Check examples in `databricks/deployment.py` and `databricks/evaluation.py`

	## Summary

	This refactoring transforms the Oral Health Policy Pulse from a standalone multi-agent system into a production-ready, enterprise-grade application that leverages Databricks' full stack for AI governance, deployment, and monitoring. The system now has:

	- 🏢 Enterprise deployment via Model Serving
	- 📊 Automatic observability with MLflow tracing
	- 🔐 Data governance through Unity Catalog
	- 📈 Quality assurance with evaluation framework
	- 💰 Cost optimization with scale-to-zero and hybrid approach
	- 🚀 Production readiness out of the box

	All while maintaining backward compatibility with the standalone mode! 🎉