Databricks Agent Bricks Refactoring - Summary
What Was Done
This system has been refactored to support Databricks Agent Bricks (Mosaic AI Agent Framework), enabling production-ready deployment on Databricks with full governance, monitoring, and scalability.
New Files Created
1. Core Agent Infrastructure
agents/mlflow_base.py - MLflow Pyfunc base classes for agents
- MLflowAgentBase: Base class with tracing and Model Serving compatibility
- MLflowChainAgent: LangChain integration with automatic logging
- Automatic signature inference
- Unity Catalog registration methods
- Model Serving deployment helpers
agents/mlflow_classifier.py - Production classifier agent
- Hybrid keyword + LLM classification
- MLflow tracing for all calls
- Unity Catalog ready
- Can be deployed to Model Serving
- Includes registration script
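The hybrid approach can be illustrated with a minimal sketch: a cheap keyword pre-filter runs first, and the LLM is consulted only for documents that match at least one keyword. The keyword list, the `classify` function, and the `fake_llm` stand-in are all hypothetical and are not taken from agents/mlflow_classifier.py.

```python
from typing import Callable, Optional

# Hypothetical keyword list; the real one lives in agents/mlflow_classifier.py.
ORAL_HEALTH_KEYWORDS = ("fluoride", "dental", "oral health")


def classify(text: str, llm: Optional[Callable[[str], bool]] = None) -> dict:
    """Run the keyword pre-filter, then call the LLM only for keyword hits."""
    lowered = text.lower()
    hits = [kw for kw in ORAL_HEALTH_KEYWORDS if kw in lowered]
    if not hits:
        # No keyword matched: skip the expensive LLM call entirely.
        return {"relevant": False, "stage": "keyword", "keywords": []}
    if llm is None:
        # No LLM configured: fall back to the keyword verdict.
        return {"relevant": True, "stage": "keyword", "keywords": hits}
    return {"relevant": bool(llm(text)), "stage": "llm", "keywords": hits}


def fake_llm(text: str) -> bool:
    """Stand-in for a real LLM call, used only to keep the sketch runnable."""
    return "fluoride" in text.lower()
```

Documents without any keyword never reach the LLM, which is where the cost saving of this design would come from.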
2. Deployment & Operations
databricks/deployment.py - Deployment automation
- AgentDeploymentManager class
- Register agents to Unity Catalog
- Deploy to Model Serving endpoints
- Multi-agent endpoints with traffic splitting
- Endpoint testing and monitoring
- Auto-scaling configuration
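As a rough illustration of traffic splitting, the sketch below builds an endpoint configuration that routes a percentage of requests to each registered version. The helper `build_endpoint_config` is hypothetical and is not the code in databricks/deployment.py. The field names (`served_entities`, `traffic_config`, `scale_to_zero_enabled`) reflect my understanding of the Databricks serving endpoints REST API and should be checked against the current documentation.

```python
from typing import Dict


def build_endpoint_config(entity_name: str,
                          versions_to_traffic: Dict[str, int],
                          workload_size: str = "Small",
                          scale_to_zero: bool = True) -> dict:
    """Build a serving config that splits traffic across model versions."""
    if sum(versions_to_traffic.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    short_name = entity_name.split(".")[-1]
    served, routes = [], []
    for version, pct in versions_to_traffic.items():
        served_name = f"{short_name}-{version}"
        served.append({
            "name": served_name,
            "entity_name": entity_name,        # Unity Catalog model name
            "entity_version": version,
            "workload_size": workload_size,
            "scale_to_zero_enabled": scale_to_zero,
        })
        # Each route sends pct percent of requests to one served version.
        routes.append({"served_model_name": served_name,
                       "traffic_percentage": pct})
    return {"served_entities": served, "traffic_config": {"routes": routes}}
```

For example, `build_endpoint_config("main.agents.policy_classifier", {"1": 90, "2": 10})` would describe a 90/10 A/B split between versions 1 and 2.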
databricks/evaluation.py - Quality assurance
- AgentEvaluator class
- Automated evaluation pipelines
- A/B testing between versions
- Metrics: accuracy, precision, recall, F1, latency
- Confusion matrix generation
- Feedback loop integration with Delta Lake
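For reference, the classification metrics listed above can be computed from predictions and ground truth as in the following sketch. It uses only the standard library and hypothetical function names. It is not the AgentEvaluator implementation, and it covers the binary case only.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def confusion_matrix(y_true: Sequence, y_pred: Sequence) -> Dict[Tuple, int]:
    """Count (actual, predicted) label pairs."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return dict(Counter(zip(y_true, y_pred)))


def binary_metrics(y_true: Sequence, y_pred: Sequence,
                   positive=True) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for one positive label."""
    cm = confusion_matrix(y_true, y_pred)
    tp = sum(n for (t, p), n in cm.items() if t == positive and p == positive)
    fp = sum(n for (t, p), n in cm.items() if t != positive and p == positive)
    fn = sum(n for (t, p), n in cm.items() if t == positive and p != positive)
    correct = sum(n for (t, p), n in cm.items() if t == p)
    total = len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": correct / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Running the same function on the outputs of two model versions over one test set gives a simple basis for version comparison.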
3. Interactive Development
databricks/notebooks/01_agent_bricks_quickstart.py - Databricks notebook
- Step-by-step deployment guide
- Local testing examples
- Unity Catalog registration
- Model Serving deployment
- Evaluation examples
- Delta Lake queries
- Monitoring and observability
databricks/README.md - Comprehensive documentation
- Architecture diagrams
- Deployment workflows
- API usage examples
- Cost considerations
- Troubleshooting guide
- Best practices
4. Dependencies
- Updated requirements-cpu.txt with:
  - mlflow>=2.10.0 - MLflow tracking and serving
  - databricks-agents>=0.1.0 - Agent Framework
  - databricks-vectorsearch>=0.22.0 - Vector search
  - langgraph>=0.0.20 - Stateful agent graphs
  - databricks-sdk>=0.18.0 - Databricks API client
5. Updated Existing Files
- README.md - Added Databricks Agent Bricks section
- install.sh - Detects and uses requirements-cpu.txt
Key Features Added
1. MLflow Integration
- ✅ Automatic tracing of all agent calls
- ✅ LLM request/response logging
- ✅ Metrics tracking (latency, tokens, cost)
- ✅ Experiment tracking and versioning
- ✅ Model registry integration
2. Unity Catalog Governance
- ✅ Centralized model registration
- ✅ Permissions and access control
- ✅ Data lineage tracking
- ✅ Version management
- ✅ Tag-based organization
3. Model Serving
- ✅ REST API endpoints
- ✅ Auto-scaling (scale-to-zero capable)
- ✅ A/B testing with traffic splitting
- ✅ Multi-agent pipelines
- ✅ Monitoring and alerting
4. Evaluation Framework
- ✅ Automated quality metrics
- ✅ Regression detection
- ✅ Version comparison
- ✅ Confusion matrices
- ✅ Feedback loop from production
5. Production Ready
- ✅ CPU-only compatibility (no GPU needed)
- ✅ Enterprise monitoring
- ✅ Cost optimization (keyword filtering before LLM)
- ✅ Error handling and retries
- ✅ Comprehensive logging
Deployment Architecture
┌──────────────────────────────────────────────────────────┐
│                   Databricks Workspace                   │
│                                                          │
│  ┌──────────────────┐      ┌────────────────────┐        │
│  │  Unity Catalog   │◄─────┤  MLflow Tracking   │        │
│  │  - Policy Class. │      │  - Experiments     │        │
│  │  - Sentiment An. │      │  - Traces          │        │
│  │  - Advocacy Gen. │      │  - Metrics         │        │
│  └────────┬─────────┘      └────────────────────┘        │
│           │                                              │
│           ▼                                              │
│  ┌──────────────────────────────────────────┐            │
│  │         Model Serving Endpoints          │            │
│  │  ┌────────────┐  ┌─────────────────────┐ │            │
│  │  │ Classifier │  │ Sentiment Analyzer  │ │            │
│  │  │  (Small)   │  │      (Small)        │ │            │
│  │  │ Scale-to-0 │  │     Scale-to-0      │ │            │
│  │  └────────────┘  └─────────────────────┘ │            │
│  └─────────────┬────────────────────────────┘            │
│                │                                         │
└────────────────┼─────────────────────────────────────────┘
                 │
                 ▼
         ┌────────────────┐
         │  External API  │
         │  FastAPI App   │
         └────────────────┘
Usage Examples
Register Agent to Unity Catalog
from agents.mlflow_classifier import PolicyClassifierAgent
from databricks.deployment import AgentDeploymentManager

manager = AgentDeploymentManager()
version = manager.register_agent(
    agent_class=PolicyClassifierAgent,
    agent_name="policy_classifier",
    description="Classifies documents for oral health topics",
    tags={"team": "advocacy"},
)
Deploy to Model Serving
# Reuses the manager created in the registration step
endpoint_url = manager.deploy_agent(
    agent_name="policy_classifier",
    endpoint_name="policy-classifier-prod",
    workload_size="Small",
    scale_to_zero=True,
)
Evaluate Agent
from databricks.evaluation import AgentEvaluator

# test_docs and labels are your own evaluation set:
# the documents to classify and their expected labels
evaluator = AgentEvaluator("policy_classifier")
metrics = evaluator.evaluate_classifier(
    model_uri="models:/main.agents.policy_classifier/1",
    test_documents=test_docs,
    ground_truth=labels,
)
print(f"Accuracy: {metrics.accuracy:.2%}")
Invoke via API
curl -X POST https://workspace.cloud.databricks.com/serving-endpoints/policy-classifier-prod/invocations \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "dataframe_records": [{
      "document_id": "doc_001",
      "title": "Meeting",
      "content": "Fluoride discussion..."
    }]
  }'
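The same request can be built from Python with only the standard library. The helper name `build_invocation_request` is hypothetical. The URL pattern, headers, and `dataframe_records` payload mirror the curl call above.

```python
import json
import urllib.request
from typing import Dict, List


def build_invocation_request(workspace_url: str, endpoint_name: str,
                             token: str,
                             records: List[Dict]) -> urllib.request.Request:
    """Build (but do not send) a POST request to a Model Serving endpoint."""
    url = (f"{workspace_url.rstrip('/')}"
           f"/serving-endpoints/{endpoint_name}/invocations")
    body = json.dumps({"dataframe_records": records}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# To actually send it (requires a reachable workspace and a valid token):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Separating request construction from sending keeps the sketch testable without network access.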
Benefits
Before (Custom Implementation)
- ❌ Manual deployment and versioning
- ❌ No built-in observability
- ❌ Limited scalability
- ❌ No governance or lineage
- ❌ Manual evaluation pipelines
- ❌ Complex monitoring setup
After (Databricks Agent Bricks)
- ✅ One-command deployment
- ✅ Automatic tracing and logging
- ✅ Auto-scaling Model Serving
- ✅ Unity Catalog governance
- ✅ Built-in evaluation framework
- ✅ Enterprise monitoring included
Cost Optimization
The refactored system includes several cost optimizations:
- Hybrid Classification: Uses keyword matching before expensive LLM calls
- Scale-to-Zero: Endpoints scale down when idle
- Batch Processing: Supports bulk document classification
- Caching: Frequently requested results can be cached
- Small Workloads: Starts with small endpoints, scales on demand
Estimated cost: ~$0.10-0.50/hour for active endpoints (much less with scale-to-zero)
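To put the hourly estimate in monthly terms, here is a small worked calculation. It assumes a 30-day month and treats idle, scaled-to-zero hours as free. The figure of 4 active hours per day is purely illustrative, and actual billing may differ.

```python
def monthly_cost(hourly_rate: float, active_hours_per_day: float,
                 days: int = 30) -> float:
    """Cost of an endpoint that is billed only while active."""
    return hourly_rate * active_hours_per_day * days


# Always-on endpoint at the low and high ends of the estimate
always_on = (monthly_cost(0.10, 24), monthly_cost(0.50, 24))

# Scale-to-zero endpoint that is active about 4 hours per day (assumed)
scaled = (monthly_cost(0.10, 4), monthly_cost(0.50, 4))

print(f"Always on:     ${always_on[0]:.2f} to ${always_on[1]:.2f} per month")
print(f"Scale-to-zero: ${scaled[0]:.2f} to ${scaled[1]:.2f} per month")
```

Under these assumptions an always-on endpoint comes to roughly $72 to $360 per month, while the scale-to-zero case comes to roughly $12 to $60.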
Next Steps
1. Deploy to Databricks: python -m databricks.deployment
2. Run Evaluation: python -m databricks.evaluation
3. Test in Notebook: Open databricks/notebooks/01_agent_bricks_quickstart.py
4. Monitor Production: Set up alerts in Databricks UI
5. Add Feedback Loop: Collect corrections and retrain
Migration Path
For existing users:
- Standalone mode still works - No breaking changes to existing code
- Gradual migration - Can use both modes simultaneously
- Databricks optional - Only needed for production scale
- Choose your path:
  - Small projects: Use standalone mode
  - Production/Enterprise: Use Databricks Agent Bricks
Questions?
- See databricks/README.md for detailed docs
- Run databricks/notebooks/01_agent_bricks_quickstart.py for a hands-on tutorial
- Check examples in databricks/deployment.py and databricks/evaluation.py
Summary
This refactoring transforms the Oral Health Policy Pulse from a standalone multi-agent system into a production-ready, enterprise-grade application that leverages Databricks' full stack for AI governance, deployment, and monitoring. The system now has:
- Enterprise deployment via Model Serving
- Automatic observability with MLflow tracing
- Data governance through Unity Catalog
- Quality assurance with evaluation framework
- Cost optimization with scale-to-zero and hybrid approach
- Production readiness out of the box
All while maintaining backward compatibility with the standalone mode!