ADR-004: Redis Multi-Level Caching Strategy
Status
Accepted
Context
MediGuard AI performs many expensive operations:
- LLM API calls for analysis
- Vector searches in OpenSearch
- Complex biomarker calculations
- Repeated requests for similar data
Without caching, these operations would be repeated unnecessarily, leading to:
- Increased latency for users
- Higher costs from LLM API calls
- Unnecessary load on databases
- Poor user experience
Decision
Implement a multi-level caching strategy using Redis:
- L1 Cache (Memory): Fast, temporary cache for frequently accessed data
- L2 Cache (Redis): Persistent, distributed cache for longer-term storage
- Intelligent Promotion: Automatically promote L2 hits to L1
- Smart Invalidation: Cache invalidation based on data changes
- TTL Management: Different TTLs based on data type (see the sketch after this list)
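As an illustration of the TTL point above, per-type TTLs can live in a single lookup table. The data-type names and durations below are assumptions for the sketch, not values from MediGuard AI:

TTL_BY_DATA_TYPE = {
    "llm_analysis": 3600,   # LLM outputs are costly to recompute; cache longest
    "vector_search": 600,   # OpenSearch results drift as documents change
    "biomarker_calc": 300,  # cheap enough to refresh more frequently
}

def ttl_for(data_type: str, default: int = 60) -> int:
    # Fall back to a short default for unknown data types
    return TTL_BY_DATA_TYPE.get(data_type, default)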
Consequences
Positive
- Performance: Significant reduction in response times
- Cost Savings: Fewer LLM API calls
- Scalability: Better resource utilization
- User Experience: Faster responses for repeated queries
- Reliability: Graceful degradation when caches fail
Negative
- Complexity: Additional caching logic to maintain
- Memory Usage: L1 cache consumes application memory
- Stale Data: Risk of serving stale data if not invalidated properly
- Infrastructure: Requires Redis deployment and maintenance
Implementation
from typing import Any, Optional

# CacheBackend is assumed to expose async get(key) and set(key, value, ttl)
class CacheManager:
    def __init__(
        self,
        l1_backend: CacheBackend,
        l2_backend: Optional[CacheBackend] = None,
        l1_ttl: int = 60,  # seconds; illustrative default for promoted entries
    ):
        self.l1 = l1_backend  # fast in-process memory cache
        self.l2 = l2_backend  # Redis cache, optional
        self.l1_ttl = l1_ttl

    async def get(self, key: str) -> Optional[Any]:
        # Try L1 first
        value = await self.l1.get(key)
        if value is not None:
            return value
        # Fall back to L2 and promote hits into L1
        if self.l2:
            value = await self.l2.get(key)
            if value is not None:
                await self.l1.set(key, value, ttl=self.l1_ttl)
                return value
        return None
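The source shows only the read path. A matching write-through set method on the same CacheManager, including the graceful degradation toward L1-only operation that the Consequences section mentions, might look like this sketch:

    async def set(self, key: str, value: Any, ttl: int) -> None:
        # Write through both levels; L1 entries never outlive self.l1_ttl
        await self.l1.set(key, value, ttl=min(ttl, self.l1_ttl))
        if self.l2:
            try:
                await self.l2.set(key, value, ttl=ttl)
            except Exception:
                # Graceful degradation: a Redis outage must not fail the
                # request; the entry lives only in L1 until Redis recovers
                pass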
Cache decorators provide automatic caching:

from typing import Dict

@cached(ttl=300, key_prefix="analysis:")
async def analyze_biomarkers(biomarkers: Dict[str, float]):
    # Expensive analysis logic
    pass
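The cached decorator itself is not shown in the source; a minimal sketch of how it could be built on CacheManager follows. The module-level cache_manager instance and the JSON-plus-hash key derivation are assumptions, not the project's actual implementation:

import hashlib
import json
from functools import wraps

cache_manager: CacheManager  # assumed module-level instance, wired up at startup

def cached(ttl: int, key_prefix: str = ""):
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            # Derive a stable key from the call arguments
            raw = json.dumps([args, kwargs], sort_keys=True, default=str)
            key = key_prefix + hashlib.sha256(raw.encode()).hexdigest()
            hit = await cache_manager.get(key)
            if hit is not None:
                return hit
            result = await func(*args, **kwargs)
            await cache_manager.set(key, result, ttl=ttl)
            return result
        return wrapper
    return decorator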
Notes
- L1 cache has a maximum size with LRU eviction
- L2 cache persists across application restarts
- Cache keys include version numbers for easy invalidation (sketched below)
- Monitoring tracks hit rates and performance metrics
- Cache warming strategies for frequently accessed data
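On the versioned-key note: bumping a single constant orphans an entire family of entries, which then age out via TTL. The helper below is a sketch; its names and version value are assumptions:

ANALYSIS_KEY_VERSION = 3  # bump to invalidate every analysis entry at once

def make_key(prefix: str, identifier: str, version: int = ANALYSIS_KEY_VERSION) -> str:
    # e.g. make_key("analysis:", "patient-123") -> "analysis:v3:patient-123"
    return f"{prefix}v{version}:{identifier}"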