# ADR-004: Redis Multi-Level Caching Strategy
## Status
Accepted
## Context
MediGuard AI performs many expensive operations:
- LLM API calls for analysis
- Vector searches in OpenSearch
- Complex biomarker calculations
- Repeated requests for similar data
Without caching, these operations would be repeated unnecessarily, leading to:
- Increased latency for users
- Higher costs from LLM API calls
- Unnecessary load on databases
- Poor user experience
## Decision
Implement a multi-level caching strategy using Redis:
1. **L1 Cache (Memory)**: Fast, temporary cache for frequently accessed data
2. **L2 Cache (Redis)**: Persistent, distributed cache for longer-term storage
3. **Intelligent Promotion**: Automatically promote L2 hits to L1
4. **Smart Invalidation**: Cache invalidation based on data changes
5. **TTL Management**: Different TTLs based on data type
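The TTL-management point above can be sketched as a simple lookup table. The data types and durations below are illustrative assumptions, not values specified by this ADR:

```python
# Hypothetical TTL tiers per data type (seconds); values are illustrative.
TTL_BY_DATA_TYPE = {
    "llm_analysis": 3600,     # LLM outputs are expensive to regenerate
    "vector_search": 600,     # Index contents change more frequently
    "biomarker_calc": 1800,   # Deterministic but moderately expensive
    "session_context": 300,   # Short-lived per-user data
}

def ttl_for(data_type: str, default: int = 300) -> int:
    """Return the TTL (seconds) for a data type, falling back to a default."""
    return TTL_BY_DATA_TYPE.get(data_type, default)
```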
## Consequences
### Positive
- **Performance**: Significant reduction in response times
- **Cost Savings**: Fewer LLM API calls
- **Scalability**: Better resource utilization
- **User Experience**: Faster responses for repeated queries
- **Reliability**: Graceful degradation when caches fail
### Negative
- **Complexity**: Additional caching logic to maintain
- **Memory Usage**: L1 cache consumes application memory
- **Stale Data**: Risk of serving stale data if not invalidated properly
- **Infrastructure**: Requires Redis deployment and maintenance
## Implementation
```python
from typing import Any, Optional

class CacheManager:
    def __init__(
        self,
        l1_backend: CacheBackend,
        l2_backend: Optional[CacheBackend] = None,
        l1_ttl: int = 60,  # TTL applied when promoting L2 hits into L1
    ):
        self.l1 = l1_backend  # Fast in-memory cache
        self.l2 = l2_backend  # Redis cache (optional)
        self.l1_ttl = l1_ttl

    async def get(self, key: str) -> Optional[Any]:
        # Try L1 first
        value = await self.l1.get(key)
        if value is not None:
            return value
        # Fall back to L2 and promote hits into L1
        if self.l2:
            value = await self.l2.get(key)
            if value is not None:
                await self.l1.set(key, value, ttl=self.l1_ttl)
                return value
        return None
```
Cache decorators for automatic caching:
```python
from typing import Any, Dict

@cached(ttl=300, key_prefix="analysis:")
async def analyze_biomarkers(biomarkers: Dict[str, float]) -> Dict[str, Any]:
    # Expensive analysis logic
    ...
```
## Notes
- L1 cache has a maximum size with LRU eviction
- L2 cache persists across application restarts
- Cache keys include version numbers for easy invalidation
- Monitoring tracks hit rates and performance metrics
- Cache warming strategies for frequently accessed data
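The versioned-key note above can be sketched as a small key builder; the `v2` schema version and `build_cache_key` name are illustrative assumptions, not from the codebase:

```python
import hashlib
import json
from typing import Any

CACHE_SCHEMA_VERSION = "v2"  # bump to invalidate all existing keys at once

def build_cache_key(prefix: str, payload: Any, version: str = CACHE_SCHEMA_VERSION) -> str:
    """Build a versioned cache key; changing `version` orphans all prior entries."""
    raw = json.dumps(payload, sort_keys=True, default=str)
    digest = hashlib.sha256(raw.encode()).hexdigest()[:16]
    return f"{version}:{prefix}:{digest}"
```

Because the version is part of the key itself, a schema bump invalidates logically, with stale entries expiring via their TTLs rather than requiring an explicit flush.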