# ADR-004: Redis Multi-Level Caching Strategy

## Status
Accepted

## Context
MediGuard AI performs several expensive operations, and many requests repeat work that was already done for similar data:
- LLM API calls for analysis
- Vector searches in OpenSearch
- Complex biomarker calculations

Without caching, these operations would be repeated unnecessarily, leading to:
- Increased latency for users
- Higher costs from LLM API calls
- Unnecessary load on databases
- Poor user experience

## Decision
Implement a multi-level caching strategy with an in-process memory cache in front of Redis:
1. **L1 Cache (Memory)**: Fast, short-lived, per-process cache for frequently accessed data
2. **L2 Cache (Redis)**: Shared, distributed cache for longer-term storage that survives application restarts
3. **Intelligent Promotion**: Automatically copy L2 hits into L1 so the next lookup is served from memory
4. **Smart Invalidation**: Invalidate cached entries when the underlying data changes
5. **TTL Management**: Different TTLs based on data type
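
The TTL and invalidation points above could be sketched roughly as follows. This is only an illustration: the data type names, the TTL values, `CACHE_VERSION`, `ttl_for`, and `make_key` are all assumptions made for the example, since this ADR does not specify them.

```python
import hashlib
import json
from typing import Any, Dict

# Hypothetical TTLs in seconds; the real values are not given in this ADR.
TTL_BY_DATA_TYPE: Dict[str, int] = {
    "llm_analysis": 3600,
    "vector_search": 600,
    "biomarker_calc": 300,
}
DEFAULT_TTL = 300

# Hypothetical version tag. Bumping it makes every old key unreachable,
# which invalidates the whole cache without deleting entries one by one.
CACHE_VERSION = "v1"


def ttl_for(data_type: str) -> int:
    """Return the TTL for a data type, falling back to a default."""
    return TTL_BY_DATA_TYPE.get(data_type, DEFAULT_TTL)


def make_key(data_type: str, payload: Any) -> str:
    """Build a versioned cache key from a JSON-serializable payload."""
    # sort_keys makes the digest independent of dict insertion order.
    raw = json.dumps(payload, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(raw).hexdigest()[:16]
    return f"{CACHE_VERSION}:{data_type}:{digest}"
```

With keys built this way, two requests carrying the same biomarkers in a different order map to the same entry, and old entries simply expire through their TTL after a version bump.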

## Consequences

### Positive
- **Performance**: Lower response times on cache hits, which skip the LLM call, vector search, or biomarker calculation
- **Cost Savings**: Fewer LLM API calls
- **Scalability**: Less load on OpenSearch and the LLM API per request
- **User Experience**: Faster responses for repeated queries
- **Reliability**: Graceful degradation when caches fail
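
Graceful degradation can be pictured as treating any cache failure as a miss and falling back to the expensive operation. The helper below is a minimal sketch under that assumption; `get_or_compute` and its parameters are hypothetical names, not part of the project's API.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable, Optional

logger = logging.getLogger(__name__)


async def get_or_compute(
    cache_get: Callable[[], Awaitable[Optional[Any]]],
    compute: Callable[[], Awaitable[Any]],
) -> Any:
    """Return the cached value if available, otherwise compute it.

    A failing cache (for example, Redis being unreachable) is logged and
    handled like a cache miss, so the request still succeeds.
    """
    try:
        value = await cache_get()
    except Exception:
        logger.warning("Cache lookup failed; falling back to computation", exc_info=True)
        value = None

    if value is not None:
        return value
    return await compute()
```

The trade-off is that a cache outage shows up as higher latency and cost rather than as errors, so monitoring should alert on the logged warnings.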

### Negative
- **Complexity**: Additional caching logic to maintain
- **Memory Usage**: L1 cache consumes application memory
- **Stale Data**: Risk of serving stale data if entries are not invalidated properly
- **Infrastructure**: Requires Redis deployment and maintenance

## Implementation
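```python
from __future__ import annotations  # CacheBackend is defined elsewhere in the project

from typing import Any, Optional


class CacheManager:
    def __init__(
        self,
        l1_backend: CacheBackend,
        l2_backend: Optional[CacheBackend] = None,
        l1_ttl: int = 60,  # assumed default; the ADR does not specify a value
    ):
        self.l1 = l1_backend  # Fast in-process memory cache
        self.l2 = l2_backend  # Redis cache (optional)
        self.l1_ttl = l1_ttl  # TTL in seconds for entries promoted to L1

    async def get(self, key: str) -> Optional[Any]:
        # Try L1 first
        value = await self.l1.get(key)
        if value is not None:
            return value

        # On an L1 miss, try L2 and promote any hit to L1
        if self.l2 is not None:
            value = await self.l2.get(key)
            if value is not None:
                await self.l1.set(key, value, ttl=self.l1_ttl)
                return value

        # Miss in both levels
        return None
```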

Cache decorators for automatic caching:
```python
from typing import Dict

# `cached` is the project's caching decorator; ttl is in seconds.
@cached(ttl=300, key_prefix="analysis:")
async def analyze_biomarkers(biomarkers: Dict[str, float]):
    # Expensive analysis logic
    pass
```
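
The implementation of `cached` is not shown in this ADR. As a rough idea of how such a decorator could work, here is a minimal sketch that stores results in a plain in-process dictionary; the key scheme, the `_store` dictionary, and the use of JSON to fingerprint arguments are assumptions for illustration, and the project's decorator presumably goes through the cache manager instead.

```python
import asyncio
import functools
import hashlib
import json
import time
from typing import Any, Awaitable, Callable, Dict, Tuple

# Hypothetical in-process store: key -> (expiry timestamp, value).
_store: Dict[str, Tuple[float, Any]] = {}


def cached(ttl: int, key_prefix: str = ""):
    """Cache the result of an async function for `ttl` seconds."""

    def decorator(func: Callable[..., Awaitable[Any]]):
        @functools.wraps(func)
        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Fingerprint the arguments; sort_keys keeps dict order irrelevant,
            # default=str covers values JSON cannot serialize directly.
            raw = json.dumps([args, kwargs], sort_keys=True, default=str)
            digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
            key = f"{key_prefix}{func.__name__}:{digest}"

            now = time.monotonic()
            entry = _store.get(key)
            if entry is not None and entry[0] > now:
                return entry[1]  # fresh hit

            value = await func(*args, **kwargs)
            _store[key] = (now + ttl, value)
            return value

        return wrapper

    return decorator
```

Storing the expiry next to the value lets a cached `None` result count as a hit, which a plain `is not None` check on the value would not allow.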

## Notes
- L1 cache has a maximum size with LRU eviction
- L2 cache persists across application restarts
- Cache keys include version numbers for easy invalidation
- Monitoring tracks hit rates and performance metrics
- Cache warming preloads frequently accessed data
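
To make the L1 notes concrete, the sketch below shows one way a size-bounded memory backend with LRU eviction, per-entry TTLs, and hit-rate counters could look. `MemoryLRUBackend`, its default sizes, and its counters are hypothetical; only the async `get`/`set(key, value, ttl=...)` shape is taken from the `CacheManager` code above.

```python
import asyncio
import time
from collections import OrderedDict
from typing import Any, Optional, Tuple


class MemoryLRUBackend:
    """In-process cache with a maximum size, LRU eviction, and TTLs."""

    def __init__(self, max_size: int = 1024):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self.max_size = max_size
        # key -> (expiry timestamp, value); order tracks recency of use.
        self._data: "OrderedDict[str, Tuple[float, Any]]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    async def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None or entry[0] <= time.monotonic():
            self._data.pop(key, None)  # drop expired entries lazily
            self.misses += 1
            return None
        self._data.move_to_end(key)  # mark as most recently used
        self.hits += 1
        return entry[1]

    async def set(self, key: str, value: Any, ttl: int = 60) -> None:
        self._data[key] = (time.monotonic() + ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict least recently used

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

An instance of this class could be passed as `l1_backend` to `CacheManager`, and `hit_rate()` gives a simple number to export to monitoring.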