
ADR-004: Redis Multi-Level Caching Strategy

Status

Accepted

Context

MediGuard AI performs many expensive operations:

  • LLM API calls for analysis
  • Vector searches in OpenSearch
  • Complex biomarker calculations
  • Frequent repeat requests that recompute similar data

Without caching, these operations would be repeated unnecessarily, leading to:

  • Increased latency for users
  • Higher costs from LLM API calls
  • Unnecessary load on databases
  • Poor user experience

Decision

Implement a multi-level caching strategy using Redis:

  1. L1 Cache (Memory): Fast, temporary cache for frequently accessed data
  2. L2 Cache (Redis): Persistent, distributed cache for longer-term storage
  3. Intelligent Promotion: Automatically promote L2 hits to L1
  4. Smart Invalidation: Cache invalidation based on data changes
  5. TTL Management: Different TTLs based on data type
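As a sketch of the TTL-management point, TTLs could be looked up by data type; the categories and durations below are illustrative assumptions, not values taken from the MediGuard AI codebase:

```python
# Hypothetical TTLs per data type; the names and durations here are
# illustrative assumptions, not values from the actual configuration.
TTL_BY_TYPE = {
    "analysis": 300,       # LLM analysis results: 5 minutes
    "vector_search": 120,  # OpenSearch results: 2 minutes
    "biomarkers": 600,     # Derived biomarker calculations: 10 minutes
}

DEFAULT_TTL = 60  # Fallback for unrecognized data types

def ttl_for(data_type: str) -> int:
    """Return the TTL (in seconds) configured for a data type."""
    return TTL_BY_TYPE.get(data_type, DEFAULT_TTL)
```

Keeping TTLs in one table like this makes the trade-off per data type explicit and easy to tune.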

Consequences

Positive

  • Performance: Significant reduction in response times
  • Cost Savings: Fewer LLM API calls
  • Scalability: Better resource utilization
  • User Experience: Faster responses for repeated queries
  • Reliability: Graceful degradation when caches fail

Negative

  • Complexity: Additional caching logic to maintain
  • Memory Usage: L1 cache consumes application memory
  • Stale Data: Risk of serving stale data if not invalidated properly
  • Infrastructure: Requires Redis deployment and maintenance

Implementation

from typing import Any, Optional

class CacheManager:
    def __init__(self, l1_backend: CacheBackend, l2_backend: Optional[CacheBackend] = None, l1_ttl: int = 60):
        self.l1 = l1_backend  # Fast in-process memory cache
        self.l2 = l2_backend  # Redis cache (optional)
        self.l1_ttl = l1_ttl  # Short TTL for entries promoted into L1

    async def get(self, key: str) -> Optional[Any]:
        # Try L1 first
        value = await self.l1.get(key)
        if value is not None:
            return value

        # Fall back to L2 and promote hits into L1
        if self.l2:
            value = await self.l2.get(key)
            if value is not None:
                await self.l1.set(key, value, ttl=self.l1_ttl)
                return value

        # Miss in both levels
        return None
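The CacheBackend interface itself is not shown in this ADR; a minimal in-memory sketch, assuming async get/set with per-key TTLs (as the L1 usage above implies), might look like:

```python
import time
from typing import Any, Optional

class InMemoryBackend:
    """Hypothetical sketch of an L1-style CacheBackend: async get/set
    with lazy per-key expiry. Not the actual MediGuard AI implementation."""

    def __init__(self) -> None:
        self._store: dict[str, tuple[Any, float]] = {}  # key -> (value, expires_at)

    async def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # Expired: evict lazily on read
            return None
        return value

    async def set(self, key: str, value: Any, ttl: int = 60) -> None:
        self._store[key] = (value, time.monotonic() + ttl)
```

A Redis-backed L2 would expose the same async get/set surface, so the manager stays agnostic of which level it is talking to.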

Cache decorators for automatic caching:

@cached(ttl=300, key_prefix="analysis:")
async def analyze_biomarkers(biomarkers: Dict[str, float]):
    # Expensive analysis logic
    pass
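The @cached decorator is used above but not defined in this ADR. A minimal sketch, assuming it memoizes async results by arguments into a process-local dict (a production version would delegate to the CacheManager instead):

```python
import functools
import json
import time
from typing import Any, Callable

# Process-local store used only for this sketch; the real decorator
# would presumably write through CacheManager to L1/L2.
_cache: dict[str, tuple[Any, float]] = {}  # key -> (value, expires_at)

def cached(ttl: int = 300, key_prefix: str = "") -> Callable:
    """Cache an async function's result for `ttl` seconds, keyed by its arguments."""
    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            key = key_prefix + json.dumps([args, kwargs], sort_keys=True, default=str)
            entry = _cache.get(key)
            if entry is not None and time.monotonic() < entry[1]:
                return entry[0]  # Cache hit: skip the expensive call
            value = await fn(*args, **kwargs)
            _cache[key] = (value, time.monotonic() + ttl)
            return value
        return wrapper
    return decorator
```

Serializing the arguments with sorted keys makes the cache key deterministic, so calls with the same inputs map to the same entry.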

Notes

  • L1 cache has a maximum size with LRU eviction
  • L2 cache persists across application restarts
  • Cache keys include version numbers for easy invalidation
  • Monitoring tracks hit rates and performance metrics
  • Cache warming strategies for frequently accessed data
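The versioned-key note above can be sketched as follows: bumping the version segment makes every previously cached entry unreachable, so stale data ages out via TTL with no explicit deletion. The names and version string are illustrative assumptions:

```python
CACHE_SCHEMA_VERSION = "v2"  # Bump to invalidate all previously cached entries

def make_key(prefix: str, identifier: str, version: str = CACHE_SCHEMA_VERSION) -> str:
    """Build a versioned cache key, e.g. 'analysis:v2:patient-123'."""
    return f"{prefix}:{version}:{identifier}"
```

This keeps invalidation a one-line config change rather than a bulk key-deletion job against Redis.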