# ADR-004: Redis Multi-Level Caching Strategy
## Status
Accepted
## Context
MediGuard AI performs many expensive operations:
- LLM API calls for analysis
- Vector searches in OpenSearch
- Complex biomarker calculations
- Repeated requests for similar data
Without caching, these operations would be repeated unnecessarily, leading to:
- Increased latency for users
- Higher costs from LLM API calls
- Unnecessary load on databases
- Poor user experience
## Decision
Implement a multi-level caching strategy using Redis:
1. **L1 Cache (Memory)**: Fast, temporary cache for frequently accessed data
2. **L2 Cache (Redis)**: Persistent, distributed cache for longer-term storage
3. **Intelligent Promotion**: Automatically promote L2 hits to L1
4. **Smart Invalidation**: Cache invalidation based on data changes
5. **TTL Management**: Different TTLs based on data type
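The TTL-management point above can be sketched as a simple lookup table. The data types and durations below are illustrative assumptions, not values specified by this ADR:

```python
# Hypothetical TTL tiers per data type (seconds); values are illustrative.
TTL_BY_DATA_TYPE = {
    "llm_analysis": 3600,     # LLM outputs are expensive to regenerate
    "vector_search": 600,     # Index contents change more frequently
    "biomarker_calc": 1800,   # Deterministic but moderately expensive
    "session_context": 300,   # Short-lived per-user data
}

def ttl_for(data_type: str, default: int = 300) -> int:
    """Return the TTL (seconds) for a data type, falling back to a default."""
    return TTL_BY_DATA_TYPE.get(data_type, default)
```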
## Consequences
### Positive
- **Performance**: Significant reduction in response times
- **Cost Savings**: Fewer LLM API calls
- **Scalability**: Better resource utilization
- **User Experience**: Faster responses for repeated queries
- **Reliability**: Graceful degradation when caches fail
### Negative
- **Complexity**: Additional caching logic to maintain
- **Memory Usage**: L1 cache consumes application memory
- **Stale Data**: Risk of serving stale data if not invalidated properly
- **Infrastructure**: Requires Redis deployment and maintenance
## Implementation
```python
from typing import Any, Optional

class CacheManager:
    def __init__(
        self,
        l1_backend: CacheBackend,
        l2_backend: Optional[CacheBackend] = None,
        l1_ttl: int = 60,  # TTL applied when promoting L2 hits into L1
    ):
        self.l1 = l1_backend  # Fast in-memory cache
        self.l2 = l2_backend  # Redis cache (optional)
        self.l1_ttl = l1_ttl

    async def get(self, key: str) -> Optional[Any]:
        # Try L1 first
        value = await self.l1.get(key)
        if value is not None:
            return value
        # Fall back to L2 and promote hits into L1
        if self.l2:
            value = await self.l2.get(key)
            if value is not None:
                await self.l1.set(key, value, ttl=self.l1_ttl)
                return value
        return None
```
Cache decorators for automatic caching:
```python
from typing import Any, Dict

@cached(ttl=300, key_prefix="analysis:")
async def analyze_biomarkers(biomarkers: Dict[str, float]) -> Dict[str, Any]:
    # Expensive analysis logic
    ...
```
## Notes
- L1 cache has a maximum size with LRU eviction
- L2 cache persists across application restarts
- Cache keys include version numbers for easy invalidation
- Monitoring tracks hit rates and performance metrics
- Cache warming strategies for frequently accessed data
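The versioned-key note above can be sketched as a small key builder; the `v2` schema version and `build_cache_key` name are illustrative assumptions, not from the codebase:

```python
import hashlib
import json
from typing import Any

CACHE_SCHEMA_VERSION = "v2"  # bump to invalidate all existing keys at once

def build_cache_key(prefix: str, payload: Any, version: str = CACHE_SCHEMA_VERSION) -> str:
    """Build a versioned cache key; changing `version` orphans all prior entries."""
    raw = json.dumps(payload, sort_keys=True, default=str)
    digest = hashlib.sha256(raw.encode()).hexdigest()[:16]
    return f"{version}:{prefix}:{digest}"
```

Because the version is part of the key itself, a schema bump invalidates logically, with stale entries expiring via their TTLs rather than requiring an explicit flush.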