---
title: Cdn Cache Optimizer
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
tags:
- openenv
---
# CDN Cache Optimizer: OpenEnv RL Environment

An RL environment that simulates **edge CDN cache management**, the kind of problem companies like Meta solve at planetary scale. An agent manages a cache of limited size and decides which files to evict when new content arrives, balancing **hit rate**, **bandwidth efficiency**, and **thrash avoidance**.

---
## Motivation

Content Delivery Networks serve billions of files daily. Edge servers have limited storage, so they must constantly decide *which cached files to keep, and which to evict*. Standard algorithms like LRU are not optimal, especially when traffic has **viral bursts** (a file suddenly gets 50x more requests for 20 minutes, then falls back to its baseline).

A smarter agent can:

- Predict viral spikes from queue previews
- Avoid evicting high-frequency files
- Prevent cache thrashing (evicting a file, then immediately re-requesting it)
- Maximize bandwidth saved for users
---

## Environment Description

At each step, a file is requested from the network. If the file is already in the cache, the step is a **cache hit** and the agent is rewarded. If not, the step is a **cache miss**, and the agent must decide whether to evict an existing file to make room.

### Traffic Model

- **Steady files**: Consistent, cyclical demand
- **Viral files**: Bell-curve spike in popularity that then fades back to baseline
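The bell-curve shape of viral demand can be sketched as a Gaussian bump on top of a steady baseline. This is an illustration only: the function name `viral_weight` and its parameters (`peak_step`, `width`, `baseline`, `peak_multiplier`) are assumptions, not the environment's actual traffic generator, and the 50x default simply echoes the burst size mentioned in the motivation.

```python
import math


def viral_weight(step, peak_step, width, baseline=1.0, peak_multiplier=50.0):
    """Relative request weight of a viral file at `step` (hypothetical model)."""
    # Gaussian bump: exactly 1.0 at the peak, close to 0.0 far away from it.
    bump = math.exp(-((step - peak_step) ** 2) / (2.0 * width ** 2))
    # The weight equals baseline * peak_multiplier at the peak and
    # fades back to the baseline on either side of it.
    return baseline * (1.0 + (peak_multiplier - 1.0) * bump)


# Sample the weight around a spike centered on step 100 (spread of 10 steps).
profile = [
    round(viral_weight(s, peak_step=100, width=10.0), 2)
    for s in (60, 90, 100, 110, 140)
]
```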
---

## Action & Observation Space

### Observation Space

| Field | Type | Description |
|-------|------|-------------|
| `step` | int | Current episode step |
| `cache_used_mb` | float | MB currently used |
| `cache_capacity_mb` | float | Total cache size in MB |
| `cache_fill_ratio` | float | Fill level, 0.0 to 1.0 |
| `cached_files` | List[FileEntry] | All files in cache with metadata |
| `incoming_file_id` | str | File being requested |
| `incoming_file_size_mb` | float | Size of incoming file in MB |
| `incoming_file_is_viral` | bool | Is this file currently viral? |
| `cache_hit` | bool | Is incoming file already cached? |
| `recent_hit_rate` | float | Rolling hit rate (last 20 steps) |
| `time_of_day` | float | Position in the daily cycle, normalized to 0.0 to 1.0 |
| `queue_preview` | List[str] | Next 3 file IDs (prefetch hint) |
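A rolling hit rate over the last 20 steps, as `recent_hit_rate` is described above, can be tracked with a fixed-length queue. The class name `RollingHitRate` is hypothetical, and returning 0.0 before any step has been recorded is an assumption; the environment's own bookkeeping may differ.

```python
from collections import deque


class RollingHitRate:
    """Fraction of cache hits over the most recent `window` steps."""

    def __init__(self, window=20):
        # A deque with maxlen silently drops the oldest entry when full.
        self._outcomes = deque(maxlen=window)

    def record(self, hit):
        self._outcomes.append(1 if hit else 0)

    @property
    def value(self):
        # Assumption: report 0.0 until at least one step has been seen.
        if not self._outcomes:
            return 0.0
        return sum(self._outcomes) / len(self._outcomes)


tracker = RollingHitRate(window=20)
for hit in (True, True, False, True):
    tracker.record(hit)
```

After these four steps `tracker.value` is the share of hits among them; once more than 20 steps have been recorded, only the latest 20 count.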
### FileEntry Fields

| Field | Type | Description |
|-------|------|-------------|
| `file_id` | str | Unique identifier |
| `size_mb` | float | File size in MB |
| `request_frequency` | float | Requests since cached |
| `is_viral` | bool | Currently viral |
| `last_accessed` | int | Step number of last access |
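On the client side, the fields above map naturally onto a small typed record. The sketch below assumes that the observation arrives as a JSON object whose `cached_files` entries use exactly these field names as keys; the helper `parse_cached_files` and the sample values are illustrative, not part of the environment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileEntry:
    file_id: str
    size_mb: float
    request_frequency: float
    is_viral: bool
    last_accessed: int


def parse_cached_files(observation: dict) -> List[FileEntry]:
    """Turn the `cached_files` list of an observation dict into FileEntry objects."""
    return [FileEntry(**entry) for entry in observation.get("cached_files", [])]


# Made-up observation fragment, used only to exercise the parser.
sample_observation = {
    "cached_files": [
        {"file_id": "file_001", "size_mb": 12.5, "request_frequency": 4.0,
         "is_viral": False, "last_accessed": 7},
    ]
}
entries = parse_cached_files(sample_observation)
```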
### Action Space

| Field | Type | Description |
|-------|------|-------------|
| `evict_file_id` | str \| null | File to evict (null = no eviction) |
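As a point of reference, a plain LRU policy over this action space might look like the sketch below. It assumes the observation is a dict keyed by the field names listed above, that at most one file is evicted per step, and that eviction is only worthwhile when the incoming file does not fit in the free space; none of these rules is confirmed by the environment, and `lru_action` is a hypothetical name.

```python
def lru_action(obs: dict) -> dict:
    """Pick the least recently used file to evict, or no eviction at all."""
    no_eviction = {"evict_file_id": None}
    if obs["cache_hit"]:
        return no_eviction  # nothing to make room for

    free_mb = obs["cache_capacity_mb"] - obs["cache_used_mb"]
    if obs["incoming_file_size_mb"] <= free_mb or not obs["cached_files"]:
        return no_eviction  # the file already fits, or there is nothing to evict

    # LRU: the file with the smallest `last_accessed` step is the victim.
    victim = min(obs["cached_files"], key=lambda f: f["last_accessed"])
    return {"evict_file_id": victim["file_id"]}


example_obs = {
    "cache_hit": False,
    "cache_capacity_mb": 100.0,
    "cache_used_mb": 95.0,
    "incoming_file_size_mb": 10.0,
    "cached_files": [
        {"file_id": "file_001", "last_accessed": 12},
        {"file_id": "file_002", "last_accessed": 3},
    ],
}
action = lru_action(example_obs)
```

Here the incoming file needs 10 MB but only 5 MB is free, so the policy selects `file_002`, the entry accessed longest ago.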
### Reward Function

| Component | Range | Description |
|-----------|-------|-------------|
| `cache_hit_bonus` | +1.0 to +1.5 | Reward for a cache hit (viral hits = +1.5) |
| `bandwidth_saved` | 0.0 to +0.2 | Reward for bandwidth efficiency |
| `eviction_penalty` | -0.5 to 0.0 | Penalty for evicting popular files |
| `thrash_penalty` | -0.5 or 0.0 | Penalty for evicting the same file twice |
| `wasted_capacity_penalty` | -0.3 to 0.0 | Penalty for leaving cache capacity unused |
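One plausible way to combine these components is to clamp each to its documented range and add them up. The summation itself, the value 0.0 for `cache_hit_bonus` on a miss, and the names `COMPONENT_RANGES` and `total_reward` are all assumptions made for this sketch; the table above only specifies the ranges.

```python
def clamp(value, low, high):
    return max(low, min(high, value))


# Ranges copied from the reward table; the 0.0 lower bound of
# cache_hit_bonus (no bonus on a miss) is an assumption.
COMPONENT_RANGES = {
    "cache_hit_bonus": (0.0, 1.5),
    "bandwidth_saved": (0.0, 0.2),
    "eviction_penalty": (-0.5, 0.0),
    "thrash_penalty": (-0.5, 0.0),
    "wasted_capacity_penalty": (-0.3, 0.0),
}


def total_reward(components: dict) -> float:
    """Sum the per-step reward components, each clamped to its range."""
    total = 0.0
    for name, (low, high) in COMPONENT_RANGES.items():
        # Components that are missing from the dict contribute nothing.
        total += clamp(components.get(name, 0.0), low, high)
    return total


best_step = total_reward({"cache_hit_bonus": 1.5, "bandwidth_saved": 0.2})
```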
---

## Tasks

### Task 1: Steady Traffic Cache (Easy)

- **Cache**: 100 MB | **Files**: 30 | **Steps**: 100
- No viral files, steady demand only
- Agent learns basic LRU-style eviction
- **Target hit rate**: ≥ 0.60 earns a score of 1.0
- **Baseline score**: ~0.75

### Task 2: Mixed Traffic Cache (Medium)

- **Cache**: 80 MB | **Files**: 50 | **Steps**: 150
- 20% viral files mixed with steady demand
- Agent must handle spikes and prioritize popular content
- **Score**: 70% hit rate + 30% bandwidth
- **Baseline score**: ~0.60

### Task 3: Constrained Cache with Viral Bursts (Hard)

- **Cache**: 50 MB | **Files**: 80 | **Steps**: 200
- 35% viral files, tight capacity, large file sizes
- Agent must predict spikes and avoid thrashing
- **Score**: 50% hit rate + 25% bandwidth + 25% reward quality
- **Baseline score**: ~0.45
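The scoring rules above can be sketched as follows. The weights and the 0.60 target come from the task descriptions; everything else is assumed: the linear scaling below the target in `easy_score`, the idea that each component is already normalized to the 0.0 to 1.0 range before weighting, and the function and constant names. How the graders actually normalize hit rate, bandwidth, and reward quality is not specified here.

```python
def easy_score(hit_rate, target=0.60):
    """Task 1: full score at or above the target hit rate."""
    # Assumption: the score scales linearly with hit rate below the target.
    return min(1.0, max(0.0, hit_rate / target))


def weighted_score(components, weights):
    """Tasks 2 and 3: weighted sum of components assumed to lie in [0, 1]."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return sum(weights[name] * components[name] for name in weights)


MEDIUM_WEIGHTS = {"hit_rate": 0.70, "bandwidth": 0.30}
HARD_WEIGHTS = {"hit_rate": 0.50, "bandwidth": 0.25, "reward_quality": 0.25}

# Illustrative inputs only; these are not measured results.
medium = weighted_score({"hit_rate": 0.8, "bandwidth": 0.5}, MEDIUM_WEIGHTS)
hard = weighted_score(
    {"hit_rate": 0.6, "bandwidth": 0.4, "reward_quality": 0.8}, HARD_WEIGHTS
)
```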
---

## Setup & Usage

### Local Setup

```bash
git clone <repo>
cd cdn-cache-env
pip install -r requirements.txt
```

### Run API Server

```bash
uvicorn api.main:app --host 0.0.0.0 --port 7860
```

### Run Inference (Baseline Agent)

```bash
export API_BASE_URL="https://api.openai.com/v1"
export MODEL_NAME="gpt-4o-mini"
export HF_TOKEN="your_token_here"
python inference.py
```

### Docker

```bash
docker build -t cdn-cache-env .
docker run -p 7860:7860 \
  -e API_BASE_URL="https://api.openai.com/v1" \
  -e MODEL_NAME="gpt-4o-mini" \
  -e HF_TOKEN="your_token" \
  cdn-cache-env
```
---

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/health` | Health check (returns 200) |
| GET | `/tasks` | List all tasks |
| POST | `/reset` | Start an episode with body `{"task_id": "task_easy", "seed": 42}` |
| POST | `/step` | Take an action with body `{"evict_file_id": "file_001"}`, or `{"evict_file_id": null}` for no eviction |
| GET | `/state` | Full environment state |
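A minimal client for these endpoints can be written with the Python standard library alone. The base URL below assumes the server was started locally on port 7860 as in the setup section, and the helper names `build_request` and `call` are hypothetical. The response schemas are not documented here, so `call` simply returns whatever JSON the server sends back.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # assumption: local server on port 7860


def build_request(path, payload=None, base_url=BASE_URL):
    """Build a GET request (no payload) or a JSON POST request (with payload)."""
    url = base_url.rstrip("/") + path
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def call(path, payload=None, base_url=BASE_URL):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(path, payload, base_url)) as response:
        return json.loads(response.read().decode("utf-8"))


# Requests for the two POST endpoints; None is serialized as JSON null.
reset_request = build_request("/reset", {"task_id": "task_easy", "seed": 42})
step_request = build_request("/step", {"evict_file_id": None})

# With the server running, an episode could be driven like this:
#   call("/reset", {"task_id": "task_easy", "seed": 42})
#   call("/step", {"evict_file_id": None})
```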
---

## Baseline Scores

Using the built-in `smart_policy` (non-LLM baseline):

| Task | Hit Rate | Score |
|------|----------|-------|
| Easy | ~0.72 | ~1.00 |
| Medium | ~0.61 | ~0.82 |
| Hard | ~0.48 | ~0.78 |
| **Overall** | | **~0.87** |
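The overall figure is consistent with an unweighted mean of the three task scores. That reading is an assumption, since the aggregation rule is not stated, but the arithmetic can be checked quickly:

```python
from statistics import mean

# Per-task scores copied from the table above.
task_scores = {"easy": 1.00, "medium": 0.82, "hard": 0.78}

# Assumption: "Overall" is the plain average of the three scores.
overall = round(mean(list(task_scores.values())), 2)
```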
---

## Log Format

`inference.py` emits structured JSON logs:

```
{"type": "START", "task_id": "task_easy", ...}
{"type": "STEP", "step": 0, "action": {...}, "reward": 1.0, ...}
{"type": "END", "total_reward": 87.3, "final_hit_rate": 0.72, "score": 1.0}
```
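Logs in this shape can be summarized with a few lines of Python. The sketch assumes one complete JSON object per line (the `...` placeholders above stand for fields that are not listed here) and uses only the keys shown in the example; `summarize_log` and the sample lines are illustrative.

```python
import json


def summarize_log(lines):
    """Count STEP records and pull the final metrics from the END record."""
    steps = 0
    end_record = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        if record.get("type") == "STEP":
            steps += 1
        elif record.get("type") == "END":
            end_record = record
    summary = {"steps": steps, "score": None, "final_hit_rate": None}
    if end_record is not None:
        summary["score"] = end_record.get("score")
        summary["final_hit_rate"] = end_record.get("final_hit_rate")
    return summary


# Hand-written sample restricted to the fields shown in the log format.
sample_lines = [
    '{"type": "START", "task_id": "task_easy"}',
    '{"type": "STEP", "step": 0, "reward": 1.0}',
    '{"type": "STEP", "step": 1, "reward": 0.0}',
    '{"type": "END", "total_reward": 87.3, "final_hit_rate": 0.72, "score": 1.0}',
]
summary = summarize_log(sample_lines)
```

The same function works on a saved log file by passing the open file object, since iterating over a file yields its lines.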