---
title: CDN Cache Optimizer
emoji: 🌐
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
tags:
- openenv
---
# CDN Cache Optimizer – OpenEnv RL Environment
An RL environment simulating **edge CDN cache management**, the kind of problem companies like Meta solve at planetary scale. An agent manages a cache of limited size, deciding which files to evict when new content arrives, balancing **hit rate**, **bandwidth efficiency**, and **thrash avoidance**.
---
## Motivation
Content Delivery Networks serve billions of files daily. Edge servers have limited storage, so they must constantly decide: *which cached files to keep, and which to evict?* Standard algorithms like LRU aren't optimal, especially when traffic has **viral bursts** (a file suddenly gets 50x more requests for 20 minutes, then drops back to zero).
A smarter agent can:
- Predict viral spikes from queue previews
- Avoid evicting high-frequency files
- Prevent cache thrashing (evicting then immediately re-requesting)
- Maximize bandwidth saved for users
---
## Environment Description
At each step, a file is requested from the network. If it's already in the cache → **cache hit** (reward). If not → **cache miss**, and the agent must decide whether to evict an existing file to make room.
### Traffic Model
- **Steady files**: Consistent, cyclical demand
- **Viral files**: Bell-curve spike in popularity, then fade back to baseline
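The two traffic patterns might be modeled roughly as follows. This is a minimal sketch, not the environment's actual code; the function names, constants, and the exact spike shape are illustrative assumptions.

```python
import math

def steady_popularity(step: int, base: float = 1.0, period: int = 50) -> float:
    """Steady file: cyclical demand oscillating around a stable baseline."""
    return base * (1.0 + 0.3 * math.sin(2 * math.pi * step / period))

def viral_popularity(step: int, peak_step: int = 100, peak: float = 50.0,
                     width: float = 10.0, baseline: float = 1.0) -> float:
    """Viral file: bell-curve spike around peak_step, then fade to baseline."""
    spike = peak * math.exp(-((step - peak_step) ** 2) / (2 * width ** 2))
    return baseline + spike
```

At `peak_step` the viral file sees roughly 50x its baseline demand, matching the "50x more requests" burst described above; far from the peak it decays back to the baseline.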
---
## Action & Observation Space
### Observation Space
| Field | Type | Description |
|-------|------|-------------|
| `step` | int | Current episode step |
| `cache_used_mb` | float | MB currently used |
| `cache_capacity_mb` | float | Total cache size |
| `cache_fill_ratio` | float | 0.0–1.0 fill level |
| `cached_files` | List[FileEntry] | All files in cache with metadata |
| `incoming_file_id` | str | File being requested |
| `incoming_file_size_mb` | float | Size of incoming file |
| `incoming_file_is_viral` | bool | Is this file currently viral? |
| `cache_hit` | bool | Is incoming file already cached? |
| `recent_hit_rate` | float | Rolling hit rate (last 20 steps) |
| `time_of_day` | float | Normalized 0.0–1.0 daily cycle |
| `queue_preview` | List[str] | Next 3 file IDs (prefetch hint) |
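A concrete observation might look like the following. All values and file IDs are made up for illustration; only the field names come from the table above.

```python
# Example observation payload (illustrative values only)
observation = {
    "step": 42,
    "cache_used_mb": 73.5,
    "cache_capacity_mb": 100.0,
    "cache_fill_ratio": 0.735,
    "cached_files": [
        {"file_id": "file_007", "size_mb": 12.0, "request_frequency": 9.0,
         "is_viral": False, "last_accessed": 40},
    ],
    "incoming_file_id": "file_013",
    "incoming_file_size_mb": 25.0,
    "incoming_file_is_viral": True,
    "cache_hit": False,
    "recent_hit_rate": 0.65,
    "time_of_day": 0.42,
    "queue_preview": ["file_013", "file_002", "file_007"],
}
```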
### FileEntry Fields
| Field | Type | Description |
|-------|------|-------------|
| `file_id` | str | Unique identifier |
| `size_mb` | float | File size in MB |
| `request_frequency` | float | Requests since cached |
| `is_viral` | bool | Currently viral |
| `last_accessed` | int | Step number of last access |
### Action Space
| Field | Type | Description |
|-------|------|-------------|
| `evict_file_id` | str \| null | File to evict (null = no eviction) |
### Reward Function
| Component | Range | Description |
|-----------|-------|-------------|
| `cache_hit_bonus` | +1.0 to +1.5 | Hit reward (viral hits = +1.5) |
| `bandwidth_saved` | +0.0 to +0.2 | Reward for bandwidth efficiency |
| `eviction_penalty` | 0.0 to -0.5 | Penalty for evicting popular files |
| `thrash_penalty` | 0.0 or -0.5 | Penalty for evicting same file twice |
| `wasted_capacity_penalty` | 0.0 to -0.3 | Penalty for leaving cache capacity unused |
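Putting the components together, the per-step reward could be computed along these lines. This is a hedged sketch: the ranges come from the table, but the exact coefficients, scaling, and trigger conditions are our assumptions, not the environment's implementation.

```python
def step_reward(hit: bool, incoming_is_viral: bool, bandwidth_saved_mb: float,
                incoming_size_mb: float, evicted_frequency, is_rethrash: bool,
                fill_ratio: float) -> float:
    reward = 0.0
    if hit:
        # cache_hit_bonus: viral hits are worth more (+1.5 vs +1.0)
        reward += 1.5 if incoming_is_viral else 1.0
    # bandwidth_saved: scaled into [0.0, +0.2]
    reward += 0.2 * min(1.0, bandwidth_saved_mb / max(incoming_size_mb, 1e-9))
    # eviction_penalty: up to -0.5, growing with the victim's popularity
    if evicted_frequency is not None:
        reward -= min(0.5, 0.05 * evicted_frequency)
    # thrash_penalty: flat -0.5 for evicting the same file twice
    if is_rethrash:
        reward -= 0.5
    # wasted_capacity_penalty: up to -0.3 for an underfilled cache
    reward -= 0.3 * max(0.0, 0.5 - fill_ratio)
    return reward
```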
---
## Tasks
### Task 1: Steady Traffic Cache (Easy)
- **Cache**: 100MB | **Files**: 30 | **Steps**: 100
- No viral files; steady demand only
- Agent learns basic LRU-style eviction
- **Target hit rate**: ≥ 0.60 → score 1.0
- **Baseline score**: ~0.75
### Task 2: Mixed Traffic Cache (Medium)
- **Cache**: 80MB | **Files**: 50 | **Steps**: 150
- 20% viral files mixed with steady demand
- Agent must handle spikes and prioritize popular content
- **Score**: 70% hit rate + 30% bandwidth
- **Baseline score**: ~0.60
### Task 3: Constrained Cache with Viral Bursts (Hard)
- **Cache**: 50MB | **Files**: 80 | **Steps**: 200
- 35% viral files, tight capacity, large file sizes
- Agent must predict spikes, avoid thrashing
- **Score**: 50% hit rate + 25% bandwidth + 25% reward quality
- **Baseline score**: ~0.45
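The per-task score weightings above can be sketched as a single function. The blend weights follow the task descriptions, but the metric normalization and the easy-task mapping are illustrative assumptions.

```python
def task_score(task: str, hit_rate: float,
               bandwidth_eff: float = 0.0, reward_quality: float = 0.0) -> float:
    """Blend per-task metrics into a 0.0-1.0 score (illustrative sketch)."""
    if task == "easy":
        # Hit rate >= 0.60 maps to the full score of 1.0
        return min(1.0, hit_rate / 0.60)
    if task == "medium":
        # 70% hit rate + 30% bandwidth
        return 0.7 * hit_rate + 0.3 * bandwidth_eff
    if task == "hard":
        # 50% hit rate + 25% bandwidth + 25% reward quality
        return 0.5 * hit_rate + 0.25 * bandwidth_eff + 0.25 * reward_quality
    raise ValueError(f"unknown task: {task}")
```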
## Code Repository
Full source: https://github.com/umar-sharif821/cdn-cache-env
## Files Included
- **env/cache.py** - DriftCDNEnv environment implementation
- **server/app.py** - OpenEnv FastAPI server
- **training/train.py** - Fine-tuning script
- **training_results_finetuned.png** - Training results chart
- **baseline_drift.png** - Baseline comparison chart
---
## Setup & Usage
### Local Setup
```bash
git clone https://github.com/umar-sharif821/cdn-cache-env
cd cdn-cache-env
pip install -r requirements.txt
```
### Run API Server
```bash
uvicorn api.main:app --host 0.0.0.0 --port 7860
```
### Run Inference (Baseline Agent)
```bash
export API_BASE_URL="https://api.openai.com/v1"
export MODEL_NAME="gpt-4o-mini"
export HF_TOKEN="your_token_here"
python inference.py
```
### Docker
```bash
docker build -t cdn-cache-env .
docker run -p 7860:7860 \
-e API_BASE_URL="https://api.openai.com/v1" \
-e MODEL_NAME="gpt-4o-mini" \
-e HF_TOKEN="your_token" \
cdn-cache-env
```
---
## API Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/health` | Health check (returns 200) |
| GET | `/tasks` | List all tasks |
| POST | `/reset` | Start episode `{"task_id": "task_easy", "seed": 42}` |
| POST | `/step` | Take action `{"evict_file_id": "file_001" or null}` |
| GET | `/state` | Full environment state |
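A minimal client against the endpoints above might look like this. It uses `requests`, assumes the server is running locally on port 7860, and the `run_episode` helper and the `reward`/`done` response fields are assumptions on our part; only the endpoint paths and request payloads come from the table.

```python
import requests

BASE = "http://localhost:7860"

def run_episode(task_id: str = "task_easy", seed: int = 42,
                max_steps: int = 100) -> float:
    """Run one episode with a trivial no-eviction policy; return total reward."""
    requests.post(f"{BASE}/reset", json={"task_id": task_id, "seed": seed})
    total_reward = 0.0
    for _ in range(max_steps):
        # Trivial policy: never evict, rely on free capacity only.
        result = requests.post(f"{BASE}/step",
                               json={"evict_file_id": None}).json()
        total_reward += result.get("reward", 0.0)
        if result.get("done"):
            break
    return total_reward
```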
---
## Baseline Scores
Using the built-in `smart_policy` (non-LLM baseline):
| Task | Hit Rate | Score |
|------|----------|-------|
| Easy | ~0.72 | ~1.00 |
| Medium | ~0.61 | ~0.82 |
| Hard | ~0.48 | ~0.78 |
| **Overall** | | **~0.87** |
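The `smart_policy` source isn't reproduced in this README; a heuristic in the same spirit (never evict on a hit or when the file fits, otherwise evict the least-recently-used non-viral file) could look like the following sketch. All logic here is our assumption, not the actual baseline.

```python
def smart_policy(obs: dict) -> dict:
    """Sketch of an LRU-with-viral-protection eviction heuristic."""
    if obs["cache_hit"]:
        return {"evict_file_id": None}      # nothing to do on a hit
    free = obs["cache_capacity_mb"] - obs["cache_used_mb"]
    if free >= obs["incoming_file_size_mb"]:
        return {"evict_file_id": None}      # incoming file fits without eviction
    # Prefer evicting non-viral files; among those, the least recently used.
    candidates = [f for f in obs["cached_files"] if not f["is_viral"]]
    if not candidates:
        candidates = obs["cached_files"]
    victim = min(candidates, key=lambda f: f["last_accessed"])
    return {"evict_file_id": victim["file_id"]}
```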
---
## Log Format
`inference.py` emits structured JSON logs:
```
{"type": "START", "task_id": "task_easy", ...}
{"type": "STEP", "step": 0, "action": {...}, "reward": 1.0, ...}
{"type": "END", "total_reward": 87.3, "final_hit_rate": 0.72, "score": 1.0}
```
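Since each line is a standalone JSON object (JSON Lines), the logs can be aggregated with a few lines of Python. A sketch, with field names taken from the examples above:

```python
import json

def summarize(log_lines):
    """Collect (total_reward, final_hit_rate) from END records in a JSONL log."""
    summaries = []
    for line in log_lines:
        record = json.loads(line)
        if record.get("type") == "END":
            summaries.append((record["total_reward"], record["final_hit_rate"]))
    return summaries
```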