---
title: Cdn Cache Optimizer
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
tags:
- openenv
---

# CDN Cache Optimizer: OpenEnv RL Environment

An RL environment simulating **edge CDN cache management**, the exact problem companies like Meta solve at planetary scale. An agent manages a cache of limited size, deciding which files to evict when new content arrives, balancing **hit rate**, **bandwidth efficiency**, and **thrash avoidance**.

---

## 🎯 Motivation

Content Delivery Networks serve billions of files daily. Edge servers have limited storage, so they must constantly decide: *which cached files to keep, and which to evict?* Standard algorithms like LRU aren't optimal, especially when traffic has **viral bursts** (a file suddenly gets 50x more requests for 20 minutes, then drops back to zero).

A smarter agent can:
- Predict viral spikes from queue previews
- Avoid evicting high-frequency files
- Prevent cache thrashing (evicting a file that is immediately requested again)
- Maximize bandwidth saved for users

---

## 🧠 Environment Description

At each step, a file is requested from the network. If it's already in the cache, that is a **cache hit** (reward). If not, it is a **cache miss**, and the agent must decide whether to evict an existing file to make room.

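
The hit/miss logic described above can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: the `TinyCache` class and its method names are hypothetical, and the rule that a missed file is admitted only when it fits after the optional eviction is an assumption.

```python
from typing import Dict, Optional


class TinyCache:
    """Toy cache keyed by file ID; sizes are in MB. Hypothetical helper."""

    def __init__(self, capacity_mb: float) -> None:
        self.capacity_mb = capacity_mb
        self.files: Dict[str, float] = {}  # file_id -> size_mb

    @property
    def used_mb(self) -> float:
        return sum(self.files.values())

    def step(self, file_id: str, size_mb: float,
             evict_file_id: Optional[str] = None) -> bool:
        """Process one request and return True on a cache hit."""
        if file_id in self.files:
            return True  # hit: nothing to evict or admit
        # Miss: apply the agent's eviction choice, if any.
        if evict_file_id is not None:
            self.files.pop(evict_file_id, None)
        # Assumption: the incoming file is admitted only if it now fits.
        if self.used_mb + size_mb <= self.capacity_mb:
            self.files[file_id] = size_mb
        return False


cache = TinyCache(capacity_mb=10.0)
first = cache.step("a", 6.0)                      # miss, "a" is admitted
second = cache.step("a", 6.0)                     # hit
third = cache.step("b", 6.0)                      # miss, "b" does not fit
fourth = cache.step("b", 6.0, evict_file_id="a")  # miss, "a" evicted, "b" admitted
```

In this toy version, declining to evict on a miss simply leaves the cache unchanged when the incoming file does not fit.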

### Traffic Model
- **Steady files**: Consistent, cyclical demand
- **Viral files**: Bell-curve spike in popularity, then a fade back to baseline

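
One way to picture the two traffic shapes is as per-step demand weights. The sketch below is an assumption-laden illustration rather than the environment's actual generator: the function names, the sinusoidal form of the steady demand, the Gaussian form of the spike, and every default parameter are hypothetical. Only the 50x peak multiplier is taken from the Motivation section.

```python
import math


def steady_weight(step: int, period: int = 24, base: float = 1.0,
                  amplitude: float = 0.5) -> float:
    """Cyclical demand: a sinusoid around a constant base level (assumed shape)."""
    return base + amplitude * math.sin(2 * math.pi * step / period)


def viral_weight(step: int, peak_step: int, width: float = 5.0,
                 base: float = 1.0, boost: float = 50.0) -> float:
    """Bell-curve spike: a Gaussian bump that peaks at peak_step, then fades to base."""
    bump = math.exp(-((step - peak_step) ** 2) / (2 * width ** 2))
    return base * (1.0 + (boost - 1.0) * bump)


# Relative demand for one viral file before, around, and after its peak at step 40.
curve = [viral_weight(s, peak_step=40) for s in (0, 35, 40, 45, 80)]
```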

---

## Action & Observation Space

### Observation Space
| Field | Type | Description |
|-------|------|-------------|
| `step` | int | Current episode step |
| `cache_used_mb` | float | MB currently used |
| `cache_capacity_mb` | float | Total cache size |
| `cache_fill_ratio` | float | Fill level, 0.0 to 1.0 |
| `cached_files` | List[FileEntry] | All files in cache with metadata |
| `incoming_file_id` | str | File being requested |
| `incoming_file_size_mb` | float | Size of incoming file |
| `incoming_file_is_viral` | bool | Is this file currently viral? |
| `cache_hit` | bool | Is incoming file already cached? |
| `recent_hit_rate` | float | Rolling hit rate (last 20 steps) |
| `time_of_day` | float | Daily cycle, normalized to 0.0 to 1.0 |
| `queue_preview` | List[str] | Next 3 file IDs (prefetch hint) |

### FileEntry Fields
| Field | Type | Description |
|-------|------|-------------|
| `file_id` | str | Unique identifier |
| `size_mb` | float | File size in MB |
| `request_frequency` | float | Requests since cached |
| `is_viral` | bool | Currently viral |
| `last_accessed` | int | Step number of last access |

### Action Space
| Field | Type | Description |
|-------|------|-------------|
| `evict_file_id` | str \| null | File to evict (null = no eviction) |

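
The tables above map naturally onto typed records. The sketch below mirrors the documented field names and types as Python dataclasses and adds a small LRU-style policy that returns an action. The `Observation` class name and the `lru_action` function are hypothetical and are not taken from the repository; the policy is an illustration of how the fields can be used, not the built-in baseline.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileEntry:
    file_id: str
    size_mb: float
    request_frequency: float
    is_viral: bool
    last_accessed: int


@dataclass
class Observation:  # hypothetical container name for the observation fields
    step: int
    cache_used_mb: float
    cache_capacity_mb: float
    cache_fill_ratio: float
    cached_files: List[FileEntry]
    incoming_file_id: str
    incoming_file_size_mb: float
    incoming_file_is_viral: bool
    cache_hit: bool
    recent_hit_rate: float
    time_of_day: float
    queue_preview: List[str] = field(default_factory=list)


def lru_action(obs: Observation) -> dict:
    """Evict the least recently accessed file, but only on a miss that does not fit."""
    free_mb = obs.cache_capacity_mb - obs.cache_used_mb
    if obs.cache_hit or obs.incoming_file_size_mb <= free_mb or not obs.cached_files:
        return {"evict_file_id": None}  # serialized as null in JSON
    # Prefer victims that are not about to be requested again (queue_preview hint).
    candidates = [f for f in obs.cached_files if f.file_id not in obs.queue_preview]
    victim = min(candidates or obs.cached_files, key=lambda f: f.last_accessed)
    return {"evict_file_id": victim.file_id}


obs = Observation(
    step=5, cache_used_mb=9.0, cache_capacity_mb=10.0, cache_fill_ratio=0.9,
    cached_files=[FileEntry("file_001", 4.0, 3.0, False, 1),
                  FileEntry("file_002", 5.0, 1.0, False, 4)],
    incoming_file_id="file_003", incoming_file_size_mb=3.0,
    incoming_file_is_viral=False, cache_hit=False, recent_hit_rate=0.5,
    time_of_day=0.2, queue_preview=["file_001", "file_004", "file_005"],
)
action = lru_action(obs)
```

Here `file_001` is the oldest entry but appears in `queue_preview`, so the sketch evicts `file_002` instead. Plain LRU would ignore the preview and evict `file_001`.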

### Reward Function
| Component | Range | Description |
|-----------|-------|-------------|
| `cache_hit_bonus` | +1.0 to +1.5 | Hit reward (viral hits = +1.5) |
| `bandwidth_saved` | +0.0 to +0.2 | Reward for bandwidth efficiency |
| `eviction_penalty` | -0.0 to -0.5 | Penalty for evicting popular files |
| `thrash_penalty` | 0.0 or -0.5 | Penalty for evicting the same file twice |
| `wasted_capacity_penalty` | -0.0 to -0.3 | Penalty for leaving cache capacity unused |

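
The table gives ranges for each component but not the formulas behind them. The sketch below shows one way the components could combine into a per-step reward while staying inside the documented ranges. The `StepOutcome` record, its fields, and the linear scaling of each term are all assumptions made for illustration; only the component names and ranges come from the table. Under these assumptions the per-step reward lies between -1.3 and +1.7.

```python
from dataclasses import dataclass


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


@dataclass
class StepOutcome:
    """Hypothetical per-step summary used only for this sketch."""
    cache_hit: bool
    incoming_is_viral: bool
    bandwidth_saved_ratio: float   # 0.0 to 1.0
    evicted_popularity: float      # 0.0 to 1.0, 0.0 when nothing was evicted
    re_evicted_same_file: bool     # True if this file was already evicted before
    unused_capacity_ratio: float   # 0.0 to 1.0


def step_reward(o: StepOutcome) -> float:
    """Sum the five components, each scaled linearly into its documented range."""
    cache_hit_bonus = (1.5 if o.incoming_is_viral else 1.0) if o.cache_hit else 0.0
    bandwidth_saved = 0.2 * clamp(o.bandwidth_saved_ratio, 0.0, 1.0)
    eviction_penalty = -0.5 * clamp(o.evicted_popularity, 0.0, 1.0)
    thrash_penalty = -0.5 if o.re_evicted_same_file else 0.0
    wasted_capacity_penalty = -0.3 * clamp(o.unused_capacity_ratio, 0.0, 1.0)
    return (cache_hit_bonus + bandwidth_saved + eviction_penalty
            + thrash_penalty + wasted_capacity_penalty)


best = step_reward(StepOutcome(True, True, 1.0, 0.0, False, 0.0))
worst = step_reward(StepOutcome(False, False, 0.0, 1.0, True, 1.0))
```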

---

## Tasks

### Task 1: Steady Traffic Cache (Easy)
- **Cache**: 100 MB | **Files**: 30 | **Steps**: 100
- No viral files, steady demand only
- Agent learns basic LRU-style eviction
- **Target hit rate**: ≥ 0.60 → score 1.0
- **Baseline score**: ~0.75

### Task 2: Mixed Traffic Cache (Medium)
- **Cache**: 80 MB | **Files**: 50 | **Steps**: 150
- 20% viral files mixed with steady demand
- Agent must handle spikes and prioritize popular content
- **Score**: 70% hit rate + 30% bandwidth
- **Baseline score**: ~0.60

### Task 3: Constrained Cache with Viral Bursts (Hard)
- **Cache**: 50 MB | **Files**: 80 | **Steps**: 200
- 35% viral files, tight capacity, large file sizes
- Agent must predict spikes and avoid thrashing
- **Score**: 50% hit rate + 25% bandwidth + 25% reward quality
- **Baseline score**: ~0.45

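
The scoring weights listed for the three tasks can be written out as small functions. This is a sketch under stated assumptions, not the repository's grader: the function names are hypothetical, each input is assumed to be already normalized to the range 0.0 to 1.0 (the normalization itself is not described here), and the linear scaling below the easy task's target is a guess.

```python
def clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))


def score_easy(hit_rate: float, target: float = 0.60) -> float:
    """Full score at or above the target hit rate; linear below it (assumed)."""
    return clamp01(hit_rate / target)


def score_medium(hit_rate: float, bandwidth: float) -> float:
    """70% hit rate + 30% bandwidth."""
    return 0.70 * clamp01(hit_rate) + 0.30 * clamp01(bandwidth)


def score_hard(hit_rate: float, bandwidth: float, reward_quality: float) -> float:
    """50% hit rate + 25% bandwidth + 25% reward quality."""
    return (0.50 * clamp01(hit_rate) + 0.25 * clamp01(bandwidth)
            + 0.25 * clamp01(reward_quality))
```

For example, `score_easy(0.72)` saturates at 1.0 because 0.72 is above the 0.60 target.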

## Code Repository

Full source: https://github.com/umar-sharif821/cdn-cache-env

## Files Included

- **env/cache.py** - DriftCDNEnv environment implementation
- **server/app.py** - OpenEnv FastAPI server
- **training/train.py** - Fine-tuning script
- **training_results_finetuned.png** - Training results chart
- **baseline_drift.png** - Baseline comparison chart

---

## Setup & Usage

### Local Setup
```bash
git clone https://github.com/umar-sharif821/cdn-cache-env
cd cdn-cache-env
pip install -r requirements.txt
```

### Run API Server
```bash
uvicorn api.main:app --host 0.0.0.0 --port 7860
```

### Run Inference (Baseline Agent)
```bash
export API_BASE_URL="https://api.openai.com/v1"
export MODEL_NAME="gpt-4o-mini"
export HF_TOKEN="your_token_here"

python inference.py
```

### Docker
```bash
docker build -t cdn-cache-env .
docker run -p 7860:7860 \
  -e API_BASE_URL="https://api.openai.com/v1" \
  -e MODEL_NAME="gpt-4o-mini" \
  -e HF_TOKEN="your_token" \
  cdn-cache-env
```

---

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/health` | Health check (returns 200) |
| GET | `/tasks` | List all tasks |
| POST | `/reset` | Start episode `{"task_id": "task_easy", "seed": 42}` |
| POST | `/step` | Take action `{"evict_file_id": "file_001"}` or `{"evict_file_id": null}` |
| GET | `/state` | Full environment state |

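
A small standard-library client is enough to exercise these endpoints. The sketch below assumes the server is running locally on port 7860 and that every endpoint returns a JSON body. The helper names (`build_request`, `call`, `run_episode`) are hypothetical, and the `done` flag checked in the loop is an assumption, since the response schema is not documented here. With the server up, `run_episode()` resets the easy task and then steps with "no eviction" until the episode ends or the step limit is reached.

```python
import json
import urllib.request
from typing import Any, Optional

BASE_URL = "http://localhost:7860"  # assumption: server started locally on port 7860


def build_request(method: str, path: str, payload: Optional[dict] = None,
                  base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON request for one of the endpoints in the table above."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(base_url + path, data=data,
                                  headers=headers, method=method)


def call(method: str, path: str, payload: Optional[dict] = None) -> Any:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(method, path, payload)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_episode(max_steps: int = 100) -> None:
    """Reset the easy task, then step without evicting anything."""
    call("POST", "/reset", {"task_id": "task_easy", "seed": 42})
    for _ in range(max_steps):
        result = call("POST", "/step", {"evict_file_id": None})
        # Assumption: the response is a JSON object carrying a "done" flag.
        if isinstance(result, dict) and result.get("done"):
            break
```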

---

## Baseline Scores

Using the built-in `smart_policy` (non-LLM baseline):

| Task | Hit Rate | Score |
|------|----------|-------|
| Easy | ~0.72 | ~1.00 |
| Medium | ~0.61 | ~0.82 |
| Hard | ~0.48 | ~0.78 |
| **Overall** | | **~0.87** |

---

## Log Format

`inference.py` emits structured JSON logs:

```
{"type": "START", "task_id": "task_easy", ...}
{"type": "STEP", "step": 0, "action": {...}, "reward": 1.0, ...}
{"type": "END", "total_reward": 87.3, "final_hit_rate": 0.72, "score": 1.0}
```