---
title: Cdn Cache Optimizer
emoji: 🌐
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
tags:
  - openenv
---

# 🌐 CDN Cache Optimizer – OpenEnv RL Environment

An RL environment simulating **edge CDN cache management**, the same problem companies like Meta solve at planetary scale. An agent manages a fixed-size cache, deciding which files to evict when new content arrives, balancing **hit rate**, **bandwidth efficiency**, and **thrash avoidance**.

---

## 🎯 Motivation

Content Delivery Networks serve billions of files daily. Edge servers have limited storage, so they must constantly decide: *which cached files to keep, and which to evict?* Standard algorithms like LRU aren't optimal, especially when traffic has **viral bursts** (a file suddenly gets 50x more requests for 20 minutes, then drops back to zero).

A smarter agent can:
- Predict viral spikes from queue previews
- Avoid evicting high-frequency files
- Prevent cache thrashing (evicting then immediately re-requesting)
- Maximize bandwidth saved for users

---

## πŸ”§ Environment Description

At each step, a file is requested from the network. If it's already in the cache → **cache hit** (reward). If not → **cache miss**, and the agent must decide whether to evict an existing file to make room.

### Traffic Model
- **Steady files**: Consistent, cyclical demand
- **Viral files**: Bell-curve spike in popularity, then fade back to baseline
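
As a rough sketch of this traffic model, steady demand can be modeled as a cosine cycle with a Gaussian bump layered on top for viral files. The function shape, parameter names, and constants below are illustrative assumptions, not the environment's exact implementation:

```python
import math

def request_weight(t, base=1.0, viral=False, peak_step=100,
                   spike_width=20.0, spike_gain=50.0):
    """Relative request weight for a file at step t (illustrative only).

    Steady files follow a cyclical demand curve; viral files add a
    bell-curve spike centered on peak_step that fades back to baseline.
    """
    # Cyclical "time of day" demand for steady files
    steady = base * (1.0 + 0.5 * math.cos(2 * math.pi * t / 100))
    if not viral:
        return steady
    # Gaussian spike: sharp rise, then decay back toward baseline
    spike = spike_gain * math.exp(-((t - peak_step) ** 2) / (2 * spike_width ** 2))
    return steady + spike
```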

---

## πŸ“ Action & Observation Space

### Observation Space
| Field | Type | Description |
|-------|------|-------------|
| `step` | int | Current episode step |
| `cache_used_mb` | float | MB currently used |
| `cache_capacity_mb` | float | Total cache size |
| `cache_fill_ratio` | float | 0.0–1.0 fill level |
| `cached_files` | List[FileEntry] | All files in cache with metadata |
| `incoming_file_id` | str | File being requested |
| `incoming_file_size_mb` | float | Size of incoming file |
| `incoming_file_is_viral` | bool | Is this file currently viral? |
| `cache_hit` | bool | Is incoming file already cached? |
| `recent_hit_rate` | float | Rolling hit rate (last 20 steps) |
| `time_of_day` | float | Normalized 0.0–1.0 daily cycle |
| `queue_preview` | List[str] | Next 3 file IDs (prefetch hint) |

### FileEntry Fields
| Field | Type | Description |
|-------|------|-------------|
| `file_id` | str | Unique identifier |
| `size_mb` | float | File size in MB |
| `request_frequency` | float | Requests since cached |
| `is_viral` | bool | Currently viral |
| `last_accessed` | int | Step number of last access |

### Action Space
| Field | Type | Description |
|-------|------|-------------|
| `evict_file_id` | str \| null | File to evict (null = no eviction) |

### Reward Function
| Component | Range | Description |
|-----------|-------|-------------|
| `cache_hit_bonus` | +1.0 to +1.5 | Hit reward (viral hits = +1.5) |
| `bandwidth_saved` | +0.0 to +0.2 | Reward for bandwidth efficiency |
| `eviction_penalty` | -0.0 to -0.5 | Penalty for evicting popular files |
| `thrash_penalty` | 0.0 or -0.5 | Penalty for evicting same file twice |
| `wasted_capacity_penalty` | -0.0 to -0.3 | Penalty for leaving cache empty |
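
The tables above suggest a simple eviction heuristic: never touch viral files, and drop the least-requested, least-recently-used file otherwise. The sketch below is a hypothetical policy in that spirit; the dict field names mirror the observation and FileEntry tables, but this is not the environment's built-in `smart_policy`:

```python
def choose_eviction(obs):
    """Return a file_id to evict on a cache miss, or None.

    Heuristic: evict nothing on a hit or when the incoming file fits;
    otherwise prefer non-viral files with the lowest request_frequency,
    breaking ties by oldest last_accessed (avoids the eviction and
    thrash penalties on popular content).
    """
    if obs["cache_hit"]:
        return None  # nothing to do on a hit
    free = obs["cache_capacity_mb"] - obs["cache_used_mb"]
    if free >= obs["incoming_file_size_mb"]:
        return None  # incoming file fits without eviction
    candidates = [f for f in obs["cached_files"] if not f["is_viral"]]
    if not candidates:
        candidates = obs["cached_files"]  # forced choice: everything is viral
    victim = min(candidates,
                 key=lambda f: (f["request_frequency"], f["last_accessed"]))
    return victim["file_id"]
```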

---

## πŸ“‹ Tasks

### Task 1: Steady Traffic Cache (Easy)
- **Cache**: 100MB | **Files**: 30 | **Steps**: 100
- No viral files β€” steady demand only
- Agent learns basic LRU-style eviction
- **Target hit rate**: ≥ 0.60 → score 1.0
- **Baseline score**: ~0.75

### Task 2: Mixed Traffic Cache (Medium)
- **Cache**: 80MB | **Files**: 50 | **Steps**: 150  
- 20% viral files mixed with steady demand
- Agent must handle spikes and prioritize popular content
- **Score**: 70% hit rate + 30% bandwidth
- **Baseline score**: ~0.60

### Task 3: Constrained Cache with Viral Bursts (Hard)
- **Cache**: 50MB | **Files**: 80 | **Steps**: 200
- 35% viral files, tight capacity, large file sizes
- Agent must predict spikes, avoid thrashing
- **Score**: 50% hit rate + 25% bandwidth + 25% reward quality
- **Baseline score**: ~0.45
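
The per-task scores above are weighted blends of the listed components. A minimal sketch of that blending, assuming each component has already been normalized to [0, 1] (the normalization itself is an assumption here, not taken from the environment's code):

```python
def blended_score(hit_rate, bandwidth_ratio, reward_quality=0.0,
                  weights=(0.70, 0.30, 0.0)):
    """Weighted blend of task score components, each assumed in [0, 1].

    Default weights match Task 2 (70% hit rate + 30% bandwidth);
    pass (0.50, 0.25, 0.25) for Task 3's three-way blend.
    """
    w_hit, w_bw, w_rq = weights
    return w_hit * hit_rate + w_bw * bandwidth_ratio + w_rq * reward_quality
```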

## Code Repository

Full source: https://github.com/umar-sharif821/cdn-cache-env

## Files Included

- **env/cache.py** - DriftCDNEnv environment implementation
- **server/app.py** - OpenEnv FastAPI server
- **training/train.py** - Fine-tuning script
- **training_results_finetuned.png** - Training results chart
- **baseline_drift.png** - Baseline comparison chart

---

## πŸš€ Setup & Usage

### Local Setup
```bash
git clone <repo>
cd cdn-cache-env
pip install -r requirements.txt
```

### Run API Server
```bash
uvicorn api.main:app --host 0.0.0.0 --port 7860
```

### Run Inference (Baseline Agent)
```bash
export API_BASE_URL="https://api.openai.com/v1"
export MODEL_NAME="gpt-4o-mini"
export HF_TOKEN="your_token_here"

python inference.py
```

### Docker
```bash
docker build -t cdn-cache-env .
docker run -p 7860:7860 \
  -e API_BASE_URL="https://api.openai.com/v1" \
  -e MODEL_NAME="gpt-4o-mini" \
  -e HF_TOKEN="your_token" \
  cdn-cache-env
```

---

## 🌐 API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/health` | Health check (returns 200) |
| GET | `/tasks` | List all tasks |
| POST | `/reset` | Start episode `{"task_id": "task_easy", "seed": 42}` |
| POST | `/step` | Take action `{"evict_file_id": "file_001" or null}` |
| GET | `/state` | Full environment state |
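
A minimal stdlib-only client for these endpoints might look like the sketch below. The local URL and the `done` field in the step response are assumptions; adjust both for your deployment:

```python
import json
from urllib import request as urlrequest

API = "http://localhost:7860"  # assumed local server address

def build_request(method, path, payload=None):
    """Build a JSON request for one of the endpoints above."""
    data = json.dumps(payload).encode() if payload is not None else None
    return urlrequest.Request(API + path, data=data, method=method,
                              headers={"Content-Type": "application/json"})

def call(method, path, payload=None):
    """Send the request and decode the JSON response."""
    with urlrequest.urlopen(build_request(method, path, payload)) as resp:
        return json.loads(resp.read())

# Typical episode loop (requires a running server):
# obs = call("POST", "/reset", {"task_id": "task_easy", "seed": 42})
# while not obs.get("done", False):  # "done" field is an assumption
#     obs = call("POST", "/step", {"evict_file_id": None})  # never-evict baseline
```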

---

## πŸ“Š Baseline Scores

Using the built-in `smart_policy` (non-LLM baseline):

| Task | Hit Rate | Score |
|------|----------|-------|
| Easy | ~0.72 | ~1.00 |
| Medium | ~0.61 | ~0.82 |
| Hard | ~0.48 | ~0.78 |
| **Overall** | | **~0.87** |

---

## πŸ“ Log Format

`inference.py` emits structured JSON logs:

```
{"type": "START", "task_id": "task_easy", ...}
{"type": "STEP",  "step": 0, "action": {...}, "reward": 1.0, ...}
{"type": "END",   "total_reward": 87.3, "final_hit_rate": 0.72, "score": 1.0}
```