pragunk committed on
Commit 91d9e5e · verified · 1 Parent(s): ec6e336

Update README.md

  1. README.md +13 -11
README.md CHANGED
@@ -66,17 +66,21 @@ The environment features three programmatic workloads (tasks) designed to challe
 
  ## 📊 Baseline Comparisons
 
- To demonstrate the necessity of intelligent eviction policies, this environment provides benchmark scores comparing traditional operating system algorithms against a zero-shot LLM baseline (Llama-3 8B). The table below displays the final **Hit Rate (0.0 to 1.0)**.
 
- | Task (Workload) | Random Eviction | LRU | LFU | LLM Agent (Zero-Shot) |
- | :--- | :--- | :--- | :--- | :--- |
- | **Easy (Zipfian)** | 0.64 | 0.18 | 0.44 | **0.67** |
- | **Medium (Sequential)** | 0.35 | 0.00 | 0.08 | **0.16** |
- | **Hard (Shifting)** | **0.35** | 0.04 | 0.13 | 0.12 |
 
 
 
  **Key Insights for Researchers:**
- * **The Sequential Trap:** As proven by the Medium task, standard LRU algorithms achieve a mathematical **0.00 hit rate** when faced with sequence loops larger than the cache size. The LLM demonstrates foundational reasoning to break this loop, outperforming both LRU and LFU.
- * **The Shifting Challenge:** The Hard task proves that static frequency counters (LFU) and smaller zero-shot LLMs both struggle to adapt to sudden data shifts. This sets a clear, rigorous benchmark for future Reinforcement Learning agents to conquer.
 
 
 
  ---
 
@@ -97,9 +101,7 @@ uv sync
 
  ```bash
  #create .env file in root directory
- LLM_API_KEY="model api key"
- LLM_BASE_URL="model api url"
- LLM_MODEL_NAME="model name"
  ```
 
  ### 2. Running the Inference Agent
 
 
  ## 📊 Baseline Comparisons
 
+ To demonstrate the necessity of intelligent eviction policies, this environment provides benchmark scores comparing traditional operating system algorithms against various iterations of an LLM agent (Llama-3 8B). The table below displays the final **Hit Rate (0.0 to 1.0)**.
 
+ | Task (Workload) | Random | LRU | LFU | LLM (Zero-Shot) | LLM (Memory, No CoT) | LLM (Memory + CoT) |
+ | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
+ | **Easy (Zipfian)** | 0.64 | 0.18 | 0.44 | **0.67** | 0.43 | 0.53 |
+ | **Medium (Sequential)** | **0.35** | 0.00 | 0.08 | 0.16 | 0.06 | 0.29 |
+ | **Hard (Shifting)** | **0.35** | 0.04 | 0.13 | 0.12 | 0.08 | 0.16 |
+
+ *Note: Random Eviction scores highest on the Medium and Hard tasks, but it is non-deterministic: its hit rate varies from run to run, which makes it unsuitable for production systems that need predictable behavior.*
 
  **Key Insights for Researchers:**
+ * **The Sequential Trap (LRU Failure):** As the Medium task shows, standard LRU achieves a **0.00 hit rate** on sequential loops larger than the cache, because each item is evicted just before it is requested again.
+ * **The Danger of Context Overload:** When the LLM was initially given a 15-step memory window without a reasoning space (`Memory, No CoT`), its hit rate *dropped* on all three tasks relative to the zero-shot agent. A likely explanation is that the dense history block distracted the model from the immediate cache state.
+ * **The Power of Chain-of-Thought (CoT):** Requiring the agent to output a JSON `"reasoning"` string before selecting an eviction index gave the model space to analyze its memory window before committing to a decision. This single change nearly quintupled the hit rate on the Medium task (0.06 → 0.29) and doubled it on the Hard task (0.08 → 0.16), which suggests the agent learned to "pin" items to break loops and to flush obsolete data proactively during phase shifts.
+ * **The Parameter Bottleneck:** While the 8B-parameter model shows that the agentic memory architecture can work, the absolute scores indicate that smaller models struggle to approximate an optimal policy such as Belady's MIN. This environment offers a rigorous, ready-made benchmark for Reinforcement Learning models and 70B+ reasoning models.
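
The Sequential Trap can be reproduced outside the environment with a few lines of Python. The sketch below is not the environment's implementation: the function names, the loop of 6 items, and the cache of 4 slots are all hypothetical choices made only to illustrate why LRU never hits on a loop larger than the cache, while random eviction still gets some hits.

```python
import random
from collections import OrderedDict


def lru_hit_rate(requests, capacity):
    """Replay `requests` through an LRU cache and return the hit rate."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for key in requests:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used key
            cache[key] = True
    return hits / len(requests)


def random_hit_rate(requests, capacity, seed=0):
    """Replay `requests` through a cache that evicts a uniformly random key."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    cache = []
    hits = 0
    for key in requests:
        if key in cache:
            hits += 1
        else:
            if len(cache) >= capacity:
                cache.pop(rng.randrange(len(cache)))
            cache.append(key)
    return hits / len(requests)


# Hypothetical workload: a loop over 6 items replayed 50 times, cache of 4 slots.
loop = list(range(6)) * 50
print(f"LRU:    {lru_hit_rate(loop, capacity=4):.2f}")
print(f"Random: {random_hit_rate(loop, capacity=4):.2f}")
```

With this workload the LRU line prints `0.00`, since every key is evicted before the loop returns to it. The random-eviction figure depends on the seed, so treat it as illustrative rather than as a reproduction of the table above.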
 
  ---
 
 
 
  ```bash
  #create .env file in root directory
+ HF_TOKEN="your api key"
 
 
  ```
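
How the project reads this file is not shown in the diff, so the following is only a minimal sketch of one way a script could pick up `HF_TOKEN`. It uses the standard library alone; the helper names `load_env_file` and `get_hf_token` are hypothetical and do not come from the repository.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY="value" lines into a dict, skipping blanks and comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_hf_token(path=".env"):
    """Prefer an exported environment variable, then fall back to the .env file."""
    token = os.environ.get("HF_TOKEN") or load_env_file(path).get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it to .env or export it.")
    return token
```

Checking the process environment first lets an exported variable override the file, which is a common convention but an assumption here.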
 
  ### 2. Running the Inference Agent