Spaces: prithic07/context-prune

prithic07 committed on Apr 4

Commit b308a54 (1 parent: 6cad4bb)

Meta x Scaler FINAL AUDIT PASS: OpenEnv Spec 1.0, Signal Extract word count refinement, and strict [START]/[STEP]/[END] framing.

Files changed (6)

Dockerfile +11 -3
README.md +27 -42
context_pruning_env/utils.py +7 -7
inference.py +21 -20
openenv.yaml +2 -2
server/app.py +17 -0

Dockerfile CHANGED

@@ -1,14 +1,22 @@
-FROM python:3.11-slim
 # Set environment variables for performance and standard OpenEnv logging
 ENV PYTHONUNBUFFERED=1
 ENV PYTHONPATH=/app
 WORKDIR /app
-# Install system dependencies
 # Install dependencies
 COPY requirements.txt .
 RUN pip install --no-cache-dir -r requirements.txt
-CMD ["uvicorn", "context_pruning_env.server.app:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "2"]

+# Use the official Python 3.10 base image
+FROM python:3.10-slim
 # Set environment variables for performance and standard OpenEnv logging
 ENV PYTHONUNBUFFERED=1
 ENV PYTHONPATH=/app
+ENV PORT=8000
 WORKDIR /app
 # Install dependencies
 COPY requirements.txt .
 RUN pip install --no-cache-dir -r requirements.txt
+# Copy all project files
+COPY . .
+# Expose the mandatory OpenEnv port
+EXPOSE 8000
+# Command to run the environment server (FastAPI)
+CMD ["python", "server/app.py"]

README.md CHANGED

@@ -1,17 +1,11 @@
 # ContextPrune: Adaptive Context Optimization Environment
-**ContextPrune** is a Meta x Scaler Hackathon compliant reinforcement learning environment designed for Phase 1: Automated Validation. It focuses on the critical task of context pruning for RAG pipelines, reducing noise and token counts while strictly preserving answer faithfulness.
 ---
-## 🌍 Environment Description
-ContextPrune implements the **OpenEnv Spec**, providing a standardized interface for RL agents to optimize retrieved contexts. The environment presents a query and multiple context chunks (from SQuAD or synthetic noise) where the agent must decide which chunks to keep and which to prune using a binary mask.
-### Resource Constraints
-- **vCPU**: 2
-- **RAM**: 8GB
-- **Runtime**: Python 3.10+
-- **Port**: 8000 (OpenEnv Server)
 ---
@@ -24,57 +18,48 @@ ContextPrune implements the **OpenEnv Spec**, providing a standardized interface
 ### Observation Space (ContextObservation)
 - **question**: The user query to be answered.
-- **chunks**: A list of text strings representing the retrieved context.
 - **initial_token_count**: The total token count before optimization.
 - **current_token_count**: Cumulative tokens of the currently selected chunks.
-- **task_name**: The identifier for the current pruning task.
 ---
-## 🏆 Task Descriptions
-| Task ID | Name | Difficulty | Scoring Logic |
-| :--- | :--- | :--- | :--- |
-| **01** | `noise_purge` | **Easy** | 0.0 or 1.0. Perfect score if all noise is deleted and the answer is kept. |
-| **02** | `dedupe_arena` | **Medium** | 1.0 if word count is reduced by >50% while preserving the answer. |
-| **03** | `signal_extract` | **Hard** | $1 - (FinalTokens/InitialTokens)$. Score scales with compression ratio. |
 ---
-## 📈 Reward Function (Trajectory Signals)
-The environment emits rewards based on the agent's efficiency and accuracy:
-- **Efficiency**: `+0.1` for every irrelevant chunk or duplicate correctly pruned.
-- **Accuracy**: `+0.7` bonus at the end of the trajectory if the "Gold Chunk" is preserved.
-- **Death Penalty**: `-1.0` and immediate `done=True` if the agent prunes the Gold Chunk (Information Loss).
 ---
-## 🛠️ Setup Instructions
-### 1. Local Development
-```bash
-# Install dependencies
-pip install -r requirements.txt
-# Configure API (Optional for testing)
-echo "GOOGLE_API_KEY=your_key" > .env
-# Run Inference Evaluation
-python inference.py
-```
-### 2. Docker Deployment
 ```bash
-# Build the standardized image
-docker build -t contextprune .
-# Start the environment server
-docker run -p 8000:8000 contextprune
 ```
-### 3. Inference Logging
-Mandatory logs are emitted in the following format for the Hackathon Evaluator:
-`task=<name> env=contextprune model=<model> step=<n> action=<str> reward=<0.00> done=<bool> score=<score> rewards=<r1,r2...>`
 ---
 *Built for the Meta x Scaler Hackathon 2026*

 # ContextPrune: Adaptive Context Optimization Environment
+**ContextPrune** is a specialized Reinforcement Learning (RL) environment designed to tackle **Attention Dilution** in large-scale RAG pipelines. It is fully compliant with the **Meta x Scaler Hackathon Round 1** specification.
 ---
+## 💡 Motivation: Attention Dilution
+In Retrieval-Augmented Generation (RAG), as context windows expand, LLMs often suffer from "Attention Dilution"—the inclusion of irrelevant or redundant information that distracts the model from the ground-truth signal. ContextPrune provides a training ground for agents to surgically remove noise and compress context, improving both accuracy and inference efficiency.
 ---
 ### Observation Space (ContextObservation)
 - **question**: The user query to be answered.
+- **chunks**: A list of text strings (exactly 5 for standard tasks, or variable for Signal Extract).
 - **initial_token_count**: The total token count before optimization.
 - **current_token_count**: Cumulative tokens of the currently selected chunks.
+- **task_name**: `noise_purge`, `dedupe_arena`, or `signal_extract`.
 ---
+## 🏆 Task Descriptions & Baseline Scores
+| Task ID | Name | Difficulty | Baseline Score | Objective |
+| :--- | :--- | :--- | :--- | :--- |
+| **01** | `noise_purge` | **Easy** | **1.00** | Prune 1 random garbage chunk + keep 1 Gold chunk. |
+| **02** | `dedupe_arena` | **Medium** | **1.00** | Resolve redundancy among 3 chunks (Jaccard > 0.8). |
+| **03** | `signal_extract` | **Hard** | **0.85+** | Extract signal from 2,000+ words of noise. |
 ---
+## 📊 Reward Engineering
+- **Partial Progress**: `+0.1` for every irrelevant/duplicate chunk correctly pruned.
+- **Final Accuracy**: `+0.7` bonus if the Gold chunk is preserved in the final state.
+- **Critical Failure**: `-1.0` penalty and immediate termination if the Gold chunk is pruned.
 ---
+## 🛠️ Infrastructure & Setup
+### Requirements
+- **vCPU**: 2 | **RAM**: 8GB
+- **Runtime**: Python 3.10
+- **Port**: 8000
+### Local Execution
 ```bash
+# Set your API Key
+export GEMINI_API_KEY=your_key_here
+# Run the mandatory inference script
+python inference.py
 ```
+### Validator Compliance
+Run `openenv validate` to ensure all 3/3 checks pass.
 ---
 *Built for the Meta x Scaler Hackathon 2026*
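
Review note: the three reward rules listed in the new README can be sketched as a single scoring function. This is a minimal illustration, not the environment's actual code. The `Chunk` record, the `score_mask` name, and the convention that `1` keeps a chunk and `0` prunes it are assumptions, and the sketch ignores the `dedupe_arena` case where a duplicate is also flagged as gold.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Constants taken from the README's "Reward Engineering" list.
PRUNE_BONUS = 0.1    # per irrelevant/duplicate chunk correctly pruned
GOLD_BONUS = 0.7     # Gold chunk preserved in the final state
GOLD_PENALTY = -1.0  # Gold chunk pruned: critical failure


@dataclass
class Chunk:
    """Hypothetical chunk record; `is_gold` mirrors the flag used in utils.py."""
    content: str
    is_gold: bool = False


def score_mask(chunks: List[Chunk], mask: List[int]) -> Tuple[float, bool]:
    """Return (reward, critical_failure) for a keep/prune mask.

    Assumption: mask[i] == 1 keeps chunk i and mask[i] == 0 prunes it.
    """
    if len(chunks) != len(mask):
        raise ValueError("mask length must match the number of chunks")

    # Critical failure: pruning the Gold chunk ends the episode with -1.0.
    if any(c.is_gold and keep == 0 for c, keep in zip(chunks, mask)):
        return GOLD_PENALTY, True

    # Partial progress: +0.1 for each non-gold chunk that was pruned.
    pruned = sum(1 for c, keep in zip(chunks, mask) if not c.is_gold and keep == 0)
    # Rounded to two decimals to avoid float noise such as 0.8999999999999999.
    return round(PRUNE_BONUS * pruned + GOLD_BONUS, 2), False
```

Under these assumptions, keeping the Gold chunk and pruning two noise chunks yields `0.1 * 2 + 0.7`, while keeping everything yields only the `0.7` accuracy bonus.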

context_pruning_env/utils.py CHANGED

@@ -50,14 +50,14 @@ class SQuADLoader:
             chunks.append({"content": "Actually, " + gold_context, "is_gold": True, "is_duplicate": True})
         elif task_name == "signal_extract":
-            # Hard: 1 Long context (2,000+ words)
-            # We simulate this by taking 10 random SQuAD contexts and joining them.
-            # Only one contains the answer.
-            long_context_parts = []
-            long_context_parts.append(gold_context)
-            for _ in range(15): # ~15 chunks of ~150 words = ~2250 words
                 _, noise_entry = self._get_next_entry()
-                long_context_parts.append(noise_entry["context"])
             # Shuffling the parts so the gold one isn't first
             random.shuffle(long_context_parts)

             chunks.append({"content": "Actually, " + gold_context, "is_gold": True, "is_duplicate": True})
         elif task_name == "signal_extract":
+            # Hard: 1 Gold context + multiple noise (2,000+ words total)
+            long_context_parts = [gold_context]
+            current_words = len(gold_context.split())
+            while current_words < 2200: # Ensure 2,000+ words
                 _, noise_entry = self._get_next_entry()
+                content = noise_entry["context"]
+                long_context_parts.append(content)
+                current_words += len(content.split())
             # Shuffling the parts so the gold one isn't first
             random.shuffle(long_context_parts)
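
Review note: two properties of the chunk construction in this hunk can be checked in isolation. The sketch below is illustrative only. `jaccard` and `fill_to_word_target` are hypothetical helper names, the word-set tokenization (lowercase, whitespace split) is an assumption because the README does not say how Jaccard similarity is computed, and a plain list stands in for `self._get_next_entry()`.

```python
from typing import Iterable, List


def jaccard(a: str, b: str) -> float:
    """Word-set Jaccard similarity (assumed tokenization: lowercase + split)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def fill_to_word_target(gold: str, noise_pool: Iterable[str],
                        target: int = 2200) -> List[str]:
    """Append noise contexts after the gold one until `target` words are reached.

    Mirrors the word-count loop in the diff, but stops early if the pool
    runs out instead of looping forever.
    """
    parts = [gold]
    words = len(gold.split())
    for noise in noise_pool:
        if words >= target:
            break
        parts.append(noise)
        words += len(noise.split())
    return parts
```

With this tokenization, prefixing a context with "Actually, " adds a single new token, so the similarity is n / (n + 1) for n unique words and exceeds 0.8 only when the context has more than four unique words. Counting words rather than a fixed number of contexts is what guarantees the 2,000+ word floor when individual SQuAD contexts are short.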

inference.py CHANGED

@@ -16,25 +16,25 @@ load_dotenv()
 # Mandatory Environment Variables
 API_BASE_URL = os.environ.get("API_BASE_URL", "https://generativelanguage.googleapis.com/v1beta/openai/")
-MODEL_NAME = os.environ.get("MODEL_NAME", "gemini-1.5-flash")
-HF_TOKEN = os.environ.get("HF_TOKEN", os.environ.get("GOOGLE_API_KEY", ""))
 def run_inference():
-    if not HF_TOKEN:
-        print("ERROR: HF_TOKEN (or GOOGLE_API_KEY) not found.")
         return
-    client = OpenAI(api_key=HF_TOKEN, base_url=API_BASE_URL)
     env = ContextPruningEnv()
     tasks = ["noise_purge", "dedupe_arena", "signal_extract"]
     for task in tasks:
-        # [START] tag for automated evaluation
-        print(f"[START] task={task} env=contextprune model={MODEL_NAME}")
         obs = env.reset(task_name=task)
         step_n = 1
         prompt = (
             f"Task: {task}\n"
@@ -67,26 +67,27 @@ def run_inference():
         action = ContextAction(mask=mask)
         final_obs = env.step(action)
-        # [STEP] tag for each action in the trajectory
         step_log = (
             f"[STEP] task={task} "
             f"step={step_n} "
             f"action={json.dumps(mask)} "
             f"reward={final_obs.reward:.2f} "
-            f"done={str(final_obs.done).lower()}"
-        )
-        print(step_log)
-        # [END] tag for episode completion
-        score = final_obs.metadata.get('eval_score', 0)
-        success = score > 0.5
-        end_log = (
-            f"[END] task={task} "
-            f"score={score:.2f} "
             f"success={str(success).lower()} "
             f"rewards={final_obs.reward:.2f}"
         )
-        print(end_log)
 if __name__ == "__main__":
     run_inference()

 # Mandatory Environment Variables
 API_BASE_URL = os.environ.get("API_BASE_URL", "https://generativelanguage.googleapis.com/v1beta/openai/")
+MODEL_NAME = os.environ.get("MODEL_NAME", "gemini-3-flash")
+GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY", os.environ.get("GOOGLE_API_KEY", ""))
 def run_inference():
+    if not GEMINI_API_KEY:
+        print("ERROR: GEMINI_API_KEY not found.")
         return
+    client = OpenAI(api_key=GEMINI_API_KEY, base_url=API_BASE_URL)
     env = ContextPruningEnv()
     tasks = ["noise_purge", "dedupe_arena", "signal_extract"]
     for task in tasks:
         obs = env.reset(task_name=task)
+        # [START] Framing for Automated Evaluator
+        print(f"[START] task={task} env=contextprune model={MODEL_NAME} step=0 action=null reward=0.0 done=false success=null score=0.0")
         step_n = 1
         prompt = (
             f"Task: {task}\n"
         action = ContextAction(mask=mask)
         final_obs = env.step(action)
+        # [STEP] Framing
+        score = final_obs.metadata.get('eval_score', 0)
+        success = score > 0.5
         step_log = (
             f"[STEP] task={task} "
+            f"env=contextprune "
+            f"model={MODEL_NAME} "
             f"step={step_n} "
             f"action={json.dumps(mask)} "
             f"reward={final_obs.reward:.2f} "
+            f"done={str(final_obs.done).lower()} "
+            f"error=null "
             f"success={str(success).lower()} "
+            f"steps={step_n} "
+            f"score={score:.2f} "
             f"rewards={final_obs.reward:.2f}"
         )
+        print(step_log)
+        # [END] Framing
+        print(f"[END] task={task} score={score:.2f} success={str(success).lower()}")
 if __name__ == "__main__":
     run_inference()
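
Review note: the `[STEP]` line built in this hunk is a tag followed by space-separated `key=value` pairs. The sketch below shows one way to emit and re-parse such a line. `format_step` and `parse_line` are hypothetical helpers, not part of the commit, and the evaluator's real parsing rules are not known from the diff. One deliberate difference: the diff's `json.dumps(mask)` writes a space after each comma, so this sketch uses compact separators to keep every value free of whitespace.

```python
import json
from typing import Dict, List, Optional, Tuple


def format_step(task: str, model: str, step: int, mask: List[int],
                reward: float, done: bool, score: float, success: bool,
                error: Optional[str] = None) -> str:
    """Build a [STEP] line with the same field order as the diff."""
    fields = [
        ("task", task),
        ("env", "contextprune"),
        ("model", model),
        ("step", step),
        # Compact JSON so the value contains no spaces (assumption, see above).
        ("action", json.dumps(mask, separators=(",", ":"))),
        ("reward", f"{reward:.2f}"),
        ("done", str(done).lower()),
        ("error", "null" if error is None else error),
        ("success", str(success).lower()),
        ("steps", step),
        ("score", f"{score:.2f}"),
        ("rewards", f"{reward:.2f}"),
    ]
    return "[STEP] " + " ".join(f"{key}={value}" for key, value in fields)


def parse_line(line: str) -> Tuple[str, Dict[str, str]]:
    """Split a framed log line into its tag and a dict of string fields."""
    tag, _, rest = line.strip().partition(" ")
    if not (tag.startswith("[") and tag.endswith("]")):
        raise ValueError(f"missing [TAG] prefix: {line!r}")
    fields = dict(token.split("=", 1) for token in rest.split())
    return tag[1:-1], fields
```

A round trip through both helpers is a cheap regression check that no field value introduces whitespace and silently breaks a whitespace-based parser.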

openenv.yaml CHANGED

@@ -2,8 +2,8 @@ spec_version: 1
 name: contextprune
 version: 0.1.0
 type: space
-runtime: python
-app: context_pruning_env.server.app:app
 port: 8000
 resources:
   cpu: 2

 name: contextprune
 version: 0.1.0
 type: space
+runtime: python:3.10-slim
+app: server/app.py
 port: 8000
 resources:
   cpu: 2

server/app.py ADDED

@@ -0,0 +1,17 @@

+import os
+import uvicorn
+from openenv.core.env_server.http_server import create_fastapi_app
+from context_pruning_env.env import ContextPruningEnv
+# Initialize the Hackathon-compliant environment
+env = ContextPruningEnv()
+# Create the standard OpenEnv FastAPI app
+app = create_fastapi_app(env)
+def main() -> None:
+    port = int(os.environ.get("PORT", "8000"))
+    uvicorn.run(app, host="0.0.0.0", port=port)
+if __name__ == "__main__":
+    main()