keerthanas1011 committed on
Commit
913cb3a
·
1 Parent(s): 6f02d8d

comply inference client with submission guidelines

Files changed (2)
  1. README.md +103 -29
  2. inference.py +37 -10
README.md CHANGED
@@ -1,17 +1,3 @@
- ---
- title: API Contract Debugger
- emoji: 🔍
- colorFrom: blue
- colorTo: indigo
- sdk: docker
- app_port: 7860
- tags:
- - openenv
- - rl-environment
- - api-debugging
- - contract-testing
- ---
-
  # API Contract Debugger — OpenEnv Environment

  An OpenEnv environment where AI agents debug broken OpenAPI-style contract
@@ -129,15 +115,31 @@ Final episode score is computed by `grade_episode()` → float in `[0.0, 1.0]`.

  ## Setup & Usage

- ### Run locally

  ```bash
  git clone <your-repo-url>
- cd api_contract_debugger_env
  pip install -r requirements.txt
- uvicorn server.app:app --host 0.0.0.0 --port 7860
  ```

  ### Run with Docker

  ```bash
@@ -145,19 +147,75 @@ docker build -t api-contract-debugger .
  docker run -p 7860:7860 api-contract-debugger
  ```

  ### Run the baseline agent

  ```bash
- export HF_TOKEN=your_token
- export ENV_BASE_URL=http://localhost:7860
  python inference.py
  ```

- ### Run tests

  ```bash
- pip install pytest httpx
- pytest tests/ -v
  ```

  ---
@@ -175,7 +233,7 @@ pytest tests/ -v
  ## Project Structure

  ```
- api_contract_debugger_env/
  ├── server/
  │   ├── __init__.py
  │   ├── app.py  # FastAPI app, route registration
@@ -184,11 +242,27 @@ api_contract_debugger_env/
  │   ├── graders.py  # Violation detection + reward shaping
  │   └── fixtures.py  # Task definitions (broken + golden specs)
  ├── tests/
- │   └── test_env.py  # 56 tests covering all components
- ├── inference.py  # Baseline agent
  ├── openenv.yaml  # OpenEnv metadata
- ├── pyproject.toml  # Package config + server entry point
- ├── requirements.txt
- ├── uv.lock
- └── Dockerfile
  ```

  # API Contract Debugger — OpenEnv Environment

  An OpenEnv environment where AI agents debug broken OpenAPI-style contract
 

  ## Setup & Usage

+ ### Installation

  ```bash
+ # Clone the repository
  git clone <your-repo-url>
+ cd api-contract-debugger
+
+ # Create virtual environment
+ python3 -m venv .venv
+ source .venv/bin/activate
+
+ # Install dependencies
+ pip install --upgrade pip
  pip install -r requirements.txt
  ```

+ ### Run locally
+
+ ```bash
+ # Start the server
+ uvicorn server.app:app --host 0.0.0.0 --port 7860 --reload
+ ```
+
+ The server will be available at `http://localhost:7860`
+
  ### Run with Docker

  ```bash
 
  docker run -p 7860:7860 api-contract-debugger
  ```

+ ### Run tests
+
+ ```bash
+ # Run entire test suite (56 tests)
+ pytest tests/ -v
+
+ # Run with coverage
+ pytest tests/ -v --cov=server
+ ```
+
  ### Run the baseline agent

+ The baseline agent uses an LLM (via OpenAI client) to propose fixes.
+
+ **Required environment variables** (must be set):
+ ```bash
+ export HF_TOKEN="your_huggingface_api_token"  # Get from huggingface.co/settings/tokens
+ export ENV_BASE_URL="http://localhost:7860"  # Environment server URL
+ export TASK_NAME="all"  # "easy", "medium", "hard", or "all"
+ ```
+
+ **Optional environment variables** (have defaults):
+ ```bash
+ export API_BASE_URL="https://router.huggingface.co/v1"  # LLM endpoint
+ export MODEL_NAME="Qwen/Qwen2.5-72B-Instruct"  # Model identifier
+ export LOCAL_IMAGE_NAME="optional_docker_image"  # For docker image initialization
+ ```
+
+ Then run the agent:
  ```bash
  python inference.py
  ```

+ **Example output:**
+ ```
+ [START] task=easy env=api_contract_debugger model=Qwen/Qwen2.5-72B-Instruct
+ [STEP] step=1 action={"kind":"add_field",...} reward=0.70 done=true error=null
+ [END] success=true steps=1 score=1.000 rewards=0.70
+ ```
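
Reviewer note: the `[START]` / `[STEP]` / `[END]` lines in the example output above follow a fixed key=value layout, so they could plausibly be produced by three small formatting helpers. The sketch below is only an illustration under that assumption. The function names `format_start`, `format_step`, and `format_end` are hypothetical and are not taken from `inference.py`, and serializing the action as compact JSON is a guess based on the example line.

```python
import json
from typing import Optional, Sequence


def format_start(task: str, env: str, model: str) -> str:
    """Build the [START] line for one task run."""
    return f"[START] task={task} env={env} model={model}"


def format_step(
    step: int,
    action: dict,
    reward: float,
    done: bool,
    error: Optional[str] = None,
) -> str:
    """Build one [STEP] line; the action is serialized as compact JSON (assumed)."""
    action_str = json.dumps(action, separators=(",", ":"))
    error_str = "null" if error is None else error
    return (
        f"[STEP] step={step} action={action_str} "
        f"reward={reward:.2f} done={str(done).lower()} error={error_str}"
    )


def format_end(success: bool, steps: int, score: float, rewards: Sequence[float]) -> str:
    """Build the [END] line with a comma-separated list of per-step rewards."""
    rewards_str = ",".join(f"{r:.2f}" for r in rewards)
    return (
        f"[END] success={str(success).lower()} steps={steps} "
        f"score={score:.3f} rewards={rewards_str}"
    )
```

Keeping the formatting in pure functions like these would let the log format be unit-tested without a running server or an LLM call.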
+
+ ### Test individual endpoints

  ```bash
+ # Health check
+ curl http://localhost:7860/health
+
+ # List available tasks
+ curl http://localhost:7860/tasks
+
+ # Reset to a task
+ curl -X POST http://localhost:7860/reset \
+   -H "Content-Type: application/json" \
+   -d '{"task_name":"easy"}'
+
+ # Apply an action
+ curl -X POST http://localhost:7860/step \
+   -H "Content-Type: application/json" \
+   -d '{
+     "action": {
+       "kind": "add_field",
+       "endpoint_index": 0,
+       "location": "response_body",
+       "field_name": "created_at",
+       "new_value": {"type": "string", "description": "ISO-8601 timestamp"}
+     }
+   }'
+
+ # Get final score
+ curl http://localhost:7860/score
  ```

  ---
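
Reviewer note: the curl commands above can be mirrored from Python with only the standard library. The sketch below is an illustration, not the project's client. The helper names `build_request` and `call` are hypothetical, the paths and payloads are copied from the curl examples, and the response is returned as parsed JSON without assuming any particular fields, since the diff does not show the response schema.

```python
import json
import urllib.request
from typing import Any, Optional


def build_request(
    base_url: str, path: str, payload: Optional[dict] = None
) -> urllib.request.Request:
    """Build a GET request (no payload) or a JSON POST request (with payload)."""
    url = base_url.rstrip("/") + path
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(
    base_url: str, path: str, payload: Optional[dict] = None, timeout: float = 10.0
) -> Any:
    """Send the request and return the decoded JSON body (needs a running server)."""
    request = build_request(base_url, path, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running locally, `call("http://localhost:7860", "/reset", {"task_name": "easy"})` would correspond to the reset command, and `call("http://localhost:7860", "/score")` to the final score request.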
 
  ## Project Structure

  ```
+ api-contract-debugger/
  ├── server/
  │   ├── __init__.py
  │   ├── app.py  # FastAPI app, route registration
 
  │   ├── graders.py  # Violation detection + reward shaping
  │   └── fixtures.py  # Task definitions (broken + golden specs)
  ├── tests/
+ │   └── test_env.py  # 56 unit tests covering all components
+ ├── inference.py  # Baseline LLM-powered agent
  ├── openenv.yaml  # OpenEnv metadata
+ ├── pyproject.toml  # Package configuration
+ ├── requirements.txt  # Python dependencies
+ ├── Dockerfile  # Container image configuration
+ └── RL_ARCHITECTURE.md  # Complete RL framework documentation
  ```
+
+ ---
+
+ ## Documentation
+
+ ### RL_ARCHITECTURE.md
+ Comprehensive guide to the reinforcement learning implementation:
+ - **Agent** — How external AI systems interact with the environment via HTTP API
+ - **Environment** — Core `APIContractDebuggerEnv` class and episode lifecycle
+ - **State** — Observation space and full internal state representation
+ - **Action** — All 5 action types with validation rules and examples
+ - **Reward & Scoring** — Dense per-step rewards and episode grading formula
+ - **Complete example episode transcript** with JSON payloads
+ - **Python agent pseudocode** for custom implementations
+
+ ---
inference.py CHANGED
@@ -1,15 +1,23 @@
  """
  Baseline Inference Script — API Contract Debugger
  ===================================================
- Runs a GPT model against all three tasks and emits the required
  [START] / [STEP] / [END] log format.

- Environment variables:
-     API_BASE_URL   LLM endpoint (default: https://router.huggingface.co/v1)
-     MODEL_NAME     Model ID (default: Qwen/Qwen2.5-72B-Instruct)
-     HF_TOKEN       API key
-     ENV_BASE_URL   Running env (default: http://localhost:7860)
-     TASK_NAME      One task or "all" (default: all)
  """

  from __future__ import annotations
@@ -26,11 +34,30 @@ from openai import OpenAI
  # Configuration
  # ---------------------------------------------------------------------------

  API_BASE_URL = os.getenv("API_BASE_URL", "https://router.huggingface.co/v1")
  MODEL_NAME = os.getenv("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct")
- API_KEY = os.getenv("HF_TOKEN") or os.getenv("API_KEY", "hf_placeholder")
- ENV_BASE_URL = os.getenv("ENV_BASE_URL", "http://localhost:7860").rstrip("/")
- TASK_NAME = os.getenv("TASK_NAME", "all")

  TEMPERATURE = 0.0
  MAX_TOKENS = 512
 
  """
  Baseline Inference Script — API Contract Debugger
  ===================================================
+ Runs an LLM model against API contract debugging tasks and emits the required
  [START] / [STEP] / [END] log format.

+ MANDATORY ENVIRONMENT VARIABLES:
+     HF_TOKEN or API_KEY   Your API key for LLM access (REQUIRED - no default)
+     ENV_BASE_URL          Base URL of the environment server (REQUIRED - no default)
+     TASK_NAME             Task(s) to run: "easy", "medium", "hard", or "all" (REQUIRED - no default)
+
+ OPTIONAL ENVIRONMENT VARIABLES (with defaults):
+     API_BASE_URL          LLM endpoint (default: https://router.huggingface.co/v1)
+     MODEL_NAME            Model ID (default: Qwen/Qwen2.5-72B-Instruct)
+     LOCAL_IMAGE_NAME      Docker image name (if using from_docker_image())
+
+ Output Format:
+     [START] task=<task_name> env=<benchmark> model=<model_name>
+     [STEP] step=<n> action=<action_str> reward=<0.00> done=<true|false> error=<msg|null>
+     [END] success=<true|false> steps=<n> score=<0.000> rewards=<r1,r2,...,rn>
  """

  from __future__ import annotations
 
  # Configuration
  # ---------------------------------------------------------------------------

+ # REQUIRED: Set defaults ONLY for API_BASE_URL and MODEL_NAME
  API_BASE_URL = os.getenv("API_BASE_URL", "https://router.huggingface.co/v1")
  MODEL_NAME = os.getenv("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct")
+
+ # REQUIRED: HF_TOKEN for API authentication (no default)
+ API_KEY = os.getenv("HF_TOKEN") or os.getenv("API_KEY")
+ if not API_KEY:
+     raise ValueError(
+         "API key must be provided via HF_TOKEN or API_KEY environment variable"
+     )
+
+ # REQUIRED: LOCAL_IMAGE_NAME for docker image initialization (if used)
+ LOCAL_IMAGE_NAME = os.getenv("LOCAL_IMAGE_NAME")
+
+ # REQUIRED: Environment server URL (no default)
+ ENV_BASE_URL = os.getenv("ENV_BASE_URL")
+ if not ENV_BASE_URL:
+     raise ValueError("ENV_BASE_URL environment variable must be set")
+ ENV_BASE_URL = ENV_BASE_URL.rstrip("/")
+
+ # REQUIRED: Task name(s) to run (no default)
+ TASK_NAME = os.getenv("TASK_NAME")
+ if not TASK_NAME:
+     raise ValueError("TASK_NAME environment variable must be set")

  TEMPERATURE = 0.0
  MAX_TOKENS = 512
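
Reviewer note: the three "read, check, raise" blocks in this hunk repeat the same pattern, which could be folded into one helper. The sketch below is a possible refactor rather than what the commit does. The name `require_env` and its `env` parameter (a mapping injected for testing) are hypothetical. It follows the commit's behavior of treating an empty string as unset and raising `ValueError`.

```python
import os
from typing import Mapping, Optional


def require_env(*names: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first non-empty value among `names`, or raise ValueError.

    `env` defaults to os.environ; passing a plain dict makes the helper testable.
    """
    source = os.environ if env is None else env
    for name in names:
        value = source.get(name)
        if value:  # empty strings count as unset, matching `if not VALUE` checks
            return value
    raise ValueError(" or ".join(names) + " environment variable must be set")
```

Under this sketch, the configuration section could read `API_KEY = require_env("HF_TOKEN", "API_KEY")`, `ENV_BASE_URL = require_env("ENV_BASE_URL").rstrip("/")`, and `TASK_NAME = require_env("TASK_NAME")`.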