Commit History

fix: replace deprecated Pydantic class Config with model_config = ConfigDict()
23ec28c

Siteshcodes committed on

v2.0 frontend: multi-step investigation UI with step tracker, progressive reveal, and reasoning bonus
8483903

Siteshcodes committed on

v2.0: multi-step episodes, procedural bugs, semantic grading, sessions, 71 tests
703aa57

Siteshcodes committed on

feat: serve frontend at root / so it shows in HF Spaces App tab, JSON status moved to /health
ca5a648

Siteshcodes committed on

feat: add interactive demo frontend at /web — no existing endpoints changed
787a5a5

Siteshcodes committed on

docs: comprehensive README with spec compliance checklist, log format, API examples
47bc4be

Siteshcodes committed on

fix: add name/difficulty to tasks, per-task [START]/[END] logs for validator
7eb0325

Siteshcodes committed on

fix: stateful endpoints + score clamping for validator pass
6174aa3

Siteshcodes committed on

fix: all 0.0 defaults replaced with 0.05
b4a6378

Siteshcodes committed on

fix: last remaining 0.0 in reset() total_score
8b89d05

Siteshcodes committed on

fix: remove all 0.0/1.0 references, update reward ranges throughout
e44a740

Siteshcodes committed on

fix: reward_range to (0.05, 0.95) in environment.py TASKS_META
6a0c34c

Siteshcodes committed on

fix: reward_range changed to (0.05, 0.95) to satisfy strict bounds
19cec11

Siteshcodes committed on

fix: replace 0.0 fallback with 0.05 in graders to satisfy strict range
bc79ac5

Siteshcodes committed on

fix: correct [START] format, HF_TOKEN priority, remove /v1 forcing
e8f71a0

Siteshcodes committed on

fix: proper openenv.yaml with spec_version and task graders
f1611d0

Siteshcodes committed on

fix: no exact 0.0 or 1.0 anywhere in rewards
2fbe4d0

Siteshcodes committed on

fix: reward_range 0.05-0.95 and proper descriptions
926a06f

Siteshcodes committed on

fix: simplify openenv.yaml to match passing format
ae363e1

Siteshcodes committed on

fix: correct grader import paths
a1396d9

Siteshcodes committed on

fix: correct grader paths for validator
f4c456d

Siteshcodes committed on

fix: add pydantic dependency for grader import
ea7259f

Siteshcodes committed on

final fix: correct grader import to task module
4c4a2a5

Siteshcodes committed on

fix: correct grader import path
04f5331

Siteshcodes committed on

fix: tasks returns plain array for validator
89bfee5

Siteshcodes committed on

fix: replace asdict with model_dump in client.py step method
2d3fe3f

Siteshcodes committed on

fix: validator anchor lines and reset keyword arg
baa461d

Siteshcodes committed on

fix: use pydantic .dict() instead of dataclasses asdict, remove raise exc
ff292ff

Siteshcodes committed on

fix: inline BugTriageClient to eliminate import conflict
20f2ce3

Siteshcodes committed on

fix: use bug_triage_client to avoid openenv-core client conflict
78e2c60

Siteshcodes committed on

fix: pass task_id as positional arg to reset()
31a803a

Siteshcodes committed on

fix: add openai to requirements - was causing silent import failure
86c4bbe

Siteshcodes committed on

fix: /v1 base_url, verbose logging, raise on exception
cc1335d

Siteshcodes committed on

remove silent fallback in call_model
e48bc80

Siteshcodes committed on

use API_KEY env var from eval proxy
ff6527e

Siteshcodes committed on

use injected API_BASE_URL from eval proxy
ab6f40e

Siteshcodes committed on

fix grader import paths in openenv.yaml
3fbd399

Siteshcodes committed on

update README with final scores and structure
6b95692

Siteshcodes committed on

remove duplicate task.py from root
9f2d535

Siteshcodes committed on

fix: full grader paths
cfd14d7

Siteshcodes committed on