# Project Status

This is the canonical repo status file.

It should answer two questions quickly:

1. what the project can do right now
2. what actually changed during the recent benchmark-upgrade thread

## Current Snapshot

As of April 8, 2026:

- the active branch is `main`
- the last runtime-changing benchmark checkpoint before this cleanup pass was `1d9d3ee`
- the latest runtime-changing checkpoint passed `openenv validate`
- the latest full test checkpoint passed `175` tests
- the environment now behaves like a real queue-management benchmark, not a single-ticket classifier
- stale review branches and nonessential planning docs have been removed so the repo stays submission-clean

## What The Project Does Today

The current repo supports:

- full routing on all three tasks, covering every routing field: `issue_type`, `priority`, `assignment_group`, and `resolution_action`
- partial observability that gets harder as the task difficulty rises
- five action types: `submit`, `investigate`, `request_info`, `defer`, and `open_incident`
- queue-level carry-over state such as capacity pressure, incident slots, SLA risk, and deferred tickets
- cluster-aware episodes where one ticket can make later related tickets easier or harder
- deterministic follow-up tickets when earlier handling was weak or incomplete
- a terminal score that blends routing quality with queue-management quality
- a local policy-learning loop that compares and searches over deterministic policies
- a modern landing page at `/web` instead of the original plain HTML table
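
To make the action list and the blended terminal score concrete, here is a minimal sketch. It is not the repo's implementation: `ActionType`, `terminal_score`, `clamp_open_interval`, the `0.5` weight, and the `1e-3` margin are all hypothetical names and values chosen for illustration. Only the five action strings come from the list above, and the open-interval clamp reflects what the April 7 commit messages describe for task scores.

```python
from enum import Enum


class ActionType(str, Enum):
    """The five action types named in the list above."""

    SUBMIT = "submit"
    INVESTIGATE = "investigate"
    REQUEST_INFO = "request_info"
    DEFER = "defer"
    OPEN_INCIDENT = "open_incident"


# Assumed values: neither the weight nor the margin is taken from the repo.
ROUTING_WEIGHT = 0.5
EPSILON = 1e-3


def clamp_open_interval(score: float, eps: float = EPSILON) -> float:
    """Keep a score strictly inside (0, 1) by pulling it off the endpoints."""
    return min(max(score, eps), 1.0 - eps)


def terminal_score(
    routing_quality: float,
    queue_quality: float,
    routing_weight: float = ROUTING_WEIGHT,
) -> float:
    """Blend two quality signals in [0, 1] into one clamped score."""
    blended = (
        routing_weight * routing_quality
        + (1.0 - routing_weight) * queue_quality
    )
    return clamp_open_interval(blended)
```

Under these assumptions, a perfect episode scores just below `1.0` and a failed one just above `0.0`, so a reported score never lands exactly on either endpoint.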

## Validation State

The latest validated runtime state before this cleanup pass included:

- a passing `openenv validate` run
- a passing full test run via `python -m unittest discover -s tests -p "test_*.py" -v`
- a passing Hugging Face Space and Docker-ready packaging setup
- synchronized pushes to both `origin/main` and `space/main`

This cleanup pass is documentation and repo hygiene only. It does not change the environment contract.

## Full Commit Timeline From Git History

The entries below are taken directly from the local `main` history, which matches `origin/main`.

### March 31, 2026

- `10:47 IST` `3752981` `Initial commit`
- `11:20 IST` `eae2b1d` `March 30 - April 1st : sever/`
- `11:27 IST` `9e71ac4` `Merge pull request #2 from suyashkumar102/main`
- `13:29 IST` `61398c0` `April 2nd tasks`
- `20:28 IST` `7564d6c` `Fix dataset loader for UTF-8 BOM on Windows`

### April 1, 2026

- `18:28 IST` `4f3bed5` `fix openenv.yaml: use git URL for openenv-core dep, matches requirements.txt`
- `20:11 IST` `969eaef` `Merge pull request #3 from suyashkumar102/main`
- `20:50 IST` `3b8bf40` `Improve dataset realism and consolidate project status log`
- `20:59 IST` `1b9e464` `Update docs after first runtime validation pass`

### April 2, 2026

- `22:16 IST` `5b9f288` `fix: expand inference docstring and add git to Dockerfile`
- `22:18 IST` `5de9815` `add analysis folder`
- `22:39 IST` `9e384ef` `Merge pull request #4 from suyashkumar102/main`
- `23:37 IST` `6753cde` `Finish Roopal April 5-6 docs and repo audit`
- `23:40 IST` `c35bcc6` `Merge remote-tracking branch 'origin/main' into codex/apr5-apr6-roopal`

### April 3, 2026

- `00:50 IST` `c16104f` `Add GitHub Actions Docker smoke test`
- `00:55 IST` `54d32f8` `Merge pull request #5 from Roopalgn/codex/apr5-apr6-roopal`
- `01:19 IST` `7a88607` `Update final submission roadmap`
- `01:27 IST` `706f85f` `Merge branch 'codex/apr5-apr6-roopal'`
- `02:20 IST` `6f27f26` `Update final submission roadmap`
- `02:30 IST` `375aa81` `Update final submission roadmap`
- `11:47 IST` `ae36543` `Add grader and dataset unit tests with scoring contract`
- `12:59 IST` `72d2634` `Consolidate requirements docs and align roadmap with official submission rules`
- `18:19 IST` `6920aae` `Complete Roopal roadmap work for April 4-7`
- `20:36 IST` `795d5f1` `Update final submission roadmap`
- `21:44 IST` `82aca6e` `Make inference.py compliant with submission checklist`

### April 4, 2026

- `10:32 IST` `0fd10c5` `add smoke/integration tests, fix logging, openenvignore, status updates`
- `10:34 IST` `f57e6a7` `fix port 8000->7860 in app.py/openenv.yaml, add pyproject script entry, fix stubs`
- `10:35 IST` `fd636ad` `gitignore build/ and uv.lock`
- `10:41 IST` `ca7bdbd` `remove uv.lock from gitignore`
- `11:45 IST` `32f4c09` `fix inference stdout and README docker port`
- `11:50 IST` `3707fc3` `Merge pull request #6 from suyashkumar102/main`
- `12:12 IST` `5dd60ae` `uv.lock`
- `14:33 IST` `89ca22f` `Clean up internal docs and finalize validation state`

### April 5, 2026

- `20:53 IST` `42dd095` `feat: competitive upgrade for hackathon submission`
- `20:56 IST` `2a0f057` `docs: add deep competitive gap report and gap analysis`
- `22:22 IST` `6c5051f` `fix: resolve full test suite failures from PR review`

### April 6, 2026

- `12:42 IST` `c64d203` `Finalize gap fixes and lightweight competitive upgrades`
- `12:54 IST` `52ab5fa` `Merge branch 'main' into final-submit-gap-fixes`
- `13:34 IST` `186fd65` `Merge pull request #10 from suyashkumar102/final-submit-gap-fixes`
- `14:14 IST` `2216a4d` `Add root Dockerfile for Hugging Face Space`
- `17:09 IST` `8ccf96d` `Ignore action metadata in extra field validation`
- `21:15 IST` `67ce1eb` `Add policy learning loop and strengthen RL-style environment`

### April 7, 2026

- `11:37 IST` `8ada670` `Use evaluator API_KEY for LLM proxy and strengthen env`
- `12:15 IST` `2d5c8e6` `Pin python base image digest for stable Docker builds`
- `13:16 IST` `bfc789d` `Enable proxy LLM mode with API_KEY and real default model`
- `13:29 IST` `e3cd5c5` `Use AWS public ECR mirror for python base image`
- `13:57 IST` `ff634dc` `Run all tasks by default and keep task scores inside open interval`
- `14:09 IST` `e3dfee6` `Clamp grader task scores to open interval`
- `14:51 IST` `c0d489c` `Keep invalid-action task scores inside open interval`
- `15:07 IST` `a5859dc` `Normalize remaining score fields into open interval`
- `15:43 IST` `d6d9493` `Clamp reported task scores to open interval and match sample logs`
- `21:43 IST` `d378e5d` `Strengthen hard-task investigation and grading`

### April 8, 2026

- `03:59 IST` `8241eb5` `Add queue-planning helpdesk routing mechanics`
- `07:03 IST` `043d9e1` `Upgrade helpdesk env with queue dynamics and operational actions`
- `10:06 IST` `454cef3` `Add cluster-aware queue dynamics to helpdesk env`
- `11:45 IST` `1d9d3ee` `Strengthen queue benchmark and refresh landing page`

## Net Result Of The Thread

Compared with the starting point, the repo is now materially stronger in five ways:

- Phase 2 compliance issues were fixed without breaking the evaluator contract
- the benchmark became more agentic through queue mutation, operational actions, and downstream consequences
- the hard task stopped being a near-trivial keyword-routing problem
- the grader and final reward became more aligned with real queue-management quality
- the public presentation improved through cleaner docs and a better landing page

This cleanup and publishing pass also:

- expands `PROJECT_STATUS.md` to cover the full repo history instead of only the late-stage sprint
- rewrites `KNOWLEDGE.md` as a mentor-style guide for a beginner builder
- removes stale planning and internal analysis docs that no longer reflect the shipped benchmark
- leaves `required.md` as the retained requirements checklist

## Remaining Optional Gaps

The project is strong, but a few optional upgrades remain if more time becomes available:

- replace more of the hand-authored queue rules with emergent simulator dynamics
- grow the dataset further with less taxonomy-friendly wording
- move from policy search toward a more clearly trainable learning setup
- gather stronger benchmark comparisons against external LLM baselines

## Repo Hygiene Notes

This cleanup pass keeps the repo focused by:

- retaining `required.md` as the requirements checklist
- keeping `README.md`, `KNOWLEDGE.md`, and `PROJECT_STATUS.md` as the main public guidance
- removing stale planning and gap-analysis files that no longer reflect the current state