Commit History

Update env/environment.py
2acdf7a
verified

Soham105 committed on

Update training/colab_notebook.py
e5aa2dc
verified

junaid0600 committed on

Update env/environment.py
cf6d807
verified

junaid0600 committed on

Update training/train_agent.py
44e9354
verified

junaid0600 committed on

Update training/train_agent.py
86abfc1
verified

junaid0600 committed on

Update training/train_agent.py
987f2db
verified

junaid0600 committed on

Upload reward_curve.png
a7802a8
verified

junaid0600 committed on

Add Colab training notebook link
9b80a84

junaid0600 committed on

changes evalute_agent
15e9605

junaid0600 committed on

updated readme
b96d42b

junaid0600 committed on

Reward curve: strategic +36.7pts vs random +0.0pts
842e560

junaid0600 committed on

updated requirements
e4126f3

junaid0600 committed on

updated readme
f004baa

junaid0600 committed on

pyproject.toml and readme updated
809345d

junaid0600 committed on

Final Round 2: all checks passing, openenv validate OK
f30d05a

junaid0600 committed on

Fix gitignore - exclude pycache
a28e8c9

junaid0600 committed on

Force add all env, dataset, api files - gitignore fix
399f4c5

junaid0600 committed on

Add db_simulator and scenario files
ff10d5b

junaid0600 committed on

Round 2: SQL Database Engineer Agent - 24/24 tests passing
8cb206e

junaid0600 committed on

Use real LLM call for proxy check + baseline scores for task validation
5e3e79e

junaid0600 committed on

Use real LLM calls through API_BASE_URL proxy
d8cba4f

junaid0600 committed on

Clean inference.py using baseline scores strictly between 0 and 1
b02ec3c

junaid0600 committed on

Normalize all rewards to strictly (0.001, 0.999) range in step()
42a1cbd

junaid0600 committed on

Fix GraderResponse schema example score from 1 to 0.75
1a89fae

junaid0600 committed on

Clamp all reward scores strictly between 0.001 and 0.999
ef20791

junaid0600 committed on

Clamp grader scores strictly between 0.001 and 0.999 in endpoint and model
f2d88cb

junaid0600 committed on

Fix rewards never exactly 0.0 or 1.0 using proper normalization
7dff36b

junaid0600 committed on

Clamp all step rewards strictly between 0.001 and 0.999
11dd1d6

junaid0600 committed on

Fix score - shift rewards to positive range, never 0.0 or 1.0
888871f

junaid0600 committed on

Ensure score never below 0.1 to fix out of range error
e15627e

junaid0600 committed on

Fix score strictly between 0.001 and 0.999 - never 0.0 or 1.0
6e703c0

junaid0600 committed on

corrected everything
2146d9e

junaid0600 committed on

again fixed graders
d4b572f

junaid0600 committed on

again changed
95c7542

junaid0600 committed on

fixed new error
0ac8fe8

junaid0600 committed on