Commit History

blog: reword env reward and HOME wording; clarify worker roles (no simulator/synthetic phrasing).
6025ea6

hiitsesh committed on

Update blog.md (rubric crosswalk); add outputs/agent_project_knowledge.json for reviewers.
0bae8be

hiitsesh committed on

README: link blog.md as Hub long-form writeup (replace TODO).
f6ee013

hiitsesh committed on

Add blog.md (ReleaseOps build log and technical context).
be2f36c

hiitsesh committed on

README + API: add YouTube pitch link (Shorts)
d1f4854

hiitsesh committed on

Add demo flow HTML for /ui/flow
7c0b714

hiitsesh committed on

Add GRPO walkthrough notebook to Hub; link from README and / API
24a3140

hiitsesh committed on

README: GitHub raw image URLs (Hub blocks plain PNG git); gitignore images/*.png
60c96ed

hiitsesh committed on

Set HF_HOME and HF_DATASETS_CACHE for writable caches on Space
2954c1d

hiitsesh committed on

Set TORCHINDUCTOR cache dirs before torch import (HF Space no passwd uids)
20b13c9

hiitsesh committed on

Work around torch mega-cache double registration on Space imports
6f0a93d

hiitsesh committed on

fix: durable outputs under /data, unbuffered training logs, relaxed deps
1348e4a

hiitsesh committed on

fix: Update eval API to support subfolder and fix default model ID
3bfb797

hiitsesh committed on

feat: Expose eval API endpoint
0f3f490

hiitsesh committed on

Enhance TorchGenerator to support optional subfolder for model loading
8c33b70

hiitsesh committed on

Post-training push to Hugging Face Hub: --hub-model-repo and pilot query params
ebcefc4

hiitsesh committed on

write_test: explain repo vs container; add download_url and list_files
6787f1b

hiitsesh committed on

Add /outputs/write_test to create simpllll.csv and report hf_token env presence
ff9e20d

hiitsesh committed on

Document HF token via Space secrets, ENV.example, and operational curl for pilot and push_to_hub
1737549

hiitsesh committed on

Add 8-bit Adam + gradient checkpointing for bf16 training to fit 1.7B on L4
a5ea00e

hiitsesh committed on

Add --bf16 flag and expose it via /train/pilot for bigger models
6addc33

hiitsesh committed on

Allow model_name override on /train/pilot and raise num_generations cap
b05b4e3

hiitsesh committed on

Force Qwen3 chat template to disable thinking mode via tokenizer patch
95fada9

hiitsesh committed on

Fix reward_func reading wrong env layer for tool_calls counters
156f912

hiitsesh committed on

Nuclear reward shaping: silent rollouts get flat -3, active rollouts floor at +1
31334fa

hiitsesh committed on

Crush-signal reward shaping to force exploration over silence
101b660

hiitsesh committed on

Fix reward collapse: floor exploration loss, bonus per-call and per-terminal
7078c21

hiitsesh committed on

Live training visualization + aggressive reward shaping to prevent 'do nothing' collapse
4dde8b9

hiitsesh committed on

Add compact training summary endpoint
cbf9895

hiitsesh committed on

Use Qwen3 model for TRL OpenEnv tool parsing
cb4128a

hiitsesh committed on

Add jmespath for TRL tool parsing
6bcc57d

hiitsesh committed on

Adapt GRPO config to installed TRL API
9aacd47

hiitsesh committed on

Add browser-accessible GRPO training endpoints
89e87ef

hiitsesh committed on

Add Space root health landing route
2a40222

hiitsesh committed on

Install git for Space dependency builds
a6c5637

hiitsesh committed on

Add Hugging Face Space metadata
53ea410

hiitsesh committed on

Deploy ReleaseOps Arena GPU Space
9aa9d87

hiitsesh committed on

initial commit
c0618ea
verified

hiitsesh committed on