Spaces: hiitsesh/New_gpu_space

Commit History

blog: reword env reward and HOME wording; clarify worker roles (no simulator/synthetic phrasing).

6025ea6

hiitsesh committed on Apr 26

Update blog.md (rubric crosswalk); add outputs/agent_project_knowledge.json for reviewers.

0bae8be

hiitsesh committed on Apr 26

README: link blog.md as Hub long-form writeup (replace TODO).

f6ee013

hiitsesh committed on Apr 26

Add blog.md (ReleaseOps build log and technical context).

be2f36c

hiitsesh committed on Apr 26

README + API: add YouTube pitch link (Shorts)

d1f4854

hiitsesh committed on Apr 26

Add demo flow HTML for /ui/flow

7c0b714

hiitsesh committed on Apr 26

Add GRPO walkthrough notebook to Hub; link from README and / API

24a3140

hiitsesh committed on Apr 26

README: GitHub raw image URLs (Hub blocks plain PNG git); gitignore images/*.png

60c96ed

hiitsesh committed on Apr 26

Set HF_HOME and HF_DATASETS_CACHE for writable caches on Space

2954c1d

hiitsesh committed on Apr 26

Set TORCHINDUCTOR cache dirs before torch import (HF Space no passwd uids)

20b13c9

hiitsesh committed on Apr 26

Work around torch mega-cache double registration on Space imports

6f0a93d

hiitsesh committed on Apr 26

fix: durable outputs under /data, unbuffered training logs, relaxed deps

1348e4a

hiitsesh committed on Apr 26

fix: Update eval API to support subfolder and fix default model ID

3bfb797

hiitsesh committed on Apr 26

feat: Expose eval API endpoint

0f3f490

hiitsesh committed on Apr 26

Enhance TorchGenerator to support optional subfolder for model loading

8c33b70

hiitsesh committed on Apr 26

Post-training push to Hugging Face Hub: --hub-model-repo and pilot query params

ebcefc4

hiitsesh committed on Apr 25

write_test: explain repo vs container; add download_url and list_files

6787f1b

hiitsesh committed on Apr 25

Add /outputs/write_test to create simpllll.csv and report hf_token env presence

ff9e20d

hiitsesh committed on Apr 25

Document HF token via Space secrets, ENV.example, and operational curl for pilot and push_to_hub

1737549

hiitsesh committed on Apr 25

Add 8-bit Adam + gradient checkpointing for bf16 training to fit 1.7B on L4

a5ea00e

hiitsesh committed on Apr 25

Add --bf16 flag and expose it via /train/pilot for bigger models

6addc33

hiitsesh committed on Apr 25

Allow model_name override on /train/pilot and raise num_generations cap

b05b4e3

hiitsesh committed on Apr 25

Force Qwen3 chat template to disable thinking mode via tokenizer patch

95fada9

hiitsesh committed on Apr 25

Fix reward_func reading wrong env layer for tool_calls counters

156f912

hiitsesh committed on Apr 25

Nuclear reward shaping: silent rollouts get flat -3, active rollouts floor at +1

31334fa

hiitsesh committed on Apr 25

Crush-signal reward shaping to force exploration over silence

101b660

hiitsesh committed on Apr 25

Fix reward collapse: floor exploration loss, bonus per-call and per-terminal

7078c21

hiitsesh committed on Apr 25

Live training visualization + aggressive reward shaping to prevent 'do nothing' collapse

4dde8b9

hiitsesh committed on Apr 25

Add compact training summary endpoint

cbf9895

hiitsesh committed on Apr 25

Use Qwen3 model for TRL OpenEnv tool parsing

cb4128a

hiitsesh committed on Apr 25

Add jmespath for TRL tool parsing

6bcc57d

hiitsesh committed on Apr 25

Adapt GRPO config to installed TRL API

9aacd47

hiitsesh committed on Apr 25

Add browser-accessible GRPO training endpoints

89e87ef

hiitsesh committed on Apr 25

Add Space root health landing route

2a40222

hiitsesh committed on Apr 25

Install git for Space dependency builds

a6c5637

hiitsesh committed on Apr 25

Add Hugging Face Space metadata

53ea410

hiitsesh committed on Apr 25

Deploy ReleaseOps Arena GPU Space

9aa9d87

hiitsesh committed on Apr 25

initial commit

c0618ea
verified

hiitsesh committed on Apr 25
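
Commit 31334fa above describes its reward rule only in the message: silent rollouts get a flat -3, and active rollouts are floored at +1. A minimal sketch of what such a rule could look like follows. The function name, the constants' names, and the use of a zero tool-call count to mean "silent" are assumptions made for illustration; the repository's actual reward_func is not shown in this log.

```python
# Hypothetical sketch of the shaping rule named in commit 31334fa.
# Nothing here is taken from the repository's code; only the two
# numbers (-3 and +1) come from the commit message.

SILENT_PENALTY = -3.0  # flat reward for a rollout that made no tool calls
ACTIVE_FLOOR = 1.0     # minimum reward for a rollout that made any tool call


def shape_reward(raw_reward: float, tool_calls: int) -> float:
    """Return the shaped reward for one rollout.

    raw_reward: the unshaped environment reward (assumed to be a float).
    tool_calls: how many tool calls the rollout issued (assumed counter).
    """
    if tool_calls == 0:
        # Silent rollout: ignore the raw reward and return the flat penalty.
        return SILENT_PENALTY
    # Active rollout: keep the raw reward but never let it drop below the floor.
    return max(raw_reward, ACTIVE_FLOOR)
```

Under these assumptions every active rollout scores at least 4 points above every silent one, which is consistent with the stated goal in the neighbouring commits of pushing the policy away from a "do nothing" collapse.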
