Openenv / cloud_arena

Commit History

Fix torch bfloat16 errors on T4 GPUs by enabling Unsloth dtype auto-detection and explicitly wrapping forward passes in autocast

4b22b06

saravanatanjiro committed on Apr 26

Add mandatory Unsloth inference state toggles around generation for RL pipeline

81ed883

saravanatanjiro committed on Apr 26

Capture and display exact Unsloth import exception

fbf8187

saravanatanjiro committed on Apr 26

Update with existing environment

d81b76a

saravanatanjiro committed on Apr 26

Fix GRPO group-loss training and align UI defaults

5a0c6af

saravanatanjiro committed on Apr 26

Migrate LLM pipeline to custom GRPO with robust rewards

dfc5996

saravanatanjiro committed on Apr 26

Multi-model benchmark pipeline: VRAM cleanup + EMA graph + detailed output

af6bbef

kavin57447 committed on Apr 25

Fix truncation: 80 tokens, regex safety net, strict prompt

deef82c

kavin57447 committed on Apr 25

Hackathon speedrun: max_new_tokens=32, seq_len=512 for 4-8x faster iterations

ee5ddee

kavin57447 committed on Apr 25

Replace flash-attn with PyTorch built-in SDPA (no CUDA compile needed)

e9dea07

kavin57447 committed on Apr 25

Max GPU utilization: flash-attn2 + grad accumulation + 15 steps/ep + 1024 seq len

93d0171

kavin57447 committed on Apr 25

Switch to Llama 3.1 8B + fix low-timestep crash (min 5000)

8d95050

kavin57447 committed on Apr 25

Add LLM RL training with Gemma 7B + LoRA

ee3dfa7

kavin57447 committed on Apr 25

Add Cloud Arena Mathematical Model RL environment

12263fa

kavin57447 committed on Apr 25