Commit History

Fix torch bfloat16 errors on T4 GPUs by enabling Unsloth dtype auto-detection and explicitly wrapping forward passes in autocast
4b22b06

saravanatanjiro committed
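The bfloat16 fix above can be sketched roughly as follows. Only the dtype auto-detection and the autocast wrapping come from the commit message; the helper names and call shape are illustrative. T4 GPUs (compute capability 7.5) lack native bfloat16 support, so the fallback is float16:

```python
import torch

def autocast_dtype():
    # bfloat16 only where the GPU actually supports it (not on T4s);
    # otherwise fall back to float16.
    if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
        return torch.bfloat16
    return torch.float16

def forward_with_autocast(model, batch):
    # Explicitly wrap the forward pass in autocast with the detected dtype,
    # as the commit describes. Illustrative helper, not the Space's code.
    device_type = "cuda" if torch.cuda.is_available() else "cpu"
    with torch.autocast(device_type=device_type, dtype=autocast_dtype()):
        return model(batch)
```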

Set TRITON_CACHE_DIR to /tmp/triton_cache to avoid root permission denied error
5f168d6

saravanatanjiro committed
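A minimal sketch of the workaround above: `TRITON_CACHE_DIR` is the environment variable Triton consults for its kernel cache, and `/tmp/triton_cache` is the path from the commit message. The helper function is illustrative; on a Space running as a non-root user, the default cache location is not writable, hence the redirect to `/tmp`:

```python
import os

def configure_triton_cache(path="/tmp/triton_cache"):
    # Create a world-writable cache dir and point Triton at it so a
    # non-root container user avoids "permission denied" on kernel builds.
    os.makedirs(path, exist_ok=True)
    os.environ["TRITON_CACHE_DIR"] = path
    return os.environ["TRITON_CACHE_DIR"]
```

This must run before any Triton-compiled kernel is first invoked.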

Add mandatory Unsloth inference state toggles around generation for RL pipeline
81ed883

saravanatanjiro committed

Pin pydantic, fastapi, and starlette to fix Gradio 4.x JSON schema and TemplateResponse bugs
56934e2

saravanatanjiro committed

Pin Gradio to 4.36.1 to fix TypeError during json schema parsing on startup
477c526

saravanatanjiro committed

Fix Gradio runtime error by moving theme to gr.Blocks
a593df9

saravanatanjiro committed

Pin huggingface-hub to 0.24.7 to fix Unsloth _token import error
4dfbc48

saravanatanjiro committed
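Taken together, the two pins above amount to a requirements fragment like the following. The versions are exactly those stated in the commit messages; no other constraints are implied:

```
gradio==4.36.1
huggingface-hub==0.24.7
```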

Capture and display exact Unsloth import exception
fbf8187

saravanatanjiro committed

Switch SDK to docker to use custom Dockerfile and fix pip build
b4a2158

saravanatanjiro committed

Fix Gradio sdk_version to a valid fully-specified version (4.44.0)
07dcf6a

saravanatanjiro committed

Add HuggingFace Space configuration reference to README
10062f6

saravanatanjiro committed

Update with existing environment
d81b76a

saravanatanjiro committed

Fix GRPO group-loss training and align UI defaults
5a0c6af

saravanatanjiro committed

Migrate LLM pipeline to custom GRPO with robust rewards
dfc5996

saravanatanjiro committed

Multi-model benchmark pipeline: VRAM cleanup + EMA graph + detailed output
af6bbef

kavin57447 committed

Fix truncation: 80 tokens, regex safety net, strict prompt
deef82c

kavin57447 committed
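The "regex safety net" above is not shown in the log; one plausible stdlib-only reading is a pattern that trims a hard-capped generation back to its last complete sentence, so the 80-token cut does not leave a dangling fragment. The pattern and helper name are assumptions, not code from the repo:

```python
import re

def trim_to_last_sentence(text):
    # Greedily match up to the last ., !, or ? that is followed by
    # whitespace or end-of-string; if there is no sentence-ending
    # punctuation at all, return the text unchanged.
    match = re.search(r"^.*[.!?](?=\s|$)", text, flags=re.S)
    return match.group(0) if match else text
```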

Hackathon speedrun: max_new_tokens=32, seq_len=512 for 4-8x faster iterations
ee5ddee

kavin57447 committed

Replace flash-attn with PyTorch built-in SDPA (no CUDA compile needed)
e9dea07

kavin57447 committed
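The built-in SDPA referenced above ships with PyTorch 2.0+ and needs no CUDA compilation step, unlike flash-attn. A minimal sketch with illustrative tensor shapes (batch, heads, seq, head_dim):

```python
import torch
import torch.nn.functional as F

# Illustrative shapes only: batch=1, heads=2, seq=4, head_dim=8.
q = torch.randn(1, 2, 4, 8)
k = torch.randn(1, 2, 4, 8)
v = torch.randn(1, 2, 4, 8)

# Fused scaled dot-product attention; output has the query's shape.
out = F.scaled_dot_product_attention(q, k, v)
```

In recent `transformers` versions the same backend is typically selected at model load time via `attn_implementation="sdpa"`, which is presumably how the swap was wired in here.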

Fix: install torch before flash-attn (needs torch at build time)
332efeb

kavin57447 committed

Max GPU utilization: flash-attn2 + grad accumulation + 15 steps/ep + 1024 seq len
93d0171

kavin57447 committed

Cap LLM iterations at 50 to prevent timeout on 8B models
f20bc34

kavin57447 committed

Switch to Llama 3.1 8B + fix low-timestep crash (min 5000)
8d95050

kavin57447 committed

Pin torch/transformers/peft versions to fix cache conflict
27c9425

kavin57447 committed

Fix permission: mkdir after COPY, chmod /app
ee0ba57

kavin57447 committed

Fix matplotlib permission + HF cache dirs
0eef0af

kavin57447 committed

Add LLM RL training with Gemma 7B + LoRA
ee3dfa7

kavin57447 committed

Fix Gradio 6.0 theme deprecation
1c86d42

kavin57447 committed

Add Cloud Arena Mathematical Model RL environment
12263fa

kavin57447 committed