Gradio: fix invisible header and body text on HF dark embed. 293388d md896 committed on about 1 month ago
Space home: redirect / to HTML /demo; Gradio at /gradio; fix Gradio hero. f4ae3f3 md896 committed on about 1 month ago
Fix Space pip resolve: fastapi>=0.115.2, python-multipart>=0.0.18, pin gradio 5.50.0. d9b6c59 md896 committed on about 1 month ago
Pin Gradio >=5.7.1 for huggingface_hub 1.x (fixes HfFolder ImportError on Space). f7153ad md896 committed on about 1 month ago
Point /demo and Gradio at diagram-end-to-end-workflow.png (asset on Hub via Xet). 4c3e70f md896 committed on about 1 month ago
Restore optimized training diagnostics and reward curve images 35a3454 md896 committed on about 1 month ago
Ship polished Space UI with Gradio dashboard and evidence-rich demo. 029f9cf md896 committed on about 1 month ago
Upload artifacts/runs/20260426-064318-sample-rewards-32eval/sample_rewards_final.json with huggingface_hub 4724001 verified md896 committed on about 1 month ago
Fix TRL 0.18 compatibility: remove unsupported generation_kwargs; set safety flags on model.generation_config. 6083a40 md896 committed on Apr 25
Harden GRPO generation stability on CUDA: bf16 + eager attention + invalid-logit guards. 948530a md896 committed on Apr 25
Fix GRPO batch/generation mismatch: auto-adjust num_generations; set launcher default to 2. af54ccd md896 committed on Apr 25
Simplify HF training stack: remove unsloth/vllm path, use plain transformers AutoModel + single OpenEnv reward. e5262a1 md896 committed on Apr 25
Fix Unsloth startup: avoid pre-importing trl/transformers; mock vllm as real package modules. d21de11 md896 committed on Apr 25
Fix HF job startup: import unsloth first and shim vllm package metadata check. 1fdba13 md896 committed on Apr 25
Fix HF Job bootstrap: transformers>=4.51 for trl 0.18, datasets<4; simplify to colab-style OpenEnv SQL reward. ee30276 md896 committed on Apr 25
Fix HF Jobs bootstrap (pin transformers/trl, drop torchao stack); add reward and trainer JSONL logging; stabilize launch_job. ceee0e3 md896 committed on Apr 25
Fix: Mock vllm and llm_blender to stabilize GRPOTrainer in HF Jobs environment bc20ef9 md896 committed on Apr 25
Downgrade TRL to 0.22.2 to natively bypass experimental vllm dependencies 2eb9add md896 committed on Apr 25