Ship polished Space UI with Gradio dashboard and evidence-rich demo. 029f9cf md896 committed on 30 days ago
Upload artifacts/runs/20260426-064318-sample-rewards-32eval/sample_rewards_final.json with huggingface_hub 4724001 verified md896 committed on 30 days ago
Harden HF job token wiring and persist full training outputs 9552aaf md896 committed on about 1 month ago
Fix TRL 0.18 compatibility: remove unsupported generation_kwargs; set safety flags on model.generation_config. 6083a40 md896 committed on about 1 month ago
Harden GRPO generation stability on CUDA: bf16 + eager attention + invalid-logit guards. 948530a md896 committed on about 1 month ago
Fix GRPO batch/generation mismatch: auto-adjust num_generations; set launcher default to 2. af54ccd md896 committed on about 1 month ago
Simplify HF training stack: remove unsloth/vllm path, use plain transformers AutoModel + single OpenEnv reward. e5262a1 md896 committed on about 1 month ago
Fix Unsloth startup: avoid pre-importing trl/transformers; mock vllm as real package modules. d21de11 md896 committed on about 1 month ago
Fix HF job startup: import unsloth first and shim vllm package metadata check. 1fdba13 md896 committed on about 1 month ago
Fix HF Job bootstrap: transformers>=4.51 for trl 0.18, datasets<4; simplify to Colab-style OpenEnv SQL reward. ee30276 md896 committed on about 1 month ago
Fix HF Jobs bootstrap (pin transformers/trl, drop torchao stack); add reward and trainer JSONL logging; stabilize launch_job. ceee0e3 md896 committed on about 1 month ago
Fix: Mock vllm and llm_blender to stabilize GRPOTrainer in HF Jobs environment bc20ef9 md896 committed on about 1 month ago
Downgrade TRL to 0.22.2 to natively bypass experimental vllm dependencies 2eb9add md896 committed on about 1 month ago
Fix vllm error cleanly by creating fake python module structure b2ce6c6 md896 committed on about 1 month ago
Add vllm to dependencies to fix TRL's hard import requirement 711ae38 md896 committed on about 1 month ago
Remove vllm mock to fix importlib find_spec crash in TRL 0.23 97cddc4 md896 committed on about 1 month ago
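
Note on af54ccd ("auto-adjust num_generations"): the commit body is not shown here, so the following is only a minimal sketch of what such an adjustment could look like, not the repository's actual code. It assumes the GRPO trainer requires the effective batch size to be evenly divisible by the number of generations per prompt, and that at least 2 generations are needed. The function name `adjust_num_generations` and its parameters are hypothetical.

```python
def adjust_num_generations(global_batch_size: int, requested: int, minimum: int = 2) -> int:
    """Pick the largest generation count <= requested that divides the batch size.

    Hypothetical helper: assumes the trainer rejects configurations where
    global_batch_size % num_generations != 0, and that fewer than `minimum`
    generations per prompt is not allowed.
    """
    if requested < minimum:
        raise ValueError(f"requested={requested} is below the minimum of {minimum}")
    # Walk downward from the requested value until a divisor is found.
    for candidate in range(min(requested, global_batch_size), minimum - 1, -1):
        if global_batch_size % candidate == 0:
            return candidate
    raise ValueError(
        f"no generation count in [{minimum}, {requested}] divides batch size {global_batch_size}"
    )


if __name__ == "__main__":
    # Batch size 6 with 4 requested generations falls back to 3.
    print(adjust_num_generations(global_batch_size=6, requested=4))
```

Searching downward keeps the result as close as possible to the requested value; a batch size with no usable divisor (for example 7 with a request of 4) raises instead of silently training with a mismatched configuration.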