Phase 13 Stage 1 v2: fix termination signal - reward penalties for non-termination, reduced max_tokens, stop_strings 5a432bb yashash04 commited on Apr 22
Phase 13 Stage 1: pivot to winner's pattern - TRL 0.29.0 + fast_inference=False + DAPO loss cf4ce7e yashash04 commited on Apr 22
Phase 13 Stage 1: fix Unicode encoding artifacts in notebook comments b0a62f6 yashash04 commited on Apr 22
Phase 13 Stage 1: drop vLLM, use Kaggle native torch 2.10, upgrade trl to 0.18+ (Kaggle dep compatibility) 7247cb4 yashash04 commited on Apr 22
Phase 13 Stage 1: apply all 7 notebook audit fixes (account labels + health check + hub namespace + explicit commenting) dec6785 yashash04 commited on Apr 22
Phase 13 Stage 1 prep: Kaggle runbook + Run 1 config pre-populated (notebook audit flags pending approval) 453c504 yashash04 commited on Apr 22