IcyFish
Initial release: Qwen3-4B-EnvTuning (stage2, BFCL train set GRPO)
38dd865 verified