---
license: apache-2.0
language:
- en
library_name: transformers
base_model: Qwen/Qwen3-4B
tags:
- mapf
- reinforcement-learning
- grpo
- qwen3
- arxiv:2606.17682
pipeline_tag: text-generation
---

This is the checkpoint of the
[From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning](https://arxiv.org/abs/2606.17682)