File size: 357 Bytes
0568071
eb18a7b
 
 
 
 
 
 
 
 
 
9ce004f
eb18a7b
0568071
eb18a7b
d738220
88e69f9
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
---
license: apache-2.0
language:
- en
library_name: transformers
base_model: Qwen/Qwen3-4B
tags:
- mapf
- reinforcement-learning
- grpo
- qwen3
- arxiv:2606.17682
pipeline_tag: text-generation
---

This is the checkpoint of the
[From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning](https://arxiv.org/abs/2606.17682)