Trainee2Trainer / README.md
FYYDCC's picture
Update README.md
88e69f9 verified
|
Raw
History Blame Contribute Delete
357 Bytes
metadata
license: apache-2.0
language:
  - en
library_name: transformers
base_model: Qwen/Qwen3-4B
tags:
  - mapf
  - reinforcement-learning
  - grpo
  - qwen3
  - arxiv:2606.17682
pipeline_tag: text-generation

This is the checkpoint of the From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning