Agent-STAR-RL-1.5B

This repository hosts Agent-STAR-RL-1.5B, a model released with the paper "Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe".

Agent-STAR is a systematic study of the reinforcement learning (RL) design space for long-horizon tool-using agents, conducted on the TravelPlanner testbed. The model is trained with the STAR pipeline: Data Synthesis → SFT → RL.

Model Details

According to the paper's findings, smaller models such as this 1.5B variant benefit from scale-aware recipes, including staged (curriculum-style) rewards and enhanced exploration, to handle the complex constraints of multi-turn environments.
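
To make the idea of a staged (curriculum-style) reward concrete, here is a minimal Python sketch. It is not the reward used in the paper: the three stages, the `EpisodeOutcome` fields, and the 0.5 weighting are all hypothetical choices made for illustration. The only point it carries is that early stages give dense, forgiving credit and later stages tighten toward a sparse success signal.

```python
from dataclasses import dataclass


@dataclass
class EpisodeOutcome:
    """Hypothetical summary of one rollout; field names are illustrative only."""
    plan_is_well_formed: bool  # the final answer parses as a plan
    constraints_passed: int    # number of constraints the plan satisfies
    constraints_total: int     # number of constraints that were checked


def staged_reward(outcome: EpisodeOutcome, stage: int) -> float:
    """Return a reward in [0, 1] whose strictness grows with `stage`.

    Assumed schedule (not taken from the paper):
      stage 0: reward any well-formed plan
      stage 1: add partial credit for the fraction of constraints passed
      stage 2+: sparse reward, 1.0 only if every constraint passes
    """
    if not outcome.plan_is_well_formed:
        return 0.0
    if stage == 0:
        return 1.0
    all_passed = outcome.constraints_passed == outcome.constraints_total
    if stage == 1:
        if outcome.constraints_total == 0:
            pass_rate = 1.0
        else:
            pass_rate = outcome.constraints_passed / outcome.constraints_total
        return 0.5 + 0.5 * pass_rate
    return 1.0 if all_passed else 0.0


if __name__ == "__main__":
    rollout = EpisodeOutcome(plan_is_well_formed=True,
                             constraints_passed=3,
                             constraints_total=4)
    for stage in range(3):
        print(stage, staged_reward(rollout, stage))
```

Under this assumed schedule, the same partially correct rollout is rewarded early in training and stops being rewarded once the final stage demands that every constraint pass.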

Usage

To run ReAct inference with the official implementation, use a command of the following form:

cd Inference
python3 -u main.py \
  --model xxwu/Agent-STAR-RL-1.5B \
  --save_suffix your_suffix \
  --max_workers 20 \
  --split validation \
  --max_context 32768 \
  --max_turns 60 

Note: You will need to prepare the travel database as described in the GitHub repository.
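
If you want to launch the command above from a script (for example, to vary `--split` or `--max_turns`), a small Python helper can assemble the argument list. This is a sketch, not part of the official implementation: the function name `build_inference_command` is hypothetical, and it uses only the flags and values shown in the command above.

```python
import shlex


def build_inference_command(model: str,
                            save_suffix: str,
                            split: str = "validation",
                            max_workers: int = 20,
                            max_context: int = 32768,
                            max_turns: int = 60) -> list[str]:
    """Assemble the argv list for main.py using the flags shown above.

    Hypothetical helper: it only builds the command and does not run it.
    """
    return [
        "python3", "-u", "main.py",
        "--model", model,
        "--save_suffix", save_suffix,
        "--max_workers", str(max_workers),
        "--split", split,
        "--max_context", str(max_context),
        "--max_turns", str(max_turns),
    ]


if __name__ == "__main__":
    cmd = build_inference_command("xxwu/Agent-STAR-RL-1.5B", "your_suffix")
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
```

To execute the command, pass the list to `subprocess.run(cmd, cwd="Inference", check=True)`, assuming the repository has been cloned and the travel database is already prepared.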

Citation

If you find Agent-STAR helpful to your work, please cite the following:

@misc{wu2026agentstar,
      title={Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe}, 
      author={Xixi Wu and Qianguo Sun and Ruiyang Zhang and Chao Song and Junlong Wu and Yiyan Qi and Hong Cheng},
      year={2026},
      eprint={2603.21972},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2603.21972}, 
}

Acknowledgements

We thank the authors of TravelPlanner for their benchmark and the rLLM framework contributors for supporting the RL training process.
