Mem-T: Densifying Rewards for Long-Horizon Memory Agents
Paper: arXiv:2601.23014
Mem-T-4B is the set of model weights obtained by training Qwen3-4B-Instruct with MoT-GRPO within the Mem-T framework.
For detailed instructions on how to use the model within the Mem-T framework, please refer to the main Mem-T GitHub repository.
If you find this work useful, please consider citing our paper.
@misc{yue2026memtdensifyingrewardslonghorizon,
  title={Mem-T: Densifying Rewards for Long-Horizon Memory Agents},
  author={Yanwei Yue and Guibin Zhang and Boci Peng and Xuanbo Fan and Jiaxin Guo and Qiankun Li and Yan Zhang},
  year={2026},
  eprint={2601.23014},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2601.23014},
}