|
|
--- |
|
|
license: apache-2.0 |
|
|
base_model: |
|
|
- Qwen/Qwen3-4B-Instruct |
|
|
--- |
|
|
|
|
|
## Model Description |
|
|
|
|
|
**Mem-T-4B** refers to the model parameters derived from training Qwen3-4B-Instruct using **MoT-GRPO** within the **Mem-T** framework. |
|
|
|
|
|
|
|
|
|
|
|
## Usage |
|
|
|
|
|
For detailed instructions on how to use within the **Mem-T** framework, please refer to the main [Mem-T GitHub repository](https://github.com/yanweiyue/Mem-T). |
|
|
|
|
|
|
|
|
## Links |
|
|
|
|
|
* **GitHub:** [https://github.com/yanweiyue/Mem-T](https://github.com/yanweiyue/Mem-T) |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you find this work useful, please consider citing our paper. |
|
|
|
|
|
``` |
|
|
@misc{yue2026memtdensifyingrewardslonghorizon, |
|
|
title={Mem-T: Densifying Rewards for Long-Horizon Memory Agents}, |
|
|
author={Yanwei Yue and Guibin Zhang and Boci Peng and Xuanbo Fan and Jiaxin Guo and Qiankun Li and Yan Zhang}, |
|
|
year={2026}, |
|
|
eprint={2601.23014}, |
|
|
archivePrefix={arXiv}, |
|
|
primaryClass={cs.LG}, |
|
|
url={https://arxiv.org/abs/2601.23014}, |
|
|
} |
|
|
``` |
|
|
|