metadata
license: apache-2.0
language:
  - pyt
base_model:
  - Qwen/Qwen2.5-0.5B

Model Description

This Memory Decoder model is trained on the Finance domain and can be adapted to enhance any model in the Qwen2 and Qwen2.5 families.

Paper: Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models

GitHub: https://github.com/LUMIA-Group/MemoryDecoder
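As a rough illustration of what "plug-and-play" could look like at inference time, the sketch below mixes a base model's next-token distribution with a memory model's distribution. It assumes the two are combined by linear interpolation over a shared vocabulary, which this card does not itself state; the function names `softmax` and `interpolate`, the weight `lam`, and the toy logits are hypothetical and are not the API of the MemoryDecoder repository. See the paper and GitHub repository for the actual method.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpolate(base_logits, mem_logits, lam=0.5):
    """Mix two next-token distributions over the same vocabulary.

    lam is a hypothetical mixing weight: 0.0 uses only the base model,
    1.0 uses only the memory model. Linear interpolation is an assumption
    made for this sketch.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    if len(base_logits) != len(mem_logits):
        raise ValueError("both models must share the same vocabulary size")
    p_base = softmax(base_logits)
    p_mem = softmax(mem_logits)
    return [lam * pm + (1.0 - lam) * pb for pb, pm in zip(p_base, p_mem)]


# Toy 4-token vocabulary; these logits are made up for illustration.
base = [2.0, 1.0, 0.5, 0.1]
mem = [0.1, 0.2, 3.0, 0.3]
mixed = interpolate(base, mem, lam=0.5)
```

Because both inputs are valid distributions and the weights sum to one, the mixed output is also a valid distribution. The shared-vocabulary check reflects why such a memory model would be tied to model families that use the same tokenizer.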

Training & Evaluation Data

Finance Domain Dataset: yahoo_finance_stockmarket_news

Test Split: MemoryDecoder-domain-data

Performance Results

Qwen2 Family

| Model | Base Model | Base + MemDec |
|---|---|---|
| Qwen2-0.5B | 16.00 | 3.84 |
| Qwen2-1.5B | 10.96 | 3.61 |
| Qwen2-7B | 8.31 | 3.38 |
| Qwen2-72B | 6.62 | 3.20 |

Qwen2.5 Family

| Model | Base Model | Base + MemDec |
|---|---|---|
| Qwen2.5-0.5B | 16.04 | 3.87 |
| Qwen2.5-1.5B | 11.20 | 3.61 |
| Qwen2.5-3B | 9.83 | 3.52 |
| Qwen2.5-7B | 8.61 | 3.42 |
| Qwen2.5-14B | 7.60 | 3.31 |
| Qwen2.5-32B | 7.38 | 3.29 |
| Qwen2.5-72B | 6.80 | 3.23 |

Perplexity on the Finance domain test set for each base model alone and with the Memory Decoder attached. Lower is better.
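For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-likelihood per token. The sketch below shows that standard definition on made-up log-probabilities; the function name `perplexity` is hypothetical, and the evaluation script behind the numbers above may differ in details such as tokenization or context windowing.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    Computes exp(-(1/N) * sum(log p_i)) over the N scored tokens.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# A model that assigns probability 0.25 to every token is as uncertain
# as a uniform choice among 4 options, so its perplexity is 4.
uniform_ppl = perplexity([math.log(0.25)] * 4)
```

Under this definition, a lower value means the model assigns higher probability to the test text on average.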

Citation

@article{cao2025memory,
  title={Memory decoder: A pretrained, plug-and-play memory for large language models},
  author={Cao, Jiaqi and Wang, Jiarui and Wei, Rubin and Guo, Qipeng and Chen, Kai and Zhou, Bowen and Lin, Zhouhan},
  journal={arXiv preprint arXiv:2508.09874},
  year={2025}
}

Contact

For questions and support: maximus.cao@outlook.com