---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-0.5B
---

## Model Description

This Memory Decoder model is trained on the biomedical domain and can be adapted to enhance any model in the Qwen2 and Qwen2.5 families.

**Paper:** [Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models](https://www.arxiv.org/abs/2508.09874)

**GitHub:** [https://github.com/LUMIA-Group/MemoryDecoder](https://github.com/LUMIA-Group/MemoryDecoder/tree/main)

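Per the paper, Memory Decoder is plug-and-play: at inference time its next-token distribution is interpolated with the base model's over the shared vocabulary. A minimal sketch of that mixing step, assuming you already have next-token logits from both models; the weight `lam` and the toy logits are illustrative, not values from this card:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def interpolate_next_token(base_logits, memdec_logits, lam=0.5):
    """Mix the base model's and Memory Decoder's next-token
    distributions: p = lam * p_mem + (1 - lam) * p_base.
    `lam` is an illustrative interpolation weight."""
    p_base = softmax(base_logits)
    p_mem = softmax(memdec_logits)
    return lam * p_mem + (1 - lam) * p_base

# Toy 4-token vocabulary: the base model prefers token 0,
# the memory decoder strongly prefers token 1.
base = np.array([2.0, 1.0, 0.5, 0.1])
mem = np.array([0.1, 3.0, 0.2, 0.2])
p = interpolate_next_token(base, mem, lam=0.5)
print(p.argmax())
```

In the actual integration the two sets of logits come from a Qwen2/Qwen2.5 base model and this Memory Decoder run on the same prefix; see the GitHub repository above for the authors' implementation.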
## Training & Evaluation Data

**Biomedical Domain Dataset:** [mimic_iii_diagnosis_anonymous](https://huggingface.co/datasets/Medilora/mimic_iii_diagnosis_anonymous)

**Test Split:** [MemoryDecoder-domain-data](https://huggingface.co/datasets/Clover-Hill/MemoryDecoder-domain-data)

## Performance Results

### Qwen2 Family

| Model | Base Model | Base + MemDec |
|-------|------------|---------------|
| Qwen2-0.5B | 18.41 | 3.75 |
| Qwen2-1.5B | 12.42 | 3.68 |
| Qwen2-7B | 8.36 | 3.59 |
| Qwen2-72B | 6.15 | 3.45 |

### Qwen2.5 Family

| Model | Base Model | Base + MemDec |
|-------|------------|---------------|
| Qwen2.5-0.5B | 17.01 | 3.74 |
| Qwen2.5-1.5B | 11.33 | 3.67 |
| Qwen2.5-3B | 9.70 | 3.63 |
| Qwen2.5-7B | 8.19 | 3.57 |
| Qwen2.5-14B | 7.01 | 3.51 |
| Qwen2.5-32B | 6.65 | 3.48 |
| Qwen2.5-72B | 5.90 | 3.44 |

*Perplexity on the biomedical-domain test set; lower is better.*

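The scores above are perplexities, i.e. the exponentiated mean negative log-likelihood per token on the test set. A minimal sketch of how such a score is computed from per-token probabilities (the probabilities below are illustrative, not from this evaluation):

```python
import math

def perplexity(token_probs):
    """exp of the average negative log-probability per token."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# Illustrative per-token probabilities a model assigned to
# the reference tokens of a short sequence.
probs = [0.25, 0.5, 0.125, 0.5]
print(round(perplexity(probs), 3))
```

A model that assigned every reference token probability 0.5 would score a perplexity of exactly 2, which is why lower values indicate a better fit to the domain text.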
## Citation

```bibtex
@article{cao2025memory,
  title={Memory decoder: A pretrained, plug-and-play memory for large language models},
  author={Cao, Jiaqi and Wang, Jiarui and Wei, Rubin and Guo, Qipeng and Chen, Kai and Zhou, Bowen and Lin, Zhouhan},
  journal={arXiv preprint arXiv:2508.09874},
  year={2025}
}
```

## Contact

For questions and support: maximus.cao@outlook.com