kkknight committed on
Commit
6c60a98
·
verified ·
1 Parent(s): 0125fb6

Update README.md

Files changed (1)
  1. README.md +35 -72
README.md CHANGED
@@ -31,80 +31,57 @@ These checkpoints accompany the paper:
31
  ---
32
 
33
  ## 🗺️ Table of Contents
34
- 1. Model Summary
35
- 2. Intended Uses & Limitations
36
- 3. Quick Start
37
- 4. Training & Evaluation Details
38
- 5. Citation
39
- 6. Acknowledgements
40
 
41
  ---
42
 
43
- ## 1️⃣ Model Summary
44
-
45
- MiniOneRec rewrites every catalogue item into a discrete **SID token**:
46
-
47
- 1. **Text Encoder** (frozen PLM) →
48
- 2. **3-level Residual Quantisation (RQ-VAE / RQ-KMeans)** SID.
49
-
50
- User history SID sequence.
51
- Training pipeline:
52
-
53
- |
54
- Stage
55
- |
56
- Objective
57
- |
58
- Notes
59
- |
60
- |-------|-----------|-------|
61
- | **SFT** | Next-SID prediction + language alignment | inherits world knowledge while grounding in item space |
62
- | **RL (GRPO)** | KL-regularised policy optimisation | constrained beam search over the closed SID set |
63
-
64
- ### Released checkpoints (examples)
65
-
66
- |
67
- Checkpoint
68
- |
69
- Base LLM
70
- |
71
- #
72
- Params
73
- |
74
- Precision
75
- |
76
- Stage
77
- |
78
- |-------------------------------------|---------------------|---------|-----------|-----------|
79
- | `MiniOneRec-SFT-industrial` | Qwen-7B | 7 B | bf16 | SFT |
80
- | `MiniOneRec-RL-industrial` | Qwen-7B | 7 B | bf16 | SFT+RL |
81
-
82
- *(Replace with the exact repo names you upload.)*
83
 
84
  ---
85
 
86
- ## 2️⃣ Intended Uses & Limitations
87
 
88
- ### ✅ Intended
 
 
89
 
90
- * Next-item prediction in e-commerce / content platforms.
91
- * Research on generative recommendation and RL-from-human-feedback variants.
92
 
93
- ### Out-of-Scope
94
 
95
- * Safety-critical deployments without exhaustive evaluation.
96
- * Domains whose item catalogue is not covered by the released SID vocabulary.
97
- * Generation of content that violates the Apache-2.0 license or local regulations.
98
 
99
- ### ⚖️ Ethical Considerations
 
100
 
101
- The model may inherit bias from the training corpus (user behaviour, language model).
102
- Please **audit for fairness, privacy and potential leakage** before production use.
103
 
104
  ---
105
 
 
106
 
107
- ## 3️⃣ Citation
108
 
109
  ```
110
  @misc{MiniOneRec,
@@ -123,17 +100,3 @@ Please **audit for fairness, privacy and potential leakage** before production u
123
  year = {2025}
124
  }
125
  ```
126
- ## 4️⃣ Acknowledgements
127
- This repository reuses or adapts portions of code from the following open-source projects. We gratefully acknowledge their authors and contributors:
128
-
129
- - [ReRe](https://github.com/sober-clever/ReRe)
130
- - [LC-Rec](https://github.com/zhengbw0324/LC-Rec)
131
-
132
-
133
- ## 5️⃣ Institutions <!-- omit in toc -->
134
-
135
- This project is developed by the following institutions:
136
-
137
- - <img src="assets/lds.png" width="28px"> [LDS](https://data-science.ustc.edu.cn/_upload/tpl/15/04/5380/template5380/index.html)
138
- - <img src="assets/alphalab.jpg" width="28px"> [AlphaLab](https://alphalab-ustc.github.io/index.html)
139
- - <img src="assets/next.jpg" width="28px"> [NExT](https://www.nextcenter.org/)
 
31
  ---
32
 
33
  ## 🗺️ Table of Contents
34
+ 1. Key Techniques
35
+ 2. Evaluation
36
+ 3. Acknowledgements
37
+ 4. Institutions
38
+ 5. Citation
 
39
 
40
  ---
41
 
42
+ ## 1️⃣ Key Techniques
43
+ <div align="center">
44
+ <img src="./assets/minionerec_framework.png" width=100% ></img>
45
+ </div>
46
+
47
+ - **SID Construction: MiniOneRec begins by transforming every product into a compact, semantically meaningful token.** It concatenates an item’s title and description, feeds this sentence through a frozen text encoder, and then quantises the resulting embedding with a three-level RQ-VAE.
48
+
49
+ - **SFT: With all items rewritten as SIDs, the model is first trained in a supervised fashion.** It views the chronologically ordered user history as a token sequence and learns, via next-token prediction, to generate the SID of the next product the user is likely to consume. Crucially, this stage is co-trained with a set of language-alignment objectives that map back and forth between natural language and SID space, allowing the recommender to inherit the world knowledge embedded in large language models while grounding that knowledge in discrete item codes.
50
+
51
+ - **Recommendation-Oriented RL: After SFT, MiniOneRec is further polished with a recommendation-oriented RL phase based on GRPO.** Multiple candidate recommendations are generated for each prompt, their rewards are normalised within the group to stabilise gradients, and a KL penalty keeps the updated policy close to its reference. Because the action space is a closed list of item SIDs, the system switches to constrained beam search, which guarantees that every beam is unique and valid, greatly improving sampling efficiency and diversity. The reward signal itself blends a binary correctness term with a rank-aware component that penalises high-probability yet incorrect items more heavily, and can be augmented with collaborative-filtering scores. Together, this pipeline lets MiniOneRec exploit dense linguistic knowledge, yielding a high-performance, lightweight generative recommendation system.
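
Two of the steps described above, residual quantisation into a multi-level SID and group-wise reward normalisation, can be illustrated with a minimal NumPy sketch. This is not the MiniOneRec implementation: the function names `residual_quantise` and `group_normalised_advantages`, the toy codebooks, and the `eps` constant are all assumptions made for illustration. In particular, an RQ-VAE learns its codebooks during training, whereas the sketch takes fixed codebooks as given and only shows the nearest-neighbour lookup on successive residuals.

```python
import numpy as np


def residual_quantise(embedding, codebooks):
    """Map one embedding to a list of code indices, one per level (a toy SID).

    Hypothetical helper: codebooks are assumed to be given, not learned.
    """
    residual = np.asarray(embedding, dtype=float)
    codes = []
    for codebook in codebooks:  # one codebook per quantisation level
        # Pick the code vector closest to the current residual.
        distances = np.linalg.norm(codebook - residual, axis=1)
        index = int(np.argmin(distances))
        codes.append(index)
        # The next level quantises whatever this level failed to capture.
        residual = residual - codebook[index]
    return codes


def group_normalised_advantages(rewards, eps=1e-8):
    """Normalise rewards within one group of candidates for the same prompt.

    Hypothetical helper: subtracts the group mean and divides by the group
    standard deviation; eps is an assumed constant that avoids division by zero.
    """
    rewards = np.asarray(rewards, dtype=float)
    return (rewards - rewards.mean()) / (rewards.std() + eps)


# Toy three-level codebooks over 2-dimensional embeddings (illustrative values).
toy_codebooks = [
    np.array([[0.0, 0.0], [1.0, 1.0]]),
    np.array([[0.1, 0.0], [0.0, 0.2]]),
    np.array([[0.0, 0.0], [0.05, 0.05]]),
]
toy_sid = residual_quantise([1.1, 1.0], toy_codebooks)

# One group of four candidates with binary correctness rewards.
toy_advantages = group_normalised_advantages([1.0, 0.0, 0.0, 0.0])
```

In this toy setting the single correct candidate receives a positive advantage and the incorrect ones a negative advantage, while a group with identical rewards yields all-zero advantages and therefore contributes no gradient signal.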
52
 
53
  ---
54
 
55
+ ## 2️⃣ Evaluation
56
 
57
+ <div align="center">
58
+ <img src="./assets/minionerec_main_result.png" width=100% ></img>
59
+ </div>
60
 
61
+ ---
 
62
 
63
+ ## 3️⃣ Acknowledgements
64
 
65
+ This repository reuses or adapts portions of code from the following open-source projects. We gratefully acknowledge their authors and contributors:
 
 
66
 
67
+ - [ReRe](https://github.com/sober-clever/ReRe)
68
+ - [LC-Rec](https://github.com/zhengbw0324/LC-Rec)
69
 
70
+ ---
71
+
72
+ ## 4️⃣ Institutions <!-- omit in toc -->
73
+
74
+ This project is developed by the following institutions:
75
+
76
+ - <img src="assets/lds.png" width="28px"> [LDS](https://data-science.ustc.edu.cn/_upload/tpl/15/04/5380/template5380/index.html)
77
+ - <img src="assets/alphalab.jpg" width="28px"> [AlphaLab](https://alphalab-ustc.github.io/index.html)
78
+ - <img src="assets/next.jpg" width="28px"> [NExT](https://www.nextcenter.org/)
79
 
80
  ---
81
 
82
+ ## 5️⃣ Citation
83
 
84
+ If you find our code/paper/model helpful, please consider citing our papers 📝 and starring us ⭐️!
85
 
86
  ```
87
  @misc{MiniOneRec,
 
100
  year = {2025}
101
  }
102
  ```