---
license: mit
datasets:
- openai/summarize_from_feedback
metrics:
- accuracy
base_model:
- Skywork/Skywork-Reward-V2-Llama-3.1-8B
---

# Meta Reward Modeling (MRM)

## Overview

**Meta Reward Modeling (MRM)** is a personalized reward modeling framework designed to adapt to diverse user preferences with limited feedback.
Instead of learning a single global reward function, MRM treats each user as a separate learning task and applies a meta-learning approach to learn a shared initialization that enables fast, few-shot personalization.

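The few-shot personalization step can be sketched in a few lines of Python. This is only an illustrative MAML-style inner loop under simplifying assumptions (a linear reward over fixed item embeddings and a Bradley-Terry preference loss); `personalize` and all shapes here are hypothetical, not the repository's API:

```python
import numpy as np

def personalize(w0, emb, prefs, lr=0.1, steps=5):
    """Adapt the shared initialization w0 to one user with a few
    gradient steps on that user's preference pairs.

    emb:   (n_items, d) fixed item embeddings
    prefs: (i, j) pairs meaning the user prefers item i over item j
    """
    w = w0.copy()
    for _ in range(steps):
        grad = np.zeros_like(w)
        for i, j in prefs:
            d = emb[i] - emb[j]                 # embedding difference
            p = 1.0 / (1.0 + np.exp(-(w @ d)))  # Bradley-Terry P(i > j)
            grad += (p - 1.0) * d               # gradient of -log p
        w -= lr * grad / max(len(prefs), 1)
    return w

# Two items; the user prefers item 0 in a single feedback pair.
emb = np.array([[1.0, 0.0], [0.0, 1.0]])
w_user = personalize(np.zeros(2), emb, prefs=[(0, 1)])
# After adaptation, the user-specific reward ranks item 0 above item 1.
```

Meta-training would then optimize the initialization `w0` so that a handful of such inner steps suffice for a new user.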
MRM represents user-specific rewards as adaptive combinations over shared base reward functions and optimizes this structure through a bi-level meta-learning framework.
To improve robustness across heterogeneous users, MRM introduces a **Robust Personalization Objective (RPO)** that emphasizes hard-to-learn users during meta-training.

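The exact RPO formulation is not spelled out in this card; as one hedged illustration of "emphasizing hard-to-learn users", the per-user meta-losses can be softmax-reweighted so that users with high post-adaptation loss contribute more to the meta-objective. The function name and the temperature `tau` are assumptions for exposition:

```python
import numpy as np

def robust_user_weights(user_losses, tau=1.0):
    """Upweight hard-to-learn users: softmax over per-user
    post-adaptation losses (tau controls how sharp the emphasis is)."""
    z = np.asarray(user_losses, dtype=float) / tau
    z -= z.max()                  # subtract max for numerical stability
    w = np.exp(z)
    return w / w.sum()

# The user with the largest adaptation loss (0.9) gets the largest
# weight in a meta-objective of the form sum(w * per_user_losses).
weights = robust_user_weights([0.2, 0.9, 0.5])
```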
This repository provides trained checkpoints for reward modeling and user-level preference evaluation.

---

## Links

- 📄 **arXiv Paper**: https://arxiv.org/abs/XXXX.XXXXX
- 🤗 **Hugging Face Paper**: https://huggingface.co/papers/XXXX.XXXXX
- 💻 **GitHub Code**: https://github.com/ModalityDance/MRM
- 📦 **Hugging Face Collection**: https://huggingface.co/collections/ModalityDance/mrm

---

## Evaluation

The model is evaluated using user-level preference accuracy with few-shot personalization.
Inference follows the same adaptation procedure used during training: for each user, the reward weights are initialized from the meta-learned initialization and updated with a small number of gradient steps on user-specific preference data.

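User-level preference accuracy can then be computed as the fraction of held-out pairs that the adapted reward ranks correctly. The sketch below assumes a linear user-specific reward over item embeddings; `preference_accuracy` and the shapes are illustrative, not the repository's API:

```python
import numpy as np

def preference_accuracy(w_user, emb, eval_pairs):
    """Fraction of held-out pairs (i, j), with i the preferred item,
    that the adapted user reward ranks correctly."""
    scores = emb @ w_user                  # reward score for each item
    correct = sum(scores[i] > scores[j] for i, j in eval_pairs)
    return correct / len(eval_pairs)

emb = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
w = np.array([1.0, -1.0])                  # adapted weights for one user
# Item scores are [1, -1, 0]; both held-out pairs are ranked correctly.
acc = preference_accuracy(w, emb, [(0, 1), (2, 1)])  # acc == 1.0
```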
### Example evaluation script

```bash
python inference.py \
  --embed_pt data/emb/reddit/V2.pt \
  --meta_json data/emb/reddit/V2.json \
  --ckpt path/to/checkpoint.pt \
  --dataset REDDIT \
  --seen_train_limit 100 \
  --unseen_train_limit 50 \
  --hidden_layers 2 \
  --inner_lr 5e-3 \
  --eval_inner_epochs 1 \
  --val_ratio 0.9 \
  --score_threshold -1 \
  --seed 42 \
  --device cuda:0
```

---

## Citation

If you use this model or code in your research, please cite:

```bibtex
@article{mrm2025,
  title   = {Meta Reward Modeling for Personalized Alignment},
  author  = {Author Names},
  journal = {arXiv preprint arXiv:XXXX.XXXXX},
  year    = {2025}
}
```

---

## License

This model is released under the **MIT License**.