Text Classification · Transformers

nielsr (HF Staff) committed on
Commit cf62c08 · verified · 1 Parent(s): 349204b

Add pipeline tag, library name, and paper link to metadata


Hi! I'm Niels, part of the community science team at Hugging Face.

This PR improves the model card by adding:
- `pipeline_tag: text-classification`: This helps users find the model under the correct task category (Reward Modeling is typically classified as text classification on the Hub).
- `library_name: transformers`: Based on the usage example, the model is compatible with the `transformers` library.
- `arxiv: 2601.18731`: This links the model repository to its official paper on the Hugging Face Hub.
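For context, a minimal sketch of how a reward model tagged `text-classification` is commonly loaded with `transformers`. The checkpoint id below is taken from this repo's `base_model` metadata; the MRM checkpoint itself may differ, and loading it downloads ~16 GB of weights, so treat this as illustrative rather than the repository's verified usage example.

```python
def load_reward_model(name: str = "Skywork/Skywork-Reward-Llama-3.1-8B-v0.2"):
    # Deferred import so the sketch can be read without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(name)
    # Reward models expose a single scalar score head, i.e. num_labels=1.
    model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=1)
    return tokenizer, model
```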

Best,
Niels

Files changed (1):
README.md (+12 -10)
README.md CHANGED

@@ -1,12 +1,14 @@
 ---
-license: mit
-datasets:
-- openai/summarize_from_feedback
 base_model:
 - Skywork/Skywork-Reward-Llama-3.1-8B-v0.2
+datasets:
+- openai/summarize_from_feedback
+license: mit
+pipeline_tag: text-classification
+library_name: transformers
+arxiv: 2601.18731
 ---
 
-
 # Meta Reward Modeling (MRM)
 
 ## Overview
@@ -17,16 +19,16 @@ Instead of learning a single global reward function, MRM treats each user as a s
 MRM represents user-specific rewards as adaptive combinations over shared base reward functions and optimizes this structure through a bi-level meta-learning framework.
 To improve robustness across heterogeneous users, MRM introduces a **Robust Personalization Objective (RPO)** that emphasizes hard-to-learn users during meta-training.
 
-This repository provides trained checkpoints for reward modeling and user-level preference evaluation.
+This repository provides trained checkpoints for reward modeling and user-level preference evaluation as presented in the paper [One Adapts to Any: Meta Reward Modeling for Personalized LLM Alignment](https://huggingface.co/papers/2601.18731).
 
 ---
 
 ## Links
 
-- 📄 **arXiv Paper**: https://arxiv.org/abs/2601.18731
-- 🤗 **Hugging Face Paper**: https://huggingface.co/papers/2601.18731
-- 💻 **GitHub Code**: https://github.com/ModalityDance/MRM
-- 📦 **Hugging Face Collection**: https://huggingface.co/collections/ModalityDance/mrm
+- 📄 **arXiv Paper**: [2601.18731](https://arxiv.org/abs/2601.18731)
+- 🤗 **Hugging Face Paper**: [2601.18731](https://huggingface.co/papers/2601.18731)
+- 💻 **GitHub Code**: [ModalityDance/MRM](https://github.com/ModalityDance/MRM)
+- 📦 **Hugging Face Collection**: [MRM Collection](https://huggingface.co/collections/ModalityDance/mrm)
 
 ---
 
@@ -171,4 +173,4 @@ If you use this model or code in your research, please cite:
 
 ## License
 
-This model is released under the **MIT License**.
+This model is released under the **MIT License**.
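The model-card overview in the diff above describes user-specific rewards as adaptive combinations over shared base reward functions. A minimal sketch of that idea, with hypothetical names and toy base rewards standing in for learned reward heads (this is not the authors' implementation):

```python
from typing import Callable, List

RewardFn = Callable[[str, str], float]

def user_reward(base_rewards: List[RewardFn], weights: List[float]) -> RewardFn:
    """Build a user-specific reward as a weighted sum of shared base rewards."""
    assert len(base_rewards) == len(weights)

    def reward(prompt: str, response: str) -> float:
        return sum(w * r(prompt, response) for w, r in zip(weights, base_rewards))

    return reward

# Toy base reward functions (hypothetical stand-ins for learned reward models).
brevity = lambda p, r: -float(len(r))                                 # prefers shorter responses
coverage = lambda p, r: float(len(set(p.split()) & set(r.split())))   # prompt/response word overlap

# Each user gets their own mixing weights over the shared bases.
user_a = user_reward([brevity, coverage], [0.1, 1.0])
```

In MRM the per-user combination weights would be adapted via the bi-level meta-learning procedure described in the paper; here they are fixed constants purely for illustration.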