Update README.md
README.md
CHANGED
@@ -29,7 +29,7 @@ This repository contains a robust, general-domain generative reward model presen
 
 - **Paper**: [One Token to Fool LLM-as-a-Judge](https://huggingface.co/papers/2507.08794)
 - **Training Data**: [https://huggingface.co/datasets/sarosavo/Master-RM](https://huggingface.co/datasets/sarosavo/Master-RM)
-- **Code/GitHub Repository**: [https://github.com/Yulai-Zhao/Robust-Reward-Model](https://github.com/Yulai-Zhao/Robust-Reward-Model)
+<!-- - **Code/GitHub Repository**: [https://github.com/Yulai-Zhao/Robust-Reward-Model](https://github.com/Yulai-Zhao/Robust-Reward-Model) -->
 - **Training algorithm**: Standard supervised fine-tuning, see Appendix A.2 for more details.
 
 ## Model Description