Update model card with paper, code, and metadata
Hi! I'm Niels from the community science team at Hugging Face. I've opened this PR to improve the model card for RubricHub.
Key changes:
- Updated the `pipeline_tag` to `text-generation`.
- Updated the license to `apache-2.0` based on the official GitHub repository.
- Added links to the paper, the GitHub code repository, and the RubricHub dataset.
- Added a summary of the methodology (automated Coarse-to-Fine Rubric Generation, RuFT, and RuRL) introduced in the paper.
- Included the BibTeX citation.
README.md (changed)

Before — the card contained only YAML front matter:

```yaml
---
datasets:
- HuggingFaceFW/finetranslations
- sojuL/RubricHub_v1
language:
- en
- id
metrics:
- accuracy
base_model:
- zai-org/GLM-Image
tags:
- art
---
```
After — updated front matter:

```yaml
---
base_model:
- zai-org/GLM-Image
datasets:
- HuggingFaceFW/finetranslations
- sojuL/RubricHub_v1
language:
- en
- id
license: apache-2.0
metrics:
- accuracy
tags:
- art
- rubric
- reinforcement-learning
pipeline_tag: text-generation
---
```
# RubricHub

This repository contains the model associated with the paper [RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation](https://huggingface.co/papers/2601.08430).

## Introduction

RubricHub is a large-scale (~110k), multi-domain rubric dataset designed to enhance Reinforcement Learning with Verifiable Rewards (RLVR) for open-ended generation. Because open-ended generation often lacks a ground truth, RubricHub provides a structured proxy for verification via an automated **Coarse-to-Fine Rubric Generation** framework.
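To illustrate what a coarse-to-fine rubric might look like in practice, here is a minimal sketch: a coarse criterion expanded into fine-grained, independently checkable items with weights. The schema, example criteria, and weights below are hypothetical illustrations, not the dataset's actual format.

```python
from dataclasses import dataclass, field

@dataclass
class FineCriterion:
    """A fine-grained, independently checkable item with a point weight."""
    description: str
    weight: float

@dataclass
class Rubric:
    """A coarse criterion expanded into fine-grained checks."""
    coarse: str
    fine: list = field(default_factory=list)

    def score(self, passed: list) -> float:
        """Weighted fraction of fine-grained checks the response satisfies."""
        total = sum(c.weight for c in self.fine)
        earned = sum(c.weight for c, ok in zip(self.fine, passed) if ok)
        return earned / total if total else 0.0

# Hypothetical rubric for a medical-advice prompt.
rubric = Rubric(
    coarse="Response gives safe, accurate guidance",
    fine=[
        FineCriterion("Recommends consulting a clinician", 2.0),
        FineCriterion("Avoids giving a definitive diagnosis", 1.0),
        FineCriterion("Addresses the stated symptoms", 1.0),
    ],
)
print(rubric.score([True, True, False]))  # 3.0 / 4.0 = 0.75
```

Scoring against many such fine-grained items is what makes the rubric a usable verification proxy when no single ground-truth answer exists.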
The model in this repository is part of a two-stage post-training pipeline:

1. **RuFT (Rubric-based Rejection Sampling Fine-Tuning)**: uses rubric scores as filters.
2. **RuRL (Rubric-based Reinforcement Learning)**: uses rubric scores as dense rewards.
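The two stages can be sketched as follows, assuming a scalar rubric score in [0, 1]. The scorer, threshold, and samples are illustrative stand-ins, not the paper's actual settings:

```python
from typing import Callable, List, Tuple

def ruft_filter(prompt: str,
                samples: List[str],
                rubric_score: Callable[[str, str], float],
                threshold: float = 0.8) -> List[Tuple[str, str]]:
    """RuFT-style rejection sampling: keep only (prompt, response) pairs
    whose rubric score clears the threshold; survivors become SFT data."""
    return [(prompt, s) for s in samples if rubric_score(prompt, s) >= threshold]

def rurl_reward(prompt: str,
                response: str,
                rubric_score: Callable[[str, str], float]) -> float:
    """RuRL-style dense reward: the rubric score itself serves as the
    RL reward signal for the policy update."""
    return rubric_score(prompt, response)

# Toy scorer: fraction of required keywords present (illustrative only).
def toy_score(prompt: str, response: str) -> float:
    required = ["doctor", "symptom"]
    return sum(w in response for w in required) / len(required)

kept = ruft_filter("q", ["see a doctor about the symptom", "no idea"], toy_score)
print(len(kept))  # 1: only the first sample scores 1.0 >= 0.8
```

The same scorer backs both stages: as a hard filter during fine-tuning and as a dense per-response reward during RL.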
The Qwen3-14B model post-trained with this framework achieves state-of-the-art results on HealthBench, surpassing proprietary models such as GPT-5.

## Resources

- **Paper:** [arXiv:2601.08430](https://arxiv.org/abs/2601.08430)
- **Code:** [GitHub - teqkilla/RubricHub](https://github.com/teqkilla/RubricHub)
- **Dataset:** [RubricHub_v1 on Hugging Face](https://huggingface.co/datasets/sojuL/RubricHub_v1)

## Citation

If you find RubricHub useful for your research, please cite:

```bibtex
@article{li2026rubrichub,
  title={RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation},
  author={Li, Sunzhu and Zhao, Jiale and Wei, Miteto and Ren, Huimin and Zhou, Yang and Yang, Jingwen and Liu, Shunyu and Zhang, Kaike and Chen, Wei},
  journal={arXiv preprint arXiv:2601.08430},
  year={2026}
}
```
|