nielsr (HF Staff) committed · Commit 1310508 · verified · 1 Parent(s): f1215d4

Update model card with paper, code, and metadata


Hi! I'm Niels from the community science team at Hugging Face. I've opened this PR to improve the model card for RubricHub.

Key changes:
- Updated the `pipeline_tag` to `text-generation`.
- Updated the license to `apache-2.0` based on the official GitHub repository.
- Added links to the paper, the GitHub code repository, and the RubricHub dataset.
- Added a summary of the methodology (automated Coarse-to-Fine Rubric Generation, RuFT, and RuRL) introduced in the paper.
- Included the BibTeX citation.
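
As a quick sanity check on the metadata changes listed above, here is a minimal sketch that reads a card's YAML front matter and asserts the new values. The helper `read_front_matter` and the inlined `CARD` string are hypothetical and written only for illustration; the reader handles just the flat `key: value` and `key:` plus `- item` shapes used in this card, not general YAML.

```python
# CARD mirrors the front matter this PR proposes; it is inlined here so the
# sketch runs without network access or third-party packages.
CARD = """---
base_model:
- zai-org/GLM-Image
datasets:
- HuggingFaceFW/finetranslations
- sojuL/RubricHub_v1
language:
- en
- id
license: apache-2.0
metrics:
- accuracy
tags:
- art
- rubric
- reinforcement-learning
pipeline_tag: text-generation
---

# RubricHub
"""


def read_front_matter(text):
    """Parse the flat YAML block between the first two '---' lines (hypothetical helper)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta, key = {}, None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # end of front matter
        if stripped.startswith("- ") and key is not None:
            # List item belonging to the most recent list-valued key.
            meta[key].append(stripped[2:].strip())
        elif ":" in stripped:
            name, _, value = stripped.partition(":")
            name, value = name.strip(), value.strip()
            if value:
                meta[name] = value  # scalar entry
                key = None
            else:
                meta[name] = []     # list entry; items follow on later lines
                key = name
    return meta


meta = read_front_matter(CARD)
print(meta["license"], meta["pipeline_tag"], meta["tags"])
```

For real cards, a full YAML parser is the safer choice; the hand-rolled reader is only meant to keep the check dependency-free.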

Files changed (1)
  1. README.md +37 -4
README.md CHANGED
@@ -1,15 +1,48 @@
  ---
- license: bsd-2-clause
+ base_model:
+ - zai-org/GLM-Image
  datasets:
  - HuggingFaceFW/finetranslations
  - sojuL/RubricHub_v1
  language:
  - en
  - id
+ license: apache-2.0
  metrics:
  - accuracy
- base_model:
- - zai-org/GLM-Image
  tags:
  - art
- ---
+ - rubric
+ - reinforcement-learning
+ pipeline_tag: text-generation
+ ---
+
+ # RubricHub
+
+ This repository contains the model associated with the paper [RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation](https://huggingface.co/papers/2601.08430).
+
+ ## Introduction
+ RubricHub introduces a large-scale (~110k) and multi-domain rubric dataset designed to enhance Reinforcement Learning with Verifiable Rewards (RLVR) for open-ended generation. Since open-ended generation often lacks ground truth, RubricHub provides a structured proxy for verification using an automated **Coarse-to-Fine Rubric Generation** framework.
+
+ The model in this repository is part of a two-stage post-training pipeline:
+ 1. **RuFT (Rubric-based Rejection Sampling Fine-Tuning)**: Using rubric scores as filters.
+ 2. **RuRL (Rubric-based Reinforcement Learning)**: Using rubric scores as dense rewards.
+
+ The post-trained Qwen3-14B model using this framework achieves state-of-the-art results on HealthBench, surpassing proprietary models like GPT-5.
+
+ ## Resources
+ - **Paper:** [arXiv:2601.08430](https://arxiv.org/abs/2601.08430)
+ - **Code:** [GitHub - teqkilla/RubricHub](https://github.com/teqkilla/RubricHub)
+ - **Dataset:** [RubricHub_v1 on Hugging Face](https://huggingface.co/datasets/sojuL/RubricHub_v1)
+
+ ## Citation
+ If you find RubricHub useful for your research, please cite:
+
+ ```bibtex
+ @article{li2026rubrichub,
+ title={RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation},
+ author={Li, Sunzhu and Zhao, Jiale and Wei, Miteto and Ren, Huimin and Zhou, Yang and {Jingwen Yang} and Liu, Shunyu and Zhang, Kaike and Chen, Wei},
+ journal={arXiv preprint arXiv:2601.08430},
+ year={2026}
+ }
+ ```
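
The two-stage description in the new README (rubric scores as filters for RuFT, as dense rewards for RuRL) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper or its code: the `Criterion` type, the weighted-fraction scoring rule, the keyword predicates standing in for a rubric grader, and the `0.75` threshold.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Criterion:
    """One rubric item (hypothetical structure)."""
    description: str
    weight: float
    check: Callable[[str], bool]  # toy stand-in for a rubric grader


def rubric_score(response: str, rubric: List[Criterion]) -> float:
    """Weighted fraction of rubric criteria the response satisfies, in [0, 1]."""
    total = sum(c.weight for c in rubric)
    if total == 0:
        return 0.0
    earned = sum(c.weight for c in rubric if c.check(response))
    return earned / total


def ruft_filter(responses: List[str], rubric: List[Criterion],
                threshold: float = 0.75) -> List[str]:
    """Rejection sampling: keep only responses scoring at or above the threshold."""
    return [r for r in responses if rubric_score(r, rubric) >= threshold]


def rurl_reward(response: str, rubric: List[Criterion]) -> float:
    """Dense reward: the graded rubric score itself rather than a pass/fail signal."""
    return rubric_score(response, rubric)


# Toy rubric and candidate responses, invented for this example.
rubric = [
    Criterion("mentions seeing a doctor", 2.0, lambda r: "doctor" in r.lower()),
    Criterion("mentions hydration", 1.0, lambda r: "water" in r.lower()),
    Criterion("avoids giving a diagnosis", 1.0, lambda r: "you have" not in r.lower()),
]
candidates = [
    "Drink water and see a doctor if symptoms persist.",
    "You have the flu.",
    "Rest and drink water.",
]

kept = ruft_filter(candidates, rubric)                  # fine-tuning data after filtering
rewards = [rurl_reward(c, rubric) for c in candidates]  # per-response RL rewards
print(kept, rewards)
```

The contrast is the point of the sketch: the filter discards partial answers outright, while the reward keeps their partial credit as a training signal.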