Update DATASET_CARD.md
DATASET_CARD.md CHANGED +5 -5
@@ -41,7 +41,7 @@ dataset_info:
 **High-Quality Reasoning Trace Preferences for Training Outcome Reward Models**
 
 [](link-to-arxiv)
-[](https://huggingface.co/
+[](https://huggingface.co/LossFunctionLover/pairwise-orm-model)
 
 </div>
 
@@ -381,13 +381,13 @@ Models trained on this dataset achieve:
 - **Anti-symmetry**: -0.998 correlation on label-swap test
 - **Length Robustness**: 95.5%-99.7% across token ranges
 
-See [Model Card](https://huggingface.co/
+See [Model Card](https://huggingface.co/LossFunctionLover/pairwise-orm-model) for full evaluation details.
 
 ## Related Resources
 
 - **Paper**: [ArXiv](link-to-arxiv) - "An Empirical Study of Robust Preference Learning under Minimal Supervision"
-- 🤗 **Trained Model**: [HuggingFace](https://huggingface.co/
-- 💻 **Training Code**: [GitHub](
+- 🤗 **Trained Model**: [HuggingFace](https://huggingface.co/LossFunctionLover/pairwise-orm-model)
+- 💻 **Training Code**: [GitHub](https://github.com/Coder-12)
 - **Source Pointwise Dataset**: Available upon request
 
 ## 📧 Contact & Citation
@@ -403,7 +403,7 @@ If you use this dataset, please cite:
 title = {ORM Pairwise Preference Pairs: A Curated Dataset for Training Outcome Reward Models},
 year = {2025},
 publisher = {HuggingFace},
-howpublished = {\url{https://huggingface.co/datasets/
+howpublished = {\url{https://huggingface.co/datasets/LossFunctionLover/orm-pairwise-preference-pairs}}
 }
 
 @article{mishra2025orm,