LossFunctionLover/pairwise-orm-model

Text Classification

preference-learning

agentic-reasoning

outcome-reward-model

pairwise-preference


LossFunctionLover committed on Dec 19, 2025

Commit 108059a · verified · 1 Parent(s): 68400a5

Update DATASET_CARD.md

Files changed (1)

DATASET_CARD.md +5 -5


@@ -41,7 +41,7 @@ dataset_info:
 **High-Quality Reasoning Trace Preferences for Training Outcome Reward Models**
 [![Paper](https://img.shields.io/badge/Paper-ArXiv-red)](link-to-arxiv)
-[![Model](https://img.shields.io/badge/Model-HuggingFace-yellow)](https://huggingface.co/akleshmishra/pairwise-orm-model)
+[![Model](https://img.shields.io/badge/Model-HuggingFace-yellow)](https://huggingface.co/LossFunctionLover/pairwise-orm-model)
 </div>
@@ -381,13 +381,13 @@ Models trained on this dataset achieve:
 - **Anti-symmetry**: -0.998 correlation on label-swap test
 - **Length Robustness**: 95.5%-99.7% across token ranges
-See [Model Card](https://huggingface.co/akleshmishra/pairwise-orm-model) for full evaluation details.
+See [Model Card](https://huggingface.co/LossFunctionLover/pairwise-orm-model) for full evaluation details.
 ## 🔗 Related Resources
 - 📄 **Paper**: [ArXiv](link-to-arxiv) - "An Empirical Study of Robust Preference Learning under Minimal Supervision"
-- 🤖 **Trained Model**: [HuggingFace](https://huggingface.co/akleshmishra/pairwise-orm-model)
-- 💻 **Training Code**: [GitHub](your-github-repo-url)
+- 🤖 **Trained Model**: [HuggingFace](https://huggingface.co/LossFunctionLover/pairwise-orm-model)
+- 💻 **Training Code**: [GitHub](https://github.com/Coder-12)
 - 📊 **Source Pointwise Dataset**: Available upon request
 ## 📧 Contact & Citation
@@ -403,7 +403,7 @@ If you use this dataset, please cite:
   title = {ORM Pairwise Preference Pairs: A Curated Dataset for Training Outcome Reward Models},
   year = {2025},
   publisher = {HuggingFace},
-  howpublished = {\url{https://huggingface.co/datasets/akleshmishra/orm-pairwise-preference-pairs}}
+  howpublished = {\url{https://huggingface.co/datasets/LossFunctionLover/orm-pairwise-preference-pairs}}
 }
 @article{mishra2025orm,