AI-Response-Comparer-v1.6

AI-Response-Comparer-v1.6 is a fine-tuned version of microsoft/deberta-v3-large for preference classification and reward modeling tasks.

The model compares two AI-generated responses for the same prompt and predicts a probability distribution over three outcomes:

  • Response A preferred
  • Response B preferred
  • Tie

The output comes from a 3-class softmax head, so the three probabilities always sum to 1.
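
The softmax step described above can be sketched in plain Python. This is only an illustration: the label order in `LABELS` and the logit values are assumptions, not taken from the released model configuration.

```python
import math

# Assumed label order; check the model's config for the real mapping.
LABELS = ("Response A preferred", "Response B preferred", "Tie")


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe(logits):
    """Pair each assumed label with its probability."""
    return dict(zip(LABELS, softmax(logits)))


# Made-up logits, for illustration only.
example_logits = [1.2, 0.3, -0.5]
result = describe(example_logits)
predicted = max(result, key=result.get)
```

The predicted outcome is simply the label with the highest probability; the full distribution can be kept when a soft preference score is more useful than a hard decision.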


Model Details

Base Model

  • microsoft/deberta-v3-large

Fine-tuning Strategy

  • Full fine-tuning
  • Learning rate: 1e-5
  • Epochs: 1
  • Mixed-dataset training
  • Datasets shuffled during training

The model was trained on combined conversational preference datasets and evaluated separately on each dataset split.


Preprocessing Strategy

To keep input lengths consistent and training compute manageable:

  • Conversations were limited to a maximum of 2 turns
  • Inputs were truncated to a maximum sequence length of 512 tokens

These preprocessing rules were applied consistently across both training and evaluation datasets.
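
A minimal sketch of the two rules above, under stated assumptions: the model card does not say which turns are kept when a conversation is longer than 2 turns (this sketch keeps the earliest ones), and `toy_tokenize` is a hypothetical whitespace stand-in for the real DeBERTa tokenizer.

```python
MAX_TURNS = 2     # from the model card
MAX_TOKENS = 512  # from the model card


def limit_turns(turns, max_turns=MAX_TURNS):
    """Keep at most max_turns turns (assumption: the earliest ones)."""
    return turns[:max_turns]


def toy_tokenize(text):
    """Hypothetical stand-in tokenizer: one token per whitespace-separated word."""
    return text.split()


def truncate(tokens, max_tokens=MAX_TOKENS):
    """Drop everything past the maximum sequence length."""
    return tokens[:max_tokens]


def preprocess(turns):
    """Apply the turn limit, tokenize, then truncate."""
    kept = limit_turns(turns)
    tokens = toy_tokenize(" ".join(kept))
    return truncate(tokens)


# Three turns of 300 words each: two turns survive, then 600 tokens are cut to 512.
conversation = [" ".join(["word"] * 300)] * 3
processed = preprocess(conversation)
```

With a real subword tokenizer the token counts would differ, but the order of operations (limit turns, tokenize, truncate) would stay the same under these assumptions.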


Training Datasets

Included Datasets

  • Anthropic HH-RLHF
  • LMSYS Chatbot Arena
  • Kaggle LLM Classification Finetuning

Evaluation Methodology

Anthropic HH-RLHF

  • Official provided train/test split used

LMSYS + Kaggle

  • 80/20 train-test split

All evaluations were performed independently per dataset after mixed-dataset training.
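
An 80/20 split like the one used for LMSYS and Kaggle can be sketched as follows. The shuffling, the seed, and the function name `train_test_split` are assumptions for illustration; the model card does not say how the split was drawn.

```python
import random


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and hold out test_fraction of them."""
    items = list(examples)
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    n_test = int(round(len(items) * test_fraction))
    return items[n_test:], items[:n_test]


# Toy data: 100 example ids split into 80 train and 20 test.
train, test = train_test_split(range(100))
```

Fixing the seed keeps the held-out set stable between runs, which matters when per-dataset test scores are compared across model versions.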


Performance

Dataset                      Test Samples   Accuracy   Precision (Macro)   Recall (Macro)   F1 Score (Macro)
Anthropic HH-RLHF            4,923          67.21%     44.84%              44.81%           44.82%
Kaggle LLM Classification    8,480          50.27%     50.08%              50.02%           49.75%
LMSYS Chatbot Arena          5,691          56.96%     55.78%              55.91%           55.62%
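
The macro-averaged columns take the per-class score and average it over the classes. Below is a minimal sketch of that calculation, assuming all three classes enter the average and that a class with no samples scores 0; the model card does not state either convention, and the labels and predictions are made up.

```python
def macro_metrics(y_true, y_pred, labels):
    """Compute accuracy plus macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Assumed convention: undefined ratios count as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision_macro": sum(precisions) / n,
        "recall_macro": sum(recalls) / n,
        "f1_macro": sum(f1s) / n,
    }


# Toy data with no "tie" examples at all.
y_true = ["A", "A", "B", "B", "A", "B"]
y_pred = ["A", "B", "B", "B", "A", "A"]
scores = macro_metrics(y_true, y_pred, labels=["A", "B", "tie"])
```

In this toy case accuracy is 2/3 while each macro score is 4/9, because the empty "tie" class contributes a 0 to every average. A dataset without tie labels would show the same pattern under these assumptions, which may be why the HH-RLHF macro scores sit below its accuracy, though the source does not confirm this.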

Intended Use

This model is intended for:

  • Reward modeling
  • Preference modeling
  • AI response ranking
  • Human preference approximation
  • LLM evaluation pipelines
  • RLHF experimentation
  • AI-generated response comparison

Limitations

  • Primarily trained on English conversational data
  • Limited to short conversational windows (2 turns)
  • Not optimized for long-context reasoning
  • Preference labels may inherit annotator bias
  • Performance may vary significantly across domains and model families
  • Not calibrated for safety-critical or production moderation systems

License

Model Weights

The model was trained on datasets that carry non-commercial licensing restrictions.

Therefore:

  • Model weights are licensed under:
    • CC BY-NC 4.0

Commercial usage of the trained weights is not permitted without ensuring compliance with upstream dataset licenses.

Source Code

  • Training scripts and source code are licensed under:
    • Apache-2.0

Attribution

Base Model

  • Microsoft DeBERTa-v3-large

Datasets

  • Anthropic HH-RLHF
  • LMSYS Chatbot Arena
  • Kaggle LLM Classification Finetuning

Citation

@misc{himanshu2026airesponsecomparerv16,
  title={AI-Response-Comparer-v1.6},
  author={Himanshu Bansal},
  year={2026},
  publisher={Hugging Face},
  howpublished={https://huggingface.co/Himanshu167/AI-Response-Comparer-v1.6}
}