AI-Response-Comparer-v1.6

AI-Response-Comparer-v1.6 is a fine-tuned version of microsoft/deberta-v3-large for preference classification and reward modeling tasks.

The model compares two AI-generated responses for the same prompt and predicts a probability distribution over three outcomes:

Response A preferred
Response B preferred
Tie

The output comes from a 3-class softmax head, so the three probabilities sum to 1.
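
As a rough illustration of what a 3-class softmax head produces, the sketch below turns three raw scores into probabilities. The logit values are made up, and the label order (A, B, Tie) is an assumption; the model's own configuration is the authority on which index maps to which outcome.

```python
import math

# Assumption: index order A, B, Tie. The real order is defined by the
# model's configuration and may differ.
LABELS = ("Response A preferred", "Response B preferred", "Tie")


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() numerically stable and does
    # not change the resulting probabilities.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Pair each outcome label with its probability."""
    return dict(zip(LABELS, softmax(logits)))


# Made-up logits, for illustration only.
example = predict([2.0, 0.5, -1.0])
best = max(example, key=example.get)
```

Because the three probabilities are tied together by the softmax, a high Tie probability necessarily lowers the probability assigned to either response.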

Model Details

Base Model

microsoft/deberta-v3-large

Fine-tuning Strategy

Full fine-tuning
Learning rate: 1e-5
Epochs: 1
Mixed-dataset training
Datasets shuffled during training

The model was trained on the combined conversational preference datasets and evaluated separately on each dataset's test split.
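
One way to picture mixed-dataset training with per-dataset evaluation is sketched below. It is not the author's training code: the function names, the `source` tag, and the seed are hypothetical, chosen only to show how examples can be shuffled together while remaining traceable to their dataset.

```python
import random


def mix_datasets(named_datasets, seed=0):
    """Concatenate several datasets and shuffle them into one list.

    named_datasets maps a dataset name to a list of examples. Each
    example is tagged with its source so results can still be broken
    down per dataset afterwards.
    """
    mixed = [
        {"source": name, "example": example}
        for name, examples in named_datasets.items()
        for example in examples
    ]
    # A seeded generator makes the shuffle reproducible.
    random.Random(seed).shuffle(mixed)
    return mixed


def group_by_source(mixed):
    """Regroup tagged examples by the dataset they came from."""
    groups = {}
    for item in mixed:
        groups.setdefault(item["source"], []).append(item["example"])
    return groups
```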

Preprocessing Strategy

To keep input lengths consistent and training compute manageable:

Conversations were limited to a maximum of 2 turns
Inputs were truncated to a maximum sequence length of 512 tokens

These preprocessing rules were applied consistently across both training and evaluation datasets.
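
The two rules can be sketched as follows. This is a simplified illustration, not the actual pipeline: whitespace splitting stands in for the DeBERTa tokenizer, and keeping the earliest turns and the leading tokens is an assumption, since the card does not say which end is cut.

```python
MAX_TURNS = 2     # from the model card
MAX_TOKENS = 512  # from the model card


def limit_turns(turns, max_turns=MAX_TURNS):
    """Keep at most max_turns turns (assumption: the earliest ones)."""
    return list(turns[:max_turns])


def truncate(tokens, max_tokens=MAX_TOKENS):
    """Keep at most max_tokens tokens (assumption: the leading ones)."""
    return list(tokens[:max_tokens])


def preprocess(turns, tokenize=str.split):
    """Apply the turn limit, tokenize, then apply the length limit.

    tokenize defaults to whitespace splitting, a stand-in for the real
    subword tokenizer.
    """
    tokens = []
    for turn in limit_turns(turns):
        tokens.extend(tokenize(turn))
    return truncate(tokens)
```

Applying the same function to training and evaluation inputs is what keeps the two consistent.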

Training Datasets

Included Datasets

Anthropic HH-RLHF
LMSYS Chatbot Arena
Kaggle LLM Classification Finetuning

Evaluation Methodology

Anthropic HH-RLHF

Officially provided train/test split used

LMSYS Chatbot Arena and Kaggle LLM Classification Finetuning

80/20 train/test split

All evaluations were performed independently per dataset after mixed-dataset training.
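
For the datasets without an official split, an 80/20 partition could look like the sketch below. The shuffle, the seed, and the absence of stratification are assumptions; the card states only the ratio.

```python
import random


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples and hold out test_fraction of them.

    Returns (train, test). The seed is a hypothetical value used only
    to make the split reproducible.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train, test = train_test_split(range(100))
```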

Performance

Dataset	Test Samples	Accuracy	Precision (Macro)	Recall (Macro)	F1 Score (Macro)
Anthropic HH-RLHF	4,923	67.21%	44.84%	44.81%	44.82%
Kaggle LLM Classification	8,480	50.27%	50.08%	50.02%	49.75%
LMSYS Chatbot Arena	5,691	56.96%	55.78%	55.91%	55.62%
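
The card does not say how the macro averages were computed. The sketch below shows one common convention, assumed here: average per-class precision, recall, and F1 over all three classes, scoring a class as 0 when it has no examples or no predictions. Under that convention a test set with no Tie labels has macro scores capped at about two thirds. Anthropic HH-RLHF pairs a chosen and a rejected response and has no Tie labels, so this would be consistent with its macro scores sitting near two thirds of its accuracy, although the source does not confirm that explanation. The labels in the example are made up.

```python
CLASSES = ("A", "B", "Tie")


def macro_metrics(y_true, y_pred, classes=CLASSES):
    """Accuracy plus macro-averaged precision, recall, and F1.

    A class with no true or predicted examples contributes 0 to the
    average (an assumed convention, not confirmed by the model card).
    """
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Made-up labels with no Tie examples.
metrics = macro_metrics(["A", "A", "B", "B"], ["A", "B", "B", "B"])
```

In this toy case accuracy is 0.75 while macro recall is 0.5, because the empty Tie class pulls the average down.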

Intended Use

This model is intended for:

Reward modeling
Preference modeling
AI response ranking
Human preference approximation
LLM evaluation pipelines
RLHF experimentation
AI-generated response comparison

Limitations

Primarily trained on English conversational data
Limited to short conversational windows (2 turns)
Not optimized for long-context reasoning
Preference labels may inherit annotator bias
Performance may vary significantly across domains and model families
Not calibrated for safety-critical or production moderation systems

License

Model Weights

The training data includes datasets with non-commercial licensing restrictions.

Therefore:

Model weights are licensed under:
- CC BY-NC 4.0

Commercial usage of the trained weights is not permitted without ensuring compliance with upstream dataset licenses.

Source Code

Training scripts and source code are licensed under:
- Apache-2.0

Attribution

Base Model

Microsoft DeBERTa-v3-large

Datasets

Anthropic HH-RLHF
LMSYS Chatbot Arena
Kaggle LLM Classification Finetuning

Citation

@misc{himanshu2026airesponsecomparerv16,
  title={AI-Response-Comparer-v1.6},
  author={Himanshu Bansal},
  year={2026},
  publisher={Hugging Face},
  howpublished={https://huggingface.co/Himanshu167/AI-Response-Comparer-v1.6}
}

Safetensors

Model size: 0.4B params
Tensor type: F32
