# Automatic Evaluation Model for RAIDEN Benchmark

This repository contains the automated evaluation model trained as part of the research presented in the paper "RAIDEN Benchmark: Evaluating Role-playing Conversational Agents with Measurement-Driven Custom Dialogues".

The model compares the quality of two candidate responses in a given dialogue turn and produces one of three evaluation outcomes: `win`, `tie`, or `lose`.

For more detailed information, please refer to our paper and code:

- Paper: https://aclanthology.org/2025.coling-main.735.pdf
- GitHub repo: https://github.com/FrontierLabs/RAIDEN
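The pairwise comparison flow can be sketched roughly as follows. Note that the prompt template and the label-parsing logic below are illustrative assumptions, not the official inference code; the actual prompt format and model usage are documented in the GitHub repo above.

```python
# Illustrative sketch of pairwise evaluation with this model.
# ASSUMPTION: the prompt template and label vocabulary below are
# hypothetical -- consult the official repo for the real inference script.

def build_pairwise_prompt(dialogue: str, response_a: str, response_b: str) -> str:
    """Format a dialogue turn and two candidate responses for comparison."""
    return (
        "Dialogue:\n" + dialogue + "\n\n"
        "Response A:\n" + response_a + "\n\n"
        "Response B:\n" + response_b + "\n\n"
        "Is Response A better than Response B? Answer: win, tie, or lose."
    )

def parse_outcome(generated_text: str) -> str:
    """Map the model's free-form output to one of the three outcomes."""
    text = generated_text.strip().lower()
    for label in ("win", "tie", "lose"):
        if label in text:
            return label
    return "tie"  # fall back to a tie when no label is recognized

# Loading and generation would look roughly like this (model ID elided,
# not run here):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained("<this-model-id>")
# model = AutoModelForCausalLM.from_pretrained("<this-model-id>")

print(parse_outcome("Win."))  # -> win
```

Parsing the generated text rather than constraining decoding keeps the sketch simple; a production evaluator would more likely score the three label tokens directly.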