Automatic Evaluation Model for RAIDEN Benchmark
This repository contains the automated evaluation model trained as part of the research presented in the paper "RAIDEN Benchmark: Evaluating Role-playing Conversational Agents with Measurement-Driven Custom Dialogues".
The model compares the quality of two responses for a given dialogue turn and produces one of three evaluation outcomes: win, tie, or lose.
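As a rough illustration of how the three outcomes might be consumed downstream, the sketch below extracts a verdict from the evaluator's raw text output and summarizes a batch of verdicts as rates. The function names `parse_verdict` and `summarize` are hypothetical, and the assumption that the label appears as a standalone word in the output is ours; the actual prompt and output format are described in the paper and GitHub repo.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Optional

# The three outcomes named in this model card.
LABELS = ("win", "tie", "lose")


def parse_verdict(raw_output: str) -> Optional[str]:
    """Return the first win/tie/lose label found as a whole word, else None.

    Assumption: the evaluator emits the label as a standalone word.
    """
    match = re.search(r"\b(win|tie|lose)\b", raw_output.lower())
    return match.group(1) if match else None


def summarize(verdicts: Iterable[Optional[str]]) -> Dict[str, float]:
    """Compute the share of each label, ignoring unparseable (None) entries."""
    counts = Counter(v for v in verdicts if v in LABELS)
    total = sum(counts.values())
    return {label: (counts[label] / total if total else 0.0) for label in LABELS}


if __name__ == "__main__":
    # Hypothetical raw outputs, for illustration only.
    outputs = ["Win", "tie", "Result: lose", "win", "unclear"]
    print(summarize(parse_verdict(o) for o in outputs))
```

Because a pairwise judge can be sensitive to the order in which the two responses are presented, it may be worth evaluating each pair in both orders and comparing the verdicts before aggregating.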
For more detailed information, please refer to our paper and code:
Paper: https://aclanthology.org/2025.coling-main.735.pdf
GitHub repo: https://github.com/FrontierLabs/RAIDEN