Automatic Evaluation Model for RAIDEN Benchmark

This repository contains the automated evaluation model trained as part of the research presented in the paper "RAIDEN Benchmark: Evaluating Role-playing Conversational Agents with Measurement-Driven Custom Dialogues".

The model compares the quality of two different responses in a given dialogue turn and produces one of three evaluation outcomes: win, tie, or lose.
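As an illustration of how such three-way verdicts might be consumed downstream, the sketch below tallies a list of outcome labels and derives two summary numbers. It does not call the model itself. The function name `summarize_outcomes`, the half credit given to ties in `score`, and the sample labels are all assumptions made for this example, not something specified by the paper or the repository.

```python
from collections import Counter

# The three outcome labels described above.
VALID_OUTCOMES = ("win", "tie", "lose")


def summarize_outcomes(outcomes):
    """Tally per-turn verdicts and compute simple aggregate rates.

    `outcomes` is an iterable of strings, one verdict per compared turn.
    Hypothetical helper: the aggregation scheme is an assumption.
    """
    counts = Counter(o.strip().lower() for o in outcomes)

    # Reject anything that is not one of the three expected labels.
    unknown = set(counts) - set(VALID_OUTCOMES)
    if unknown:
        raise ValueError(f"unexpected outcome labels: {sorted(unknown)}")

    total = sum(counts.values())
    if total == 0:
        raise ValueError("no outcomes to summarize")

    return {
        "win": counts["win"],
        "tie": counts["tie"],
        "lose": counts["lose"],
        # Fraction of turns judged a clear win.
        "win_rate": counts["win"] / total,
        # Assumed convention: a tie counts as half a win.
        "score": (counts["win"] + 0.5 * counts["tie"]) / total,
    }


# Made-up verdicts for four dialogue turns.
summary = summarize_outcomes(["win", "tie", "lose", "win"])
print(summary)
```

Whether ties should earn half credit, be ignored, or be reported separately depends on the evaluation protocol, so treat `score` as one possible convention.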

For more detailed information, please refer to our paper and code:

Paper: https://aclanthology.org/2025.coling-main.735.pdf
GitHub repo: https://github.com/FrontierLabs/RAIDEN
