metadata

title: RoleCall Studios
emoji: 🎭
colorFrom: purple
colorTo: red
sdk: static
pinned: true
short_description: Interactive fiction, discovery & the PlotPoints RP benchmark

RoleCall Studios

Home of PlotPoints, an open, reproducible benchmark for how well models roleplay. It combines blind human votes, an LLM-judge rubric, and adversarial probes. The test harness, prompts, rubric, and raw votes are all open-source (CC-BY 4.0); nothing is private. Alongside it are RoleCall, a SillyTavern-grade roleplay studio that runs in your browser, and PlotLight, our discovery floor.

The Studio (RoleCall): https://rolecallstudios.com
Discovery (PlotLight): https://plotlightstudios.com
PlotPoints results: https://plotlightstudios.com/plotpoints
Vote (multi-turn arena): https://plotlightstudios.com/plotpoints/multiturn
Methodology: https://plotlightstudios.com/plotpoints/methodology
Dataset (HF, CC-BY 4.0): https://huggingface.co/datasets/lazyweasel/roleplay-bench
Source (GitHub): https://github.com/LeviTheWeasel/rp-benchmark
Discord: https://discord.gg/aFvkTCDRtf
Reddit: https://www.reddit.com/r/RoleCallStudios/

This Space is a static landing page. Use the Community tab to open a discussion or request a model for the benchmark.