---
title: RoleCall Studios
emoji: 🎭
colorFrom: purple
colorTo: red
sdk: static
pinned: true
short_description: Interactive fiction, discovery & the PlotPoints RP benchmark
---
RoleCall Studios
Home of PlotPoints, an open, reproducible benchmark for how well models roleplay. It combines blind human votes, an LLM-judge rubric, and adversarial probes. The test harness, prompts, rubric, and raw votes are all open-source (CC-BY 4.0); nothing is private. Alongside it: RoleCall, a SillyTavern-grade roleplay studio in your browser, and PlotLight, our discovery floor.
- The Studio (RoleCall): https://rolecallstudios.com
- Discovery (PlotLight): https://plotlightstudios.com
- PlotPoints results: https://plotlightstudios.com/plotpoints
- Vote (multi-turn arena): https://plotlightstudios.com/plotpoints/multiturn
- Methodology: https://plotlightstudios.com/plotpoints/methodology
- Dataset (HF, CC-BY 4.0): https://huggingface.co/datasets/lazyweasel/roleplay-bench
- Source (GitHub): https://github.com/LeviTheWeasel/rp-benchmark
- Discord: https://discord.gg/aFvkTCDRtf
- Reddit: https://www.reddit.com/r/RoleCallStudios/
This Space is a static landing page. Use the Community tab to open a discussion or request a model for the benchmark.