---
title: RoleCall Studios
emoji: 🎭
colorFrom: purple
colorTo: red
sdk: static
pinned: true
short_description: Interactive fiction, discovery & the PlotPoints RP benchmark
---

# RoleCall Studios

Home of **PlotPoints**, an open, reproducible benchmark for how well models
*roleplay*: human blind votes plus an LLM-judge rubric plus adversarial probes,
with the test harness, prompts, rubric and raw votes all open-source (CC-BY 4.0).
Nothing private. Alongside it: **RoleCall**, a SillyTavern-grade roleplay studio
in your browser, and **PlotLight**, our discovery floor.

- **The Studio (RoleCall):** https://rolecallstudios.com
- **Discovery (PlotLight):** https://plotlightstudios.com
- **PlotPoints results:** https://plotlightstudios.com/plotpoints
- **Vote (multi-turn arena):** https://plotlightstudios.com/plotpoints/multiturn
- **Methodology:** https://plotlightstudios.com/plotpoints/methodology
- **Dataset (HF, CC-BY 4.0):** https://huggingface.co/datasets/lazyweasel/roleplay-bench
- **Source (GitHub):** https://github.com/LeviTheWeasel/rp-benchmark
- **Discord:** https://discord.gg/aFvkTCDRtf
- **Reddit:** https://www.reddit.com/r/RoleCallStudios/

This Space is a static landing page. Use the **Community** tab to open a
discussion or request a model for the benchmark.