---
title: RoleCall Studios
emoji: 🎭
colorFrom: purple
colorTo: red
sdk: static
pinned: true
short_description: Interactive fiction, discovery & the PlotPoints RP benchmark
---
# RoleCall Studios
Home of **PlotPoints**, an open, reproducible benchmark for how well models
*roleplay*. It combines blind human votes, an LLM-judge rubric, and adversarial
probes. The test harness, prompts, rubric, and raw votes are all open-source
(CC-BY 4.0). Nothing private. Alongside it: **RoleCall**, a SillyTavern-grade
roleplay studio in your browser, and **PlotLight**, our discovery floor.
- **The Studio (RoleCall):** https://rolecallstudios.com
- **Discovery (PlotLight):** https://plotlightstudios.com
- **PlotPoints results:** https://plotlightstudios.com/plotpoints
- **Vote (multi-turn arena):** https://plotlightstudios.com/plotpoints/multiturn
- **Methodology:** https://plotlightstudios.com/plotpoints/methodology
- **Dataset (HF, CC-BY 4.0):** https://huggingface.co/datasets/lazyweasel/roleplay-bench
- **Source (GitHub):** https://github.com/LeviTheWeasel/rp-benchmark
- **Discord:** https://discord.gg/aFvkTCDRtf
- **Reddit:** https://www.reddit.com/r/RoleCallStudios/
This Space is a static landing page. Use the **Community** tab to open a
discussion or request a model for the benchmark.