	---
	title: RoleCall Studios
	emoji: 🎭
	colorFrom: purple
	colorTo: red
	sdk: static
	pinned: true
	short_description: Interactive fiction, discovery & the PlotPoints RP benchmark
	---

	# RoleCall Studios

	Home of PlotPoints, an open, reproducible benchmark for how well models
	roleplay. It combines blind human votes, an LLM-judge rubric, and adversarial
	probes. The test harness, prompts, rubric, and raw votes are all open-source
	(CC-BY 4.0); nothing is kept private. Alongside it: RoleCall, a
	SillyTavern-grade roleplay studio that runs in your browser, and PlotLight,
	our discovery floor.

	- The Studio (RoleCall): https://rolecallstudios.com
	- Discovery (PlotLight): https://plotlightstudios.com
	- PlotPoints results: https://plotlightstudios.com/plotpoints
	- Vote (multi-turn arena): https://plotlightstudios.com/plotpoints/multiturn
	- Methodology: https://plotlightstudios.com/plotpoints/methodology
	- Dataset (HF, CC-BY 4.0): https://huggingface.co/datasets/lazyweasel/roleplay-bench
	- Source (GitHub): https://github.com/LeviTheWeasel/rp-benchmark
	- Discord: https://discord.gg/aFvkTCDRtf
	- Reddit: https://www.reddit.com/r/RoleCallStudios/

	This Space is a static landing page. Use the Community tab to open a
	discussion or request a model for the benchmark.
