Update README.md
README.md
CHANGED
|
@@ -20,22 +20,6 @@ short_description: Reshaping businesses for the agentic era.
|
|
| 20 |
</div>
|
| 21 |
</header>
|
| 22 |
|
| 23 |
-
<p><strong>Foaster.ai</strong> is a French start-up focused on reshaping businesses for the agentic era. At <em>Foaster Labs</em>,
|
| 24 |
-
|
| 25 |
-
<div style="display:flex;align-items:center;gap:10px;margin:14px 0 22px;">
|
| 26 |
-
<a href="https://huggingface.co/spaces/Foaster/Werewolf_benchmark"
|
| 27 |
-
style="padding:10px 14px;border:1px solid #e5e7eb;border-radius:10px;text-decoration:none;">
|
| 28 |
-
Full leaderboard
|
| 29 |
-
</a>
|
| 30 |
-
</div>
|
| 31 |
-
|
| 32 |
-
<h3 style="margin:0 0 8px;">Results: Podium (role-conditioned Elo)</h3>
|
| 33 |
-
<p style="margin:0 0 10px;color:#64748b">ELO-W = wolf (manipulation power) · ELO-V = villager (manipulation resistance)</p>
|
| 34 |
-
|
| 35 |
-
<ul style="margin:0 0 8px 18px;">
|
| 36 |
-
<li>🥇 <strong>GPT-5 (OpenAI)</strong>: ELO 1492 (W 1508 · V 1476), win rate 96.7%, 60 matches</li>
|
| 37 |
-
<li>🥈 <strong>Gemini 2.5 Pro (Google)</strong>: ELO 1261 (W 1163 · V 1360), win rate 63.3%, 60 matches</li>
|
| 38 |
-
<li>🥉 <strong>Gemini 2.5 Flash (Google)</strong>: ELO 1188 (W 1103 · V 1273), win rate 51.7%, 60 matches</li>
|
| 39 |
-
</ul>
|
| 40 |
|
| 41 |
</div>
|
|
|
|
| 20 |
</div>
|
| 21 |
</header>
|
| 22 |
|
| 23 |
+
<p><strong>Foaster.ai</strong> is a French start-up focused on reshaping businesses for the agentic era. At <em>Foaster Labs</em>, we study LLMs and their personalities through original benchmarks and exploration frameworks.</p>
|
| 24 |
|
| 25 |
</div>
|