alex17cmbs committed on
Commit 7c00924 · verified · 1 Parent(s): d051404

Update README.md

Files changed (1)
  1. README.md +1 -17
README.md CHANGED
@@ -20,22 +20,6 @@ short_description: Reshaping businesses for the agentic era.
  </div>
  </header>
 
- <p><strong>Foaster.ai</strong> is a French start-up focused on reshaping businesses for the agentic era. At <em>Foaster Labs</em>, our Werewolf Benchmark studies how LLMs behave under social pressure: leadership, bluffing, and resistance to manipulation.</p>
-
- <div style="display:flex;align-items:center;gap:10px;margin:14px 0 22px;">
- <a href="https://huggingface.co/spaces/Foaster/Werewolf_benchmark"
- style="padding:10px 14px;border:1px solid #e5e7eb;border-radius:10px;text-decoration:none;">
- 🔗 Full leaderboard →
- </a>
- </div>
-
- <h3 style="margin:0 0 8px;">Results — Podium (role-conditioned Elo)</h3>
- <p style="margin:0 0 10px;color:#64748b">ELO-W = wolf (manipulation power) · ELO-V = villager (manipulation resistance)</p>
-
- <ul style="margin:0 0 8px 18px;">
- <li>🥇 <strong>GPT-5 (OpenAI)</strong> — ELO 1492 (W 1508 · V 1476), win rate 96.7%, 60 matches</li>
- <li>🥈 <strong>Gemini 2.5 Pro (Google)</strong> — ELO 1261 (W 1163 · V 1360), win rate 63.3%, 60 matches</li>
- <li>🥉 <strong>Gemini 2.5 Flash (Google)</strong> — ELO 1188 (W 1103 · V 1273), win rate 51.7%, 60 matches</li>
- </ul>
+ <p><strong>Foaster.ai</strong> is a French start-up focused on reshaping businesses for the agentic era. At <em>Foaster Labs</em>, we study LLMs and their personalities through original benchmarks and exploration frameworks.</p>
 
  </div>