
nathanael-fijalkow

Update README.md

b3ea7f0 verified 1 day ago


	---
	title: Codenames LLM Challenge
	emoji: 🕵️
	colorFrom: red
	colorTo: blue
	sdk: gradio
	sdk_version: 6.6.0
	python_version: '3.11'
	app_file: app.py
	pinned: true
	---

	# Codenames LLM Challenge

	A Python framework for students to implement guesser bots for Codenames. An LLM plays the spymaster and picks its clues using word embeddings.

	## Game Rules

	Challenge Mode (Single Team):
	- Goal: Guess all RED words in minimum rounds
	- Board: 25 words total (9 RED, 8 BLUE, 8 ASSASSIN)
	- Each round: LLM spymaster gives a clue + number
	- Guesser makes up to (number + 1) guesses
	- Round ends if: BLUE word revealed, max guesses reached, or guesser stops
	- Game ends: WIN if all RED found, LOSE if ASSASSIN revealed
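
The round and game-ending rules above can be sketched as a small loop. This is only an illustration of the rules as listed, not the framework's actual implementation: `play_round`, the `roles` mapping, and the `revealed` set are hypothetical names, and the handling of an invalid guess is an assumption.

```python
from typing import Callable, Optional

# Same shape as the guesser described in this README: (clue, board_state) -> word or None.
Guesser = Callable[[str, list[str]], Optional[str]]


def play_round(clue: str, number: int, roles: dict[str, str],
               revealed: set[str], guesser: Guesser) -> str:
    """Play one round and return 'WIN', 'LOSE', or 'CONTINUE'.

    `roles` maps each board word to 'RED', 'BLUE', or 'ASSASSIN' and
    `revealed` is updated in place (both are hypothetical structures).
    """
    for _ in range(number + 1):  # the guesser gets up to (number + 1) guesses
        unrevealed = [w for w in roles if w not in revealed]
        guess = guesser(clue, unrevealed)
        if guess is None:  # the guesser chose to stop
            break
        if guess not in unrevealed:
            # Assumption: an invalid guess simply ends the round in this sketch.
            break
        revealed.add(guess)
        role = roles[guess]
        if role == "ASSASSIN":
            return "LOSE"
        if role == "BLUE":
            break  # a BLUE word ends the round
        if all(w in revealed for w, r in roles.items() if r == "RED"):
            return "WIN"
    return "CONTINUE"


# Tiny 4-word board for illustration only (a real board has 25 words).
roles = {"apple": "RED", "river": "BLUE", "pear": "RED", "knife": "ASSASSIN"}
revealed: set[str] = set()
outcome = play_round("fruit", 1, roles, revealed, lambda clue, board: board[0])
# outcome == "CONTINUE": "apple" is RED, then "river" is BLUE and ends the round.
```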

	## Setup

	```bash
	uv venv
	source .venv/bin/activate
	uv pip install -r requirements.txt
	```

	Dictionary: Fixed list of 420 Codenames words. Clues and board words must be from this dictionary (case-insensitive).

	Pre-build Embedding Cache (Recommended):

	```bash
	python -m codenames.cli init-cache
	```

	This downloads the embedding model and computes vectors for all 420 words (about 30 seconds). The vectors are cached for reuse.

	## Test Your Guesser

	Create a Python file with a `guesser` function:

	```python
	# my_guesser.py
	def guesser(clue: str, board_state: list[str]) -> str | None:
	    """
	    Args:
	        clue: The spymaster's one-word clue (from dictionary)
	        board_state: List of unrevealed words on the board

	    Returns:
	        A word to guess from board_state, or None to stop the round
	    """
	    # Your embedding-based or heuristic logic here
	    return board_state[0]  # Simple example: always guess first word
	```
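
As a slightly richer starting point, here is a minimal sketch of a similarity-based guesser with the same `guesser(clue, board_state)` signature. To stay self-contained it uses character trigrams as a toy stand-in for real embeddings, so it only captures spelling overlap, not meaning. `trigram_embed`, `cosine`, `make_guesser`, and the `threshold` value are all hypothetical; a real embedding model would replace both `trigram_embed` and `cosine` with matching vector operations.

```python
import math
from collections import Counter
from typing import Callable, Optional

Vector = dict[str, float]


def trigram_embed(word: str) -> Vector:
    """Toy stand-in for an embedding: counts of character trigrams."""
    padded = f"  {word.lower()} "
    return dict(Counter(padded[i:i + 3] for i in range(len(padded) - 2)))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def make_guesser(embed: Callable[[str], Vector] = trigram_embed,
                 threshold: float = 0.1):
    """Build a guesser that picks the board word most similar to the clue."""
    def guesser(clue: str, board_state: list[str]) -> Optional[str]:
        if not board_state:
            return None
        clue_vec = embed(clue)
        best = max(board_state, key=lambda w: cosine(clue_vec, embed(w)))
        # Stop the round when nothing looks related enough (threshold is arbitrary).
        if cosine(clue_vec, embed(best)) < threshold:
            return None
        return best
    return guesser


guesser = make_guesser()
```

Returning `None` below the threshold is one way to avoid a risky extra guess; the right cutoff depends on the similarity measure and would need tuning.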

	Run it against the LLM spymaster:

	```bash
	python -m codenames.cli challenge my_guesser.py --seed 42 --output log.json
	```

	Options:
	- `--seed`: Random seed for reproducible boards
	- `--model`: Embedding model (default: `sentence-transformers/all-MiniLM-L6-v2`)
	- `--max-rounds`: Maximum rounds before timeout (default: 10)
	- `--output`: Save JSON log with board state, clues, guesses, and result
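
To evaluate a guesser on several boards, the options above can be combined in a small script. The sketch below only builds the command lines, using the `challenge` subcommand and the `--seed`, `--output`, and `--max-rounds` flags listed in this README; the helper name `challenge_command` and the `log_<seed>.json` naming scheme are assumptions.

```python
import sys
from typing import Optional


def challenge_command(guesser_path: str, seed: int, output: str,
                      max_rounds: Optional[int] = None) -> list[str]:
    """Build the argument list for one challenge run with a given seed."""
    cmd = [sys.executable, "-m", "codenames.cli", "challenge", guesser_path,
           "--seed", str(seed), "--output", output]
    if max_rounds is not None:
        cmd += ["--max-rounds", str(max_rounds)]
    return cmd


# One command per seed, each writing its own log file.
commands = [challenge_command("my_guesser.py", seed, f"log_{seed}.json")
            for seed in range(5)]

# To actually run them (requires the codenames package to be installed):
#   import subprocess
#   for cmd in commands:
#       subprocess.run(cmd, check=True)
```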

	## Log Format

	The JSON output contains:
	- `seed`: Random seed used
	- `board_words`: All 25 words on the board
	- `board_roles`: Role for each word (RED/BLUE/ASSASSIN)
	- `rounds`: Array of rounds with clue, number, and guesses
	- `final_state`: Win/loss status and rounds taken

	Use this data to analyze performance or train ML models.
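
As a starting point for such analysis, the following sketch summarizes one log. The top-level keys come from the list above, but the inner layout is an assumption: each entry of `rounds` is taken to be an object with `clue`, `number`, and a `guesses` list, which may not match the real key names. `summarize_log` and `load_log` are hypothetical helpers.

```python
import json
from pathlib import Path


def load_log(path: str) -> dict:
    """Read a JSON log written with --output."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_log(log: dict) -> dict:
    """Collect a few per-game statistics from a parsed log."""
    rounds = log.get("rounds", [])
    return {
        "seed": log.get("seed"),
        "rounds_played": len(rounds),
        # Assumes each round stores its guesses as a list under "guesses".
        "total_guesses": sum(len(r.get("guesses", [])) for r in rounds),
        "clues": [(r.get("clue"), r.get("number")) for r in rounds],
        "final_state": log.get("final_state"),
    }
```

Calling `summarize_log(load_log("log.json"))` on the file produced by the challenge command would then give a compact per-game record that is easy to aggregate across seeds.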