---
title: Codenames LLM Challenge
emoji: 🕵️
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 6.6.0
python_version: '3.11'
app_file: app.py
pinned: true
---

# Codenames LLM Challenge

A Python framework in which students implement guesser bots for Codenames. The LLM acts as the spymaster, using embeddings.

## Game Rules

**Challenge Mode (Single Team):**

- Goal: Guess all RED words in as few rounds as possible
- Board: 25 words total (9 RED, 8 BLUE, 8 ASSASSIN)
- Each round: the LLM spymaster gives a clue and a number
- The guesser makes up to (number + 1) guesses
- Round ends if: a BLUE word is revealed, the maximum number of guesses is reached, or the guesser stops
- Game ends: WIN if all RED words are found, LOSE if an ASSASSIN word is revealed
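
The round rules above can be sketched as a small loop. This is an illustrative sketch only, not the framework's actual engine: `play_round`, the `roles` mapping, and the `revealed` set are hypothetical names, and ending the round on an invalid guess is an assumption.

```python
from typing import Callable, Optional

RED, BLUE, ASSASSIN = "RED", "BLUE", "ASSASSIN"


def play_round(
    clue: str,
    number: int,
    roles: dict[str, str],
    revealed: set[str],
    guesser: Callable[[str, list[str]], Optional[str]],
) -> str:
    """Play one round and return "WIN", "LOSE", or "CONTINUE".

    `roles` maps each board word to its role; `revealed` is updated in place.
    """
    for _ in range(number + 1):  # up to (number + 1) guesses per round
        unrevealed = [w for w in roles if w not in revealed]
        guess = guesser(clue, unrevealed)
        if guess is None:  # the guesser chose to stop the round
            break
        if guess not in unrevealed:
            # Assumption: an invalid guess simply ends the round.
            break
        revealed.add(guess)
        role = roles[guess]
        if role == ASSASSIN:
            return "LOSE"
        if role == BLUE:  # a BLUE word ends the round, not the game
            break
        if all(w in revealed for w, r in roles.items() if r == RED):
            return "WIN"
    return "CONTINUE"
```

A caller would invoke `play_round` once per clue until it returns `"WIN"` or `"LOSE"`, or until the round limit is reached.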

## Setup

```bash
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
```

**Dictionary:** A fixed list of 420 Codenames words. Clues and board words must come from this dictionary (matching is case-insensitive).

**Pre-build Embedding Cache (Recommended):**

```bash
python -m codenames.cli init-cache
```

This downloads the embedding model and computes vectors for all 420 words (~30 seconds). The vectors are cached for reuse.

## Test Your Guesser

Create a Python file with a `guesser` function:

```python
# my_guesser.py
def guesser(clue: str, board_state: list[str]) -> str | None:
    """
    Args:
        clue: The spymaster's one-word clue (from dictionary)
        board_state: List of unrevealed words on the board

    Returns:
        A word to guess from board_state, or None to stop the round
    """
    # Your embedding-based or heuristic logic here
    return board_state[0]  # Simple example: always guess first word
```
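
As one possible starting point beyond the trivial example, the sketch below picks the unrevealed word whose vector is closest to the clue by cosine similarity and stops the round when the best match is weak. It is a sketch under stated assumptions: `make_guesser`, the `embed` callable, the `threshold` value, and the toy 2-D vectors are all hypothetical. A real guesser would plug in vectors from an actual embedding model, and whether a module-level `guesser` built this way is accepted by the CLI is assumed rather than confirmed.

```python
import math
from typing import Callable, Optional, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def make_guesser(
    embed: Callable[[str], Vector], threshold: float = 0.3
) -> Callable[[str, list[str]], Optional[str]]:
    """Build a guesser that picks the board word closest to the clue."""

    def guesser(clue: str, board_state: list[str]) -> Optional[str]:
        if not board_state:
            return None
        clue_vec = embed(clue.lower())
        best = max(board_state, key=lambda w: cosine(clue_vec, embed(w.lower())))
        # Stop the round rather than gamble on a weak match.
        if cosine(clue_vec, embed(best.lower())) < threshold:
            return None
        return best

    return guesser


# Toy 2-D vectors standing in for real model embeddings (illustrative only).
TOY = {"fruit": (1.0, 0.1), "apple": (0.9, 0.2), "river": (0.1, 1.0)}
guesser = make_guesser(lambda w: TOY[w])
```

The threshold trades risk for speed: a higher value stops rounds earlier and avoids BLUE or ASSASSIN words, at the cost of needing more rounds.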

Run against the LLM spymaster:

```bash
python -m codenames.cli challenge my_guesser.py --seed 42 --output log.json
```

**Options:**

- `--seed`: Random seed for reproducible boards
- `--model`: Embedding model (default: `sentence-transformers/all-MiniLM-L6-v2`)
- `--max-rounds`: Maximum rounds before timeout (default: 10)
- `--output`: Save JSON log with board state, clues, guesses, and result
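
To evaluate a guesser on more than one board, the options above can be combined in a small driver script. The following is a rough sketch: `challenge_command` and `run_sweep` are hypothetical helper names, the `log_<seed>.json` file naming is an arbitrary choice, and the meaning of the CLI's exit code is not documented here, so the codes are only collected, not interpreted.

```python
import subprocess
import sys


def challenge_command(guesser_path: str, seed: int, max_rounds: int = 10) -> list[str]:
    """Build the argv for one challenge run, logging to log_<seed>.json."""
    return [
        sys.executable, "-m", "codenames.cli", "challenge", guesser_path,
        "--seed", str(seed),
        "--max-rounds", str(max_rounds),
        "--output", f"log_{seed}.json",
    ]


def run_sweep(guesser_path: str, seeds: range) -> dict[int, int]:
    """Run the challenge once per seed and return each run's exit code."""
    results = {}
    for seed in seeds:
        completed = subprocess.run(challenge_command(guesser_path, seed))
        results[seed] = completed.returncode
    return results
```

For example, `run_sweep("my_guesser.py", range(5))` would play seeds 0 through 4 and leave one log file per seed in the working directory.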

## Log Format

The JSON output contains:

- `seed`: Random seed used
- `board_words`: All 25 words on the board
- `board_roles`: Role for each word (RED/BLUE/ASSASSIN)
- `rounds`: Array of rounds with clue, number, and guesses
- `final_state`: Win/loss status and rounds taken

Use this data to analyze performance or train ML models.
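
A minimal analysis sketch follows. The nested layout is assumed, not confirmed: it supposes that `board_roles` is a list aligned with `board_words`, that each round is an object with `clue`, `number`, and `guesses` keys, and that each guess is a plain word string. The sample log is a hypothetical four-word toy board, and `final_state` is left out because its exact shape is not specified above. Check a real log file and adjust the key names before relying on this.

```python
import json

# Hypothetical log in the shape described above; the nested key names
# ("clue", "number", "guesses") and the list-aligned board_roles are assumptions.
SAMPLE_LOG = json.loads("""
{
  "seed": 42,
  "board_words": ["apple", "river", "knife", "pear"],
  "board_roles": ["RED", "BLUE", "ASSASSIN", "RED"],
  "rounds": [
    {"clue": "fruit", "number": 2, "guesses": ["apple", "river"]},
    {"clue": "orchard", "number": 1, "guesses": ["pear"]}
  ]
}
""")


def summarize(log: dict) -> dict:
    """Count guesses by role and compute the share that hit RED words."""
    role_of = dict(zip(log["board_words"], log["board_roles"]))
    counts = {"RED": 0, "BLUE": 0, "ASSASSIN": 0}
    for rnd in log["rounds"]:
        for guess in rnd["guesses"]:
            counts[role_of[guess]] += 1
    total = sum(counts.values())
    return {
        "rounds": len(log["rounds"]),
        "guesses_by_role": counts,
        "red_accuracy": counts["RED"] / total if total else 0.0,
    }


summary = summarize(SAMPLE_LOG)
```

With a real log, replace `SAMPLE_LOG` with the result of `json.load` on the file written by `--output`.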