---
title: Codenames LLM Challenge
emoji: 🕵️
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 6.6.0
python_version: '3.11'
app_file: app.py
pinned: true
---

# Codenames LLM Challenge

A Python framework for students to implement guesser bots for Codenames. The LLM acts as spymaster using embeddings.

## Game Rules

**Challenge Mode (Single Team):**

- Goal: Guess all RED words in minimum rounds
- Board: 25 words total (9 RED, 8 BLUE, 8 ASSASSIN)
- Each round: LLM spymaster gives a clue + number
- Guesser makes up to (number + 1) guesses
- Round ends if: BLUE word revealed, max guesses reached, or guesser stops
- Game ends: WIN if all RED found, LOSE if ASSASSIN revealed

## Setup

```bash
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
```

**Dictionary:** Fixed list of 420 Codenames words. Clues and board words must be from this dictionary (case-insensitive).

**Pre-build Embedding Cache (Recommended):**

```bash
python -m codenames.cli init-cache
```

Downloads the embedding model and computes vectors for all 420 words (~30 seconds). Cached for reuse.
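
Before writing a guesser, it can help to see the round rules from the Game Rules section as code. The sketch below is only an illustration under stated assumptions, not the framework's actual implementation: `play_round`, the `roles` mapping, and the `revealed` set are hypothetical names, and raising `ValueError` on an invalid guess is an assumption about how such a guess might be handled.

```python
from typing import Callable, Optional

# Same shape as the guesser the challenge expects: (clue, unrevealed words) -> word or None.
Guesser = Callable[[str, list[str]], Optional[str]]


def play_round(
    clue: str,
    number: int,
    guesser: Guesser,
    roles: dict[str, str],   # hypothetical: word -> "RED" / "BLUE" / "ASSASSIN"
    revealed: set[str],      # hypothetical: words revealed so far (mutated in place)
) -> str:
    """Play one round and return "WIN", "LOSE", or "CONTINUE"."""
    for _ in range(number + 1):  # the guesser gets up to (number + 1) guesses
        unrevealed = [w for w in roles if w not in revealed]
        guess = guesser(clue, unrevealed)
        if guess is None:
            break  # the guesser chose to stop the round
        if guess not in unrevealed:
            # Assumption: the real framework may handle invalid guesses differently.
            raise ValueError(f"invalid guess: {guess!r}")
        revealed.add(guess)
        role = roles[guess]
        if role == "ASSASSIN":
            return "LOSE"  # revealing an ASSASSIN word loses the game
        if all(w in revealed for w, r in roles.items() if r == "RED"):
            return "WIN"  # every RED word has been found
        if role == "BLUE":
            break  # revealing a BLUE word ends the round
    return "CONTINUE"
```

Calling `play_round` repeatedly with fresh clues until it returns `"WIN"` or `"LOSE"` (or a round limit is reached) gives a rough mental model of a full game.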
## Test Your Guesser

Create a Python file with a `guesser` function:

```python
# my_guesser.py
def guesser(clue: str, board_state: list[str]) -> str | None:
    """
    Args:
        clue: The spymaster's one-word clue (from dictionary)
        board_state: List of unrevealed words on the board

    Returns:
        A word to guess from board_state, or None to stop the round
    """
    # Your embedding-based or heuristic logic here
    return board_state[0]  # Simple example: always guess first word
```

Run against LLM spymaster:

```bash
python -m codenames.cli challenge my_guesser.py --seed 42 --output log.json
```

**Options:**

- `--seed`: Random seed for reproducible boards
- `--model`: Embedding model (default: `sentence-transformers/all-MiniLM-L6-v2`)
- `--max-rounds`: Maximum rounds before timeout (default: 10)
- `--output`: Save JSON log with board state, clues, guesses, and result

## Log Format

The JSON output contains:

- `seed`: Random seed used
- `board_words`: All 25 words on the board
- `board_roles`: Role for each word (RED/BLUE/ASSASSIN)
- `rounds`: Array of rounds with clue, number, and guesses
- `final_state`: Win/loss status and rounds taken

Use this data to analyze performance or train ML models.
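
As a starting point for that kind of analysis, here is a minimal sketch that tallies guesses by role from a saved log. The exact JSON layout is not spelled out above, so several things are assumptions: each entry in `rounds` is taken to have a `guesses` key holding a list of guessed words, and `board_roles` is taken to be either a word-to-role mapping or a list parallel to `board_words`. `load_log` and `summarize_log` are hypothetical helper names, not part of the framework.

```python
import json
from collections import Counter


def load_log(path: str) -> dict:
    """Read a log file written with --output."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize_log(log: dict) -> dict:
    """Count guesses per role and report the share of guesses that hit RED words."""
    words = log["board_words"]
    roles = log["board_roles"]
    # Assumption: board_roles is a word -> role dict, or a list aligned with board_words.
    role_of = dict(roles) if isinstance(roles, dict) else dict(zip(words, roles))

    counts: Counter = Counter()
    for rnd in log["rounds"]:
        for guess in rnd["guesses"]:  # assumed key: list of guessed words
            counts[role_of[guess]] += 1

    total = sum(counts.values())
    return {
        "rounds_played": len(log["rounds"]),
        "total_guesses": total,
        "guesses_by_role": dict(counts),
        "red_precision": counts["RED"] / total if total else 0.0,
    }
```

With a log saved as `log.json`, `summarize_log(load_log("log.json"))` would return the summary dictionary; adjust the key names if the actual log uses different ones.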