---
language: en
tags:
- wordle
- pytorch
- reinforcement-learning
- supervised-learning
- game-ai
- nlp
license: mit
---

# 🟩 Wordle AI Solver

Neural network models for solving Wordle puzzles. This repo contains two models — a supervised baseline and a reinforcement learning variant — both deployed in the [live app](https://wordle-solver-tan.vercel.app).

---

## Files

| File | Description |
|------|-------------|
| `model_weights.pt` | Supervised model (WordleNet) |
| `config.json` | Supervised model config |
| `rl_model_weights.pt` | RL model (REINFORCE-filtered) |
| `rl_config.json` | RL model config |
| `answers.json` | 2,315 valid Wordle answers |
| `allowed.json` | 12,972 valid guess words |

---

## Model Comparison

| | 🧠 Supervised | 🤖 Reinforcement |
|---|---|---|
| **Training method** | CrossEntropy on entropy-optimal games | REINFORCE with elite game filtering |
| **Win rate** | 100% | 98.2% |
| **Avg guesses** | 3.46 | 3.75 |
| **Opener** | CRANE | CRANE |
| **Parameters** | ~13M | ~13M |

---

## Architecture

Both models share the same encoder:

```
Input:  390-dim binary vector
        (26 letters × 5 positions × 3 states: grey/yellow/green)
Hidden: Linear(390 → 512) → BatchNorm1d → ReLU → Dropout(0.3)
        Linear(512 → 512) → BatchNorm1d → ReLU → Dropout(0.3)
        Linear(512 → 256) → BatchNorm1d → ReLU
Output: Linear(256 → 12972)
        logits over all 12,972 allowed guess words
```

Board encoding:

```python
vec[letter_index * 15 + position * 3 + state] = 1.0
# letter_index: 0-25 (a-z)
# position:     0-4
# state:        0=grey, 1=yellow, 2=green
```

---

## Training

### Supervised Model

Trained on ~10,000 (board_state, best_guess) pairs generated by an entropy-optimal solver that plays all 2,315 Wordle games. At each step the solver picks the guess maximising expected information gain over the feedback patterns $p$ it could produce:

$$E[\text{Info}] = \sum_{p} P(p) \cdot \log_2\left(\frac{1}{P(p)}\right)$$

### RL Model

1. **Warm start** from the supervised weights
2. **Elite game collection** — greedy rollouts with constraint-filtered action masking, keeping only games solved in ≤3 guesses (~11% hit rate)
3. **REINFORCE training** — supervised loss on the elite (state, action) pairs
4. **Benchmark** against all 2,315 answers using the constraint-filtered suggestion logic

The RL model learns purely from the reward signal (win/lose, guesses used), without access to the entropy oracle used to train the supervised model.

---

## Inference

The models are not used as raw classifiers — the backend combines model logits with constraint filtering:

```python
# 1. Get the top-20 model words
logits = model(encode_board(history))
model_words = [ALLOWED[i] for i in logits.topk(20).indices]

# 2. Filter to words consistent with all previous guesses
possible = filter_words(ANSWERS, history)

# 3. Score by entropy against the remaining possible set
candidates = model_words + possible
best = max(candidates, key=lambda w: entropy_score(w, possible))
```

This hybrid approach is why the supervised model achieves a 100% win rate: the neural net narrows the search, and entropy scoring picks the optimal move.
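The `feedback`, `filter_words`, and `entropy_score` helpers referenced in the inference snippet can be sketched in plain Python from the descriptions in this card. This is an illustrative reimplementation, not the backend's actual code: the two-pass feedback rule is standard Wordle scoring, the encoding follows the `letter*15 + position*3 + state` formula above, and the function names simply mirror the snippet.

```python
from collections import Counter
from math import log2

def feedback(guess: str, answer: str) -> list[int]:
    """Per-letter Wordle states (0=grey, 1=yellow, 2=green), standard two-pass rule."""
    states = [0] * 5
    remaining = Counter()  # non-green answer letters still available for yellows
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            states[i] = 2
        else:
            remaining[a] += 1
    for i, g in enumerate(guess):
        if states[i] == 0 and remaining[g] > 0:
            states[i] = 1
            remaining[g] -= 1
    return states

def encode_board(history: list[tuple[str, list[int]]]) -> list[float]:
    """390-dim binary board vector: vec[letter*15 + position*3 + state] = 1.0."""
    vec = [0.0] * 390
    for guess, states in history:
        for pos, (letter, state) in enumerate(zip(guess, states)):
            vec[(ord(letter) - ord("a")) * 15 + pos * 3 + state] = 1.0
    return vec

def filter_words(words: list[str], history) -> list[str]:
    """Keep only answers consistent with every (guess, feedback) pair so far."""
    return [w for w in words if all(feedback(g, w) == s for g, s in history)]

def entropy_score(guess: str, possible: list[str]) -> float:
    """Expected information gain: sum over feedback patterns p of P(p) * log2(1/P(p))."""
    buckets = Counter(tuple(feedback(guess, w)) for w in possible)
    n = len(possible)
    return sum((c / n) * log2(n / c) for c in buckets.values())
```

A guess scores highly when it splits the remaining answers into many small feedback buckets, which is why rescoring the top-20 model words by entropy can recover from an imperfect logit ranking.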
---

## Usage

```python
import json

import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download

REPO_ID = "sato2ru/wordle-solver"
config = json.load(open(hf_hub_download(REPO_ID, "config.json")))
ALLOWED = json.load(open(hf_hub_download(REPO_ID, "allowed.json")))

class WordleNet(nn.Module):
    def __init__(self):
        super().__init__()
        h = config["hidden"]
        self.net = nn.Sequential(
            nn.Linear(390, h), nn.BatchNorm1d(h), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(h, h), nn.BatchNorm1d(h), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(h, 256), nn.BatchNorm1d(256), nn.ReLU(),
            nn.Linear(256, 12972),
        )

    def forward(self, x):
        return self.net(x)

# Load the supervised model
model = WordleNet()
model.load_state_dict(
    torch.load(hf_hub_download(REPO_ID, "model_weights.pt"), map_location="cpu")
)
model.eval()
```

Or use the live API directly:

```bash
curl -X POST "https://web-production-ea1d.up.railway.app/suggest?model=supervised" \
  -H "Content-Type: application/json" \
  -d '{"history": []}'

curl -X POST "https://web-production-ea1d.up.railway.app/suggest?model=rl" \
  -H "Content-Type: application/json" \
  -d '{"history": []}'
```

---

## Results

### Supervised — all 2,315 answers (greedy + entropy filter)

```
1 guess  :    1
2 guesses:   59  ██
3 guesses: 1188  ██████████████████████████████████████████████
4 guesses: 1010  ███████████████████████████████████████
5 guesses:   56  ██
6 guesses:    1
FAILED   :    0

✅ 100% win rate
```

### RL — all 2,315 answers (greedy + entropy filter)

```
1 guess  :    1
2 guesses:  141  █████
3 guesses:  810  ███████████████████████████████
4 guesses:  893  ██████████████████████████████████
5 guesses:  343  █████████████
6 guesses:   86  ███
FAILED   :   41  ██

✅ 98.2% win rate
```

---

## Links

- **Live App:** [wordle-solver-tan.vercel.app](https://wordle-solver-tan.vercel.app)
- **GitHub:** [github.com/Jeanwrld/wordle-solver](https://github.com/Jeanwrld/wordle-solver)
- **Backend:** [github.com/Jeanwrld/wordle-api](https://github.com/Jeanwrld/wordle-api)
- **Gradio Demo:** [huggingface.co/spaces/sato2ru/wordle](https://huggingface.co/spaces/sato2ru/wordle)

---

## License

MIT