---
language: en
tags:
- wordle
- pytorch
- reinforcement-learning
- supervised-learning
- game-ai
- nlp
license: mit
---

# 🟩 Wordle AI Solver

Neural network models for solving Wordle puzzles. This repo contains two models — a supervised baseline and a reinforcement learning variant — both deployed in the [live app](https://wordle-solver-tan.vercel.app).

---

## Files

| File | Description |
|------|-------------|
| `model_weights.pt` | Supervised model (WordleNet) |
| `config.json` | Supervised model config |
| `rl_model_weights.pt` | RL model (REINFORCE-filtered) |
| `rl_config.json` | RL model config |
| `answers.json` | 2,315 valid Wordle answers |
| `allowed.json` | 12,972 valid guess words |

---

## Model Comparison

| | 🧠 Supervised | 🤖 Reinforcement |
|---|---|---|
| **Training method** | CrossEntropy on entropy-optimal games | REINFORCE with elite game filtering |
| **Win rate** | 100% | 98.2% |
| **Avg guesses** | 3.46 | 3.75 |
| **Opener** | CRANE | CRANE |
| **Parameters** | ~13M | ~13M |

---

## Architecture

Both models share the same encoder:

```
Input:  390-dim binary vector
        (26 letters × 5 positions × 3 states: grey/yellow/green)
Hidden: Linear(390 → 512) → BatchNorm1d → ReLU → Dropout(0.3)
        Linear(512 → 512) → BatchNorm1d → ReLU → Dropout(0.3)
        Linear(512 → 256) → BatchNorm1d → ReLU
Output: Linear(256 → 12972)
        logits over all 12,972 allowed guess words
```

Board encoding:

```python
vec[letter_index * 15 + position * 3 + state] = 1.0
# letter_index: 0-25 (a-z)
# position:     0-4
# state:        0=grey, 1=yellow, 2=green
```

---

## Training

### Supervised Model

Trained on ~10,000 (board_state, best_guess) pairs generated by an entropy-optimal solver that plays all 2,315 Wordle games. At each step the solver picks the guess maximising expected information gain over the feedback patterns $p$ it could produce:

$$E[\text{Info}] = \sum_{p} P(p) \cdot \log_2\left(\frac{1}{P(p)}\right)$$

### RL Model

1. **Warm start** from the supervised weights
2. **Elite game collection** — greedy rollouts with constraint-filtered action masking, keeping only games solved in ≤3 guesses (~11% hit rate)
3. **REINFORCE training** — supervised loss on the elite (state, action) pairs
4. **Benchmark** against all 2,315 answers using the constraint-filtered suggestion logic

The RL model learns purely from the reward signal (win/lose, guesses used), without access to the entropy oracle used to train the supervised model.

---

## Inference

The models are not used as raw classifiers — the backend combines model logits with constraint filtering:

```python
# 1. Get the top-20 model words
logits = model(encode_board(history))
model_words = [ALLOWED[i] for i in logits.topk(20).indices]

# 2. Filter to words consistent with all previous guesses
possible = filter_words(ANSWERS, history)

# 3. Score by entropy against the remaining possible set
candidates = model_words + possible
best = max(candidates, key=lambda w: entropy_score(w, possible))
```

This hybrid approach is why the supervised model achieves a 100% win rate: the neural net narrows the search, and entropy scoring picks the optimal move.
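The `feedback`, `filter_words`, and `entropy_score` helpers referenced in the inference snippet can be sketched in plain Python from the descriptions in this card. This is an illustrative reimplementation, not the backend's actual code: the two-pass feedback rule is standard Wordle scoring, the encoding follows the `letter*15 + position*3 + state` formula above, and the function names simply mirror the snippet.

```python
from collections import Counter
from math import log2

def feedback(guess: str, answer: str) -> list[int]:
    """Per-letter Wordle states (0=grey, 1=yellow, 2=green), standard two-pass rule."""
    states = [0] * 5
    remaining = Counter()  # non-green answer letters still available for yellows
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            states[i] = 2
        else:
            remaining[a] += 1
    for i, g in enumerate(guess):
        if states[i] == 0 and remaining[g] > 0:
            states[i] = 1
            remaining[g] -= 1
    return states

def encode_board(history: list[tuple[str, list[int]]]) -> list[float]:
    """390-dim binary board vector: vec[letter*15 + position*3 + state] = 1.0."""
    vec = [0.0] * 390
    for guess, states in history:
        for pos, (letter, state) in enumerate(zip(guess, states)):
            vec[(ord(letter) - ord("a")) * 15 + pos * 3 + state] = 1.0
    return vec

def filter_words(words: list[str], history) -> list[str]:
    """Keep only answers consistent with every (guess, feedback) pair so far."""
    return [w for w in words if all(feedback(g, w) == s for g, s in history)]

def entropy_score(guess: str, possible: list[str]) -> float:
    """Expected information gain: sum over feedback patterns p of P(p) * log2(1/P(p))."""
    buckets = Counter(tuple(feedback(guess, w)) for w in possible)
    n = len(possible)
    return sum((c / n) * log2(n / c) for c in buckets.values())
```

A guess scores highly when it splits the remaining answers into many small feedback buckets, which is why rescoring the top-20 model words by entropy can recover from an imperfect logit ranking.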
---

## Usage

```python
import json

import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download

REPO_ID = "sato2ru/wordle-solver"
config = json.load(open(hf_hub_download(REPO_ID, "config.json")))
ALLOWED = json.load(open(hf_hub_download(REPO_ID, "allowed.json")))

class WordleNet(nn.Module):
    def __init__(self):
        super().__init__()
        h = config["hidden"]
        self.net = nn.Sequential(
            nn.Linear(390, h), nn.BatchNorm1d(h), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(h, h), nn.BatchNorm1d(h), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(h, 256), nn.BatchNorm1d(256), nn.ReLU(),
            nn.Linear(256, 12972),
        )

    def forward(self, x):
        return self.net(x)

# Load the supervised model
model = WordleNet()
model.load_state_dict(
    torch.load(hf_hub_download(REPO_ID, "model_weights.pt"), map_location="cpu")
)
model.eval()
```

Or use the live API directly:

```bash
curl -X POST "https://web-production-ea1d.up.railway.app/suggest?model=supervised" \
  -H "Content-Type: application/json" \
  -d '{"history": []}'

curl -X POST "https://web-production-ea1d.up.railway.app/suggest?model=rl" \
  -H "Content-Type: application/json" \
  -d '{"history": []}'
```

---

## Results

### Supervised — all 2,315 answers (greedy + entropy filter)

```
1 guess  :    1
2 guesses:   59  ██
3 guesses: 1188  ██████████████████████████████████████████████
4 guesses: 1010  ███████████████████████████████████████
5 guesses:   56  ██
6 guesses:    1
FAILED   :    0

✅ 100% win rate
```

### RL — all 2,315 answers (greedy + entropy filter)

```
1 guess  :    1
2 guesses:  141  █████
3 guesses:  810  ███████████████████████████████
4 guesses:  893  ██████████████████████████████████
5 guesses:  343  █████████████
6 guesses:   86  ███
FAILED   :   41  ██

✅ 98.2% win rate
```

---

## Links

- **Live App:** [wordle-solver-tan.vercel.app](https://wordle-solver-tan.vercel.app)
- **GitHub:** [github.com/Jeanwrld/wordle-solver](https://github.com/Jeanwrld/wordle-solver)
- **Backend:** [github.com/Jeanwrld/wordle-api](https://github.com/Jeanwrld/wordle-api)
- **Gradio Demo:** [huggingface.co/spaces/sato2ru/wordle](https://huggingface.co/spaces/sato2ru/wordle)

---

## License

MIT