---
license: cc-by-nc-4.0
tags:
- code
- chess
- game
- CNN
- ResNet
---


# **GChess: A Deep Residual Network for Chess**

## Model Description
The **GChess** model is a deep neural network designed for the game of chess, inspired by the **AlphaZero** architecture. It uses a single network to perform both move prediction (Policy) and position evaluation (Value).

This release is a **proof-of-concept** version. The model's current estimated playing strength is **~1300 Elo**, placing it at a beginner to intermediate level. It demonstrates a robust foundation for an AlphaZero-style chess AI.

---

## Architecture Details

GChess is built on a **Deep Residual Network (ResNet)**, which is highly effective for processing the spatial features of an 8x8 board.

### **Core Network (Torso)**
* **Architecture Type:** Deep Residual Network (ResNet).
* **Residual Blocks:** **20** blocks, enabling deep, hierarchical feature learning.
* **Filter Count:** **512** convolutional filters (channels) per main layer.
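
As an illustration, a single torso block at 512 channels might look like the following PyTorch sketch. The exact layer ordering, normalization, and activation choices are assumptions here; the model card does not specify them:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Hypothetical sketch of one torso block: two 3x3 convolutions with
    batch normalization and a skip connection, AlphaZero-style."""

    def __init__(self, channels: int = 512):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Skip connection: the input is added back before the final activation
        return torch.relu(out + x)

block = ResidualBlock(channels=512)
x = torch.randn(1, 512, 8, 8)
print(block(x).shape)  # torch.Size([1, 512, 8, 8])
```

The skip connection keeps gradients flowing through all 20 blocks, which is what makes a torso this deep trainable.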

### **Input Representation**
The network accepts a multi-plane tensor encoding the board state and history:
* **Input Channels:** **128** input channels.
* **Data Included:** Piece locations, player to move, castling rights, and **8-ply history** to handle repetition and context.
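
To make the encoding concrete, here is a minimal sketch of the piece planes for a single history step: 12 binary planes (6 piece types x 2 colors). The full 128-channel input would stack 8 such steps plus metadata planes (side to move, castling rights, etc.); the exact plane layout and orientation used here are assumptions, not the model's actual encoding:

```python
import chess
import numpy as np

def board_planes(board: chess.Board) -> np.ndarray:
    """Encode one position as 12 binary 8x8 planes (hypothetical layout:
    planes 0-5 = white pawn..king, planes 6-11 = black pawn..king)."""
    planes = np.zeros((12, 8, 8), dtype=np.float32)
    for square, piece in board.piece_map().items():
        # piece_type runs 1 (pawn) .. 6 (king); offset black pieces by 6
        plane = (piece.piece_type - 1) + (0 if piece.color == chess.WHITE else 6)
        planes[plane, square // 8, square % 8] = 1.0
    return planes

planes = board_planes(chess.Board())  # starting position: 32 pieces set
```

A `board_to_tensor` built this way would concatenate 8 history steps of such planes with the metadata planes to reach 128 channels.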

### **Dual Output Heads**
The shared ResNet torso branches into two specialized output heads:

| Head | Function | Output Format |
| :--- | :--- | :--- |
| **Policy Head (p\_logits)** | **Move Prediction** | Logits over **4672** possible moves/actions. |
| **Value Head (v)** | **Position Evaluation** | Single scalar value in [-1.0, +1.0]. |

| Value Interpretation | Score |
| :--- | :--- |
| **White Winning** | Close to +1.0 |
| **Black Winning** | Close to -1.0 |
| **Equal Position** | Close to 0.0 |
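
The 4672-dimensional policy output matches the AlphaZero chess action space: 64 from-squares x 73 move types (56 queen-like slides, 8 knight moves, 9 underpromotions). Assuming GChess uses the same flat ordering (the card does not specify it), an action index could be decoded as:

```python
# Assumed AlphaZero-style layout: index = from_square * 73 + move_type.
# Whether GChess uses exactly this ordering is an assumption.
N_FROM_SQUARES = 64
N_MOVE_TYPES = 73  # 56 queen-like + 8 knight + 9 underpromotion
assert N_FROM_SQUARES * N_MOVE_TYPES == 4672

def decode_action(index: int) -> tuple[int, int]:
    """Split a flat policy index into (from_square, move_type)."""
    return divmod(index, N_MOVE_TYPES)

print(decode_action(100))  # (1, 27): from-square b1, move type 27
```
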

---

## Training Summary

The model was trained on a small dataset of **50,000 high-quality PGN games** across **25 epochs**.

### **Convergence Analysis**
The training process was stable and highly efficient, utilizing an aggressive learning rate strategy.

* **Training Loss Curve:**
    ![Training Loss](training_loss.png)
    The loss shows a rapid initial drop, signifying quick learning of fundamental concepts, followed by a smooth convergence.

* **Detailed Loss Convergence:**
    ![Detailed Loss convergences](detailed_training_loss.png)
    A detailed view reveals short-term oscillations, which are expected under dynamic learning-rate scheduling, while the overall trend remains consistently downward toward a low, stable loss.

* **Accuracy Evaluations:**
    ![Accuracy Evaluations](accuracy_evals.png)
    Both Top-1 and Top-5 accuracy show clear, consistent upward trends, confirming that the network learned to predict expert moves without overfitting to the limited data. The high Top-5 accuracy indicates the model reliably produces a strong list of candidate moves.

---

## Conclusion and Future Outlook

* **Current Performance:** The model achieved an estimated **1300 Elo**. While this is an entry-level performance, it's a strong result considering the resource constraints.
* **Strong Foundation:** The architecture is structurally sound, and the training process demonstrated effective learning.
* **Future Potential:** The established architecture is well-suited for scaling. With a significantly larger, more diverse dataset (e.g., millions of games) and extended training, this model has the foundation to reach expert and master Elo levels.

---

## Usage

To use the GChess model for inference, you must convert a `chess.Board` object and its history into the required **128-channel input tensor**.

```python
import chess
import torch
import torch.nn.functional as F

# NOTE: The 'model' object must be loaded from a checkpoint, and 
# 'board_to_tensor' function must be implemented separately 
# to generate the 128-channel input.
# DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define Input State (FEN)
# Example: Initial position
fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
board = chess.Board(fen)

# --- Preprocess Input (Requires custom function) ---
# input_tensor = board_to_tensor(board, history_depth=8).unsqueeze(0).to(DEVICE)
# Placeholder tensor for execution:
input_tensor = torch.randn(1, 128, 8, 8) 
model = torch.nn.Module()  # Placeholder; load the real GChess checkpoint here
model.eval()  # Evaluation mode: disables dropout and batch-norm updates

# Run Inference
with torch.no_grad():
    # policy_logits is a tensor of size 4672, value_output is a scalar tensor
    # policy_logits, value_output = model(input_tensor)
    
    # Placeholder outputs for demonstration:
    policy_logits = torch.randn(1, 4672) 
    value_output = torch.tensor([[0.25]])

# Post-process Output
policy_probabilities = F.softmax(policy_logits, dim=1).squeeze(0)

# Find the move with the highest predicted probability
best_action_index = torch.argmax(policy_probabilities).item()
best_probability = policy_probabilities[best_action_index].item()

# Extract the value prediction
expected_value = value_output.item()

# Print Results
print(f"FEN: {fen}")
print("--- Model Prediction ---")
print(f"Predicted Probability of Top Move: {best_probability:.4f}")
print(f"Position Evaluation (Value): {expected_value:.4f}")
print("Interpretation: Value close to +1.0 means White is winning, -1.0 means Black is winning.")
```
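
Note that a raw `argmax` over all 4672 logits can select an illegal move. A common refinement, sketched here under the assumption of a hypothetical `move_to_index` mapping (not provided by this model card), is to mask the logits down to the legal moves before the softmax:

```python
import chess
import torch
import torch.nn.functional as F

def legal_move_probs(policy_logits: torch.Tensor, board: chess.Board,
                     move_to_index) -> torch.Tensor:
    """Zero out illegal actions by setting their logits to -inf before
    softmax. `move_to_index` maps a chess.Move to the model's flat action
    index; its exact form is an assumption."""
    masked = torch.full_like(policy_logits, float("-inf"))
    for move in board.legal_moves:
        idx = move_to_index(move)
        masked[idx] = policy_logits[idx]
    return F.softmax(masked, dim=-1)

# Toy demo with a fake mapping (the 20 legal opening moves -> indices 0..19):
board = chess.Board()
fake_index = {m: i for i, m in enumerate(board.legal_moves)}
probs = legal_move_probs(torch.randn(4672), board, lambda m: fake_index[m])
```

With the mask applied, `argmax` over `probs` is guaranteed to pick a legal move, which matters most in the endgame where only a few of the 4672 actions are valid.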

Developer: PENEAUX Benjamin