---
license: mit
tags:
- escampe
- board-game
- game-ai
- dual-perspective
- resnet
- evaluation-function
pipeline_tag: tabular-regression
datasets:
- Bluefir/escampe-dataset
---

# BandDPER (Escampe Evaluation Network)

## Model Details

**BandDPER** (Dual-Perspective Evaluation ResNet) is a specialized PyTorch neural network designed to evaluate board states for the game of Escampe. It serves as a fast and accurate heuristic function for a Java-based Alpha-Beta minimax engine, returning a score in $[-1, +1]$, where $-1$ indicates a losing position and $+1$ a winning one.

Because the model is deployed inside a lightweight Java client that computes inference without a dedicated tensor library, BandDPER is intentionally small, efficient, and mathematically simple to port. Notably, all `BatchNorm` parameters are fused into the adjacent `Conv2D` and `Linear` layers during export, so the Java client performs no batch normalization math at runtime.

### Model Architecture

The architecture processes the 6x6 Escampe board from two perspectives, which retains spatial symmetry without requiring the network to learn perspective inversion:

1. **Shared Spatial Encoder**: Both the current player's perspective tensor (`x_me`) and the opponent's perspective tensor (`x_opp`) are passed through a shared convolutional feature extractor.
   - A standard 3x3 Conv2D layer extracts local patterns.
   - A dilated 3x3 Conv2D layer (dilation=2) widens the kernel's effective footprint to 5x5, allowing the network to "see" across the 6x6 board without requiring a deep, expensive ResNet.
   - The output is flattened and projected into a 128-dimensional embedding, followed by a `ClippedReLU` (`clamp(0, 1)`) to ensure numerical stability.
2. **Feature Fusion**: The two 128-d perspective embeddings are concatenated (yielding 256 dims) along with 2 predictive scalar signals: the number of escape squares available to the White Unicorn and to the Black Unicorn.
3.
   **Residual Trunk**: The 258-dimensional vector is processed by a stack of standard Multilayer Perceptron (MLP) residual blocks.
4. **Output Head**: The final layers reduce the vector to a single scalar evaluation in `[-1.0, 1.0]` using a `tanh` activation. Additionally, a direct "forced-pass" shortcut bypasses the entire residual trunk to severely penalize the evaluation if the current player has no legal moves.

| Stage | Detail |
|-------|--------|
| Input | `[16, 6, 6]` tensor per perspective (16 channels encoding piece positions, band masks, movement constraints, occupancy maps, unicorn-relative attacker map) |
| Shared Spatial Encoder | `Conv(16→32) -> BN -> ReLU -> Conv(32→64, dil=2) -> BN -> ReLU -> Flatten -> Linear(2304→128) -> BN -> ClippedReLU[0,1]` |
| Siamese Fusion | Both perspectives encoded with shared weights, concatenated -> `[B, 256]` |
| Scalar Injection | Unicorn escape counts appended -> `[B, 258]` |
| Residual Trunk | 3 × ResBlock(258) with ClippedReLU |
| Output Head | `Linear(258→64) -> ReLU -> Linear(64→1) + w_forced_pass * forced_pass -> tanh` |
| Parameters | ~730K (float32, ~2.9 MB) |

## Uses

### Direct Use

This model is intended to be used as a static evaluation function for an Escampe minimax agent. It takes a tensor representation of a board state and, in a single forward pass, predicts the expected heuristic outcome from the current player's perspective.

```python
import torch
from huggingface_hub import hf_hub_download

# NOTE: the BandDPER class must be importable from the project's model code;
# it is not provided by torch or huggingface_hub.

# Download the checkpoint and load it for CPU inference
pth_path = hf_hub_download(repo_id="Bluefir/BandDPER", filename="banddper.pth")
model = BandDPER(num_res_blocks=3)  # 3 matches the "3 × ResBlock(258)" trunk in the table above
model.load_state_dict(torch.load(pth_path, map_location="cpu"))
model.eval()  # inference mode: BatchNorm uses its running statistics
```

### Downstream Use

The model weights are meant to be exported and fused using `export.py`, which folds batch normalization layers and exports raw weight arrays.
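The folding step can be illustrated with a minimal, dependency-free sketch. This is not the code from `export.py`, which is not shown here. It only demonstrates the standard identity for merging an inference-mode BatchNorm into the preceding `Linear` layer. The function names (`fold_bn_into_linear`, `linear`, `batchnorm_eval`) and the toy values are hypothetical, and `eps=1e-5` is assumed because it is the PyTorch BatchNorm default.

```python
import math


def fold_bn_into_linear(weight, bias, gamma, beta, running_mean, running_var, eps=1e-5):
    """Fold an inference-mode BatchNorm into the preceding Linear layer.

    weight is a list of rows with shape [out_features][in_features];
    every other argument is a list of length out_features.
    """
    folded_w, folded_b = [], []
    for row, b, g, bt, mu, var in zip(weight, bias, gamma, beta, running_mean, running_var):
        scale = g / math.sqrt(var + eps)          # per-output-unit scale
        folded_w.append([w * scale for w in row])  # W' = scale * W
        folded_b.append((b - mu) * scale + bt)     # b' = scale * (b - mean) + beta
    return folded_w, folded_b


def linear(weight, bias, x):
    """Plain y = W x + b, the only operation needed at runtime after folding."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def batchnorm_eval(y, gamma, beta, running_mean, running_var, eps=1e-5):
    """Reference BatchNorm in eval mode, used only to check the folded result."""
    return [g * (v - mu) / math.sqrt(var + eps) + bt
            for v, g, bt, mu, var in zip(y, gamma, beta, running_mean, running_var)]


# Toy 3 -> 2 layer with made-up parameters (hypothetical values).
W = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
b = [0.1, -0.2]
gamma, beta = [1.2, 0.8], [0.05, -0.1]
mean, var = [0.3, -0.4], [0.9, 1.6]
x = [1.0, 2.0, -1.0]

reference = batchnorm_eval(linear(W, b, x), gamma, beta, mean, var)
W_f, b_f = fold_bn_into_linear(W, b, gamma, beta, mean, var)
folded = linear(W_f, b_f, x)
print(max(abs(r - f) for r, f in zip(reference, folded)))  # difference is at floating-point noise level
```

The same per-output-channel scaling applies to a `Conv2D` layer, where each output filter is multiplied by its own scale. After folding, the Java side only needs multiply-accumulate loops and the activation functions.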
The exported weights are then loaded into the `DataExport.java` or `ClientJeu.java` engine to evaluate leaf nodes of an Alpha-Beta search tree.

## Training Details

| Hyperparameter | Value |
|----------------|-------|
| Epochs | 40 |
| Batch size | {BATCH_SIZE} |
| Optimiser | AdamW (lr=1e-3, wd=1e-4) with gradient clipping |
| LR schedule | CosineAnnealingWarmRestarts (T₀=40) |
| Loss | MSE |
| Dataset | [Escampe Dataset](https://huggingface.co/datasets/Bluefir/escampe-dataset) ({all}) |
| Best val loss | {best_val:.5f} |
| Data Representation | Processed bitboards into a spatial tensor. Channels encode piece locations, strict band masks, departure constraints, forced-pass conditions, and a custom unicorn-relative opponent attacker map to assist the convolution layers. |

### Files

| File | Description |
|------|-------------|
| `banddper.pth` | Full PyTorch state dict for resuming training or Python inference |
| `banddper_weights.json` | BatchNorm-folded weights for Java inference (no BN at runtime) |
| `training_curves.png` | Train / val MSE loss curves |

## Technical Specifications

- **Framework**: PyTorch
- **Input Shape**: Two `[16, 6, 6]` spatial tensors + 3 scalars (the two unicorn escape counts and the forced-pass flag).
- **Output Shape**: `[1]` scalar bounded between `[-1.0, 1.0]`.

## Out-of-Scope Use

This model is tightly coupled to the rules, 6x6 dimensions, and band mechanics of Escampe. It cannot be used for, or transferred to, other board games such as Chess or Checkers without completely replacing the input tensor builder and retraining from scratch.
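The coupling to the 6x6 board is visible in the layer widths listed under Model Architecture. As a rough check, the sketch below recomputes them with the standard convolution output-size formula. The padding values (1 for the plain convolution, 2 for the dilated one) are assumptions, since the card does not state them; they are the values that keep the grid at 6x6 and reproduce the `Linear(2304→128)` input width. The helper name `conv_out_size` is hypothetical.

```python
def conv_out_size(size, kernel, dilation=1, padding=0, stride=1):
    """Output length of a convolution along one spatial axis."""
    effective_kernel = dilation * (kernel - 1) + 1  # 3 for dilation=1, 5 for dilation=2
    return (size + 2 * padding - effective_kernel) // stride + 1


BOARD = 6            # Escampe board side length
EMBED_DIM = 128      # per-perspective embedding width
NUM_SCALARS = 2      # white and black unicorn escape counts

# Assumed paddings (not stated in the card) that preserve the 6x6 grid.
after_conv1 = conv_out_size(BOARD, kernel=3, dilation=1, padding=1)
after_conv2 = conv_out_size(after_conv1, kernel=3, dilation=2, padding=2)

flatten_width = 64 * after_conv2 * after_conv2   # 64 channels from the second convolution
fused_width = 2 * EMBED_DIM + NUM_SCALARS        # two perspectives plus the scalar signals

print(after_conv1, after_conv2)  # 6 6
print(flatten_width)             # 2304
print(fused_width)               # 258
```

Changing `BOARD` to another size changes `flatten_width`, which is one reason the trained weights cannot be reused on a different board.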