Battleship PPO Agent: Hanks1234/battleship-ppo-dagger
A MaskablePPO agent (from sb3-contrib) trained on a 10x20 Battleship board with custom T-shaped and Z-shaped ships.
Environment
- Board: 10 columns x 20 rows
- Ships: 10 ships including T-shaped Battleships and Z-shaped Carriers
- Observation: 5-channel binary image (5, 20, 10)
- Action: Discrete(200) with action masking (no repeat shots)
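The action space and mask described above can be sketched in a few lines. This is a minimal illustration, not the environment's actual code: it assumes a row-major layout (`action = row * COLS + col`), which matches the `(5, 20, 10)` channel/row/column ordering but is not confirmed here. The helper names are hypothetical.

```python
COLS, ROWS = 10, 20           # board: 10 columns x 20 rows
N_ACTIONS = COLS * ROWS       # 200, matching Discrete(200)


def action_to_cell(action):
    """Map a flat action index to (row, col). Row-major layout is an assumption."""
    if not 0 <= action < N_ACTIONS:
        raise ValueError(f"action {action} is outside Discrete({N_ACTIONS})")
    return divmod(action, COLS)


def cell_to_action(row, col):
    """Inverse of action_to_cell under the same assumed layout."""
    return row * COLS + col


def action_mask(fired):
    """Return one bool per action: True means the cell has not been shot yet."""
    return [a not in fired for a in range(N_ACTIONS)]


# Two shots already taken, so 198 actions remain legal.
mask = action_mask({cell_to_action(0, 0), cell_to_action(5, 7)})
```

A mask of this shape is what a masked policy consumes at each step so that repeat shots are never sampled.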
Training Config
| Parameter | Value |
|---|---|
| method | DAgger (Dataset Aggregation) |
| base_model | BC-pretrained MaskablePPO |
| observation_channels | 15 |
| board_size | 10x20 |
| expert | Monte Carlo solver (1000 samples) |
| disagree_only | True |
| confidence_threshold | 0.3 |
| freeze_cnn | True |
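To make the `expert` and `disagree_only` rows concrete, here is a toy sketch of the labelling step of one DAgger round. It is an illustration under stated assumptions, not the training code behind this model: the board is 4x4 with a single straight ship of length 3 (the real setup uses 10 ships, including T- and Z-shaped ones), all function names are hypothetical, and `confidence_threshold` is left out because its exact meaning is not stated here.

```python
import random


def sample_ship(rows, cols, length, rng):
    """Return one random straight ship as a frozenset of (row, col) cells."""
    if rng.random() < 0.5:  # horizontal placement
        r, c = rng.randrange(rows), rng.randrange(cols - length + 1)
        return frozenset((r, c + i) for i in range(length))
    r, c = rng.randrange(rows - length + 1), rng.randrange(cols)  # vertical
    return frozenset((r + i, c) for i in range(length))


def monte_carlo_expert(state, rows=4, cols=4, length=3, n_samples=1000, seed=0):
    """Pick the unshot cell covered by the most sampled placements that fit the state."""
    hits, misses = state
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_samples):
        ship = sample_ship(rows, cols, length, rng)
        if ship & misses or not hits <= ship:
            continue  # placement contradicts an observed miss or hit
        for cell in ship - hits:
            counts[cell] = counts.get(cell, 0) + 1
    return max(counts, key=counts.get) if counts else None


def aggregate(dataset, states, learner, expert, disagree_only=True):
    """Append (state, expert_action) pairs; optionally keep only disagreements."""
    for state in states:
        label = expert(state)
        if label is None:
            continue  # no consistent placement was sampled
        if disagree_only and learner(state) == label:
            continue  # learner already agrees with the expert
        dataset.append((state, label))
    return dataset


# One hit at (1, 1), misses directly above and below it.
state = (frozenset({(1, 1)}), frozenset({(0, 1), (2, 1)}))
dataset = aggregate([], [state], learner=lambda s: (1, 0), expert=monte_carlo_expert)
```

In a full DAgger loop the states would come from rollouts of the current policy, and the policy would be retrained on the aggregated dataset after each round.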
Evaluation Results
| Metric | Value |
|---|---|
| mean_shots_500_games | 100.90 |
| verified_games | 500 |
| seed | 20000 |
| notes | 15-channel DAgger; first model to break 100-shot barrier |
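A metric like `mean_shots_500_games` can be computed with a seeded evaluation loop along these lines. This is a hedged sketch only: the toy 4x4 board, the random ship cells, the random baseline policy, and the per-game seeding scheme (`seed + game_index`) are all assumptions for illustration, and every name is hypothetical.

```python
import random
from statistics import mean


def play_game(policy, rows, cols, ship_cells, rng):
    """Play one game and return the number of shots needed to hit every ship cell."""
    remaining = set(ship_cells)
    fired = set()
    shots = 0
    while remaining:
        legal = [a for a in range(rows * cols) if a not in fired]  # no repeat shots
        action = policy(legal, rng)
        fired.add(action)
        shots += 1
        remaining.discard(divmod(action, cols))  # row-major (row, col), assumed
    return shots


def mean_shots(policy, n_games, seed, rows=4, cols=4, n_ship_cells=3):
    """Average shots-to-win over n_games, each with its own derived seed."""
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    results = []
    for game in range(n_games):
        rng = random.Random(seed + game)  # assumed seeding scheme
        ship_cells = rng.sample(cells, n_ship_cells)
        results.append(play_game(policy, rows, cols, ship_cells, rng))
    return mean(results)


def random_policy(legal, rng):
    """Baseline: shoot a uniformly random legal cell."""
    return rng.choice(legal)


baseline = mean_shots(random_policy, n_games=50, seed=20000)
```

Fixing the base seed makes the reported mean reproducible across runs, so different models can be compared on the same set of boards.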
Usage
```python
from training.hub import load_model_from_hub

model = load_model_from_hub("Hanks1234/battleship-ppo-dagger")
```