---
license: apache-2.0
language:
- en
---

# ConnectZero-Nakalipithecus

An AlphaZero-based reinforcement learning agent for the game Connect 4.
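
Self-play for an agent like this needs a game environment that can list legal moves, apply a move, and detect a win. The card does not show the author's environment, so the following is only a minimal sketch under assumptions: a standard 6x7 board, players encoded as `1` and `-1`, and hypothetical helper names (`new_board`, `legal_moves`, `drop`, `is_win`).

```python
ROWS, COLS = 6, 7  # standard Connect 4 board size (assumed)


def new_board():
    """Return an empty board; 0 = empty, 1 and -1 are the two players."""
    return [[0] * COLS for _ in range(ROWS)]


def legal_moves(board):
    """Columns whose top cell is still empty."""
    return [c for c in range(COLS) if board[0][c] == 0]


def drop(board, col, player):
    """Drop a piece into `col` and return the row it lands in."""
    for r in range(ROWS - 1, -1, -1):  # scan from the bottom row upward
        if board[r][col] == 0:
            board[r][col] = player
            return r
    raise ValueError(f"column {col} is full")


def is_win(board, row, col):
    """True if the piece at (row, col) is part of four or more in a line."""
    player = board[row][col]
    if player == 0:
        return False
    # Horizontal, vertical, and the two diagonal directions.
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        count = 1
        for sign in (1, -1):  # walk both ways from the new piece
            r, c = row + sign * dr, col + sign * dc
            while 0 <= r < ROWS and 0 <= c < COLS and board[r][c] == player:
                count += 1
                r, c = r + sign * dr, c + sign * dc
        if count >= 4:
            return True
    return False
```

Checking only the lines through the most recent piece keeps the win test cheap, which matters when a tree search calls it at every simulated move.
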

**Architecture:** ResNet (5 residual blocks) with dual heads (policy and value).
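
The card gives the number of residual blocks but not the channel width or the head layout. As a rough way to reason about model size, the sketch below counts the parameters of a residual tower. It assumes each block holds two bias-free 3x3 convolutions, each followed by batch normalization, which is a common AlphaZero-style layout and not a confirmed detail of this model. The function names and the candidate widths are hypothetical.

```python
def conv_bn_params(in_ch, out_ch, kernel=3):
    """Weights of a bias-free conv plus the scale and shift of its batch norm."""
    return in_ch * out_ch * kernel * kernel + 2 * out_ch


def residual_tower_params(num_blocks, channels):
    """Parameter count of the tower, assuming two conv+BN layers per block."""
    return num_blocks * 2 * conv_bn_params(channels, channels)


# Compare a few candidate widths for a 5-block tower.
for width in (64, 128, 256):
    print(f"width {width}: {residual_tower_params(5, width):,} parameters")
```

Under these assumptions a hypothetical width of 128 gives 1,477,120 parameters for the tower alone, the same order as the 1,497,742 total reported in the training log. The input stem and the two heads are not counted here, and the card does not state the real width, so this is a sanity check and not a reconstruction.
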

**Framework:** PyTorch.

**Training Platform:** Kaggle T4 GPU.

**Author:** Chakrabhuana Vishnu Deva.

# Training Results

```
Total Parameter of the Model: 1,497,742
Starting Training for 5 Iterations...

--- Iteration 1 ---
Self-Playing 100 games...
Data Collected: 1359 samples
Avg Loss: 2.9339

--- Iteration 2 ---
Self-Playing 100 games...
Data Collected: 1644 samples
Avg Loss: 2.6747

--- Iteration 3 ---
Self-Playing 100 games...
Data Collected: 1739 samples
Avg Loss: 2.4139

--- Iteration 4 ---
Self-Playing 100 games...
Data Collected: 1678 samples
Avg Loss: 2.3377

--- Iteration 5 ---
Self-Playing 100 games...
Data Collected: 2370 samples
Avg Loss: 2.1712
Model Saved!
```
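
To make the trend in the log easier to read, the figures can be summarized with a few lines of Python. This sketch only re-derives numbers from the log above. The `LOG` string, variable names, and regular expressions are illustrative, and it assumes that "samples" counts training examples gathered from the 100 self-play games of each iteration. Whether that count includes augmented (for example mirrored) positions is not stated in the card.

```python
import re

# Figures copied from the training log above.
LOG = """\
--- Iteration 1 ---
Data Collected: 1359 samples
Avg Loss: 2.9339
--- Iteration 2 ---
Data Collected: 1644 samples
Avg Loss: 2.6747
--- Iteration 3 ---
Data Collected: 1739 samples
Avg Loss: 2.4139
--- Iteration 4 ---
Data Collected: 1678 samples
Avg Loss: 2.3377
--- Iteration 5 ---
Data Collected: 2370 samples
Avg Loss: 2.1712
"""

GAMES_PER_ITERATION = 100  # "Self-Playing 100 games..." in the log

samples = [int(n) for n in re.findall(r"Data Collected: (\d+) samples", LOG)]
losses = [float(x) for x in re.findall(r"Avg Loss: ([\d.]+)", LOG)]

for i, (n, loss) in enumerate(zip(samples, losses), start=1):
    per_game = n / GAMES_PER_ITERATION
    print(f"iteration {i}: {per_game:.2f} samples/game, avg loss {loss:.4f}")

total_samples = sum(samples)
relative_drop = (losses[0] - losses[-1]) / losses[0]
print(f"total samples: {total_samples}")
print(f"loss reduction from first to last iteration: {relative_drop:.1%}")
```

The average loss falls in every iteration, and the number of samples per game is highest in the final iteration. The log alone does not show how the loss is composed or how strong the resulting agent plays.
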