---
license: apache-2.0
language:
- en
---

# ConnectZero-Nakalipithecus

An AlphaZero-based reinforcement learning agent for the game Connect 4.

**Architecture:** ResNet (5 residual blocks) with a dual head (policy and value).

**Framework:** PyTorch.

**Training Platform:** Kaggle T4 GPU.

**Author: Chakrabhuana Vishnu Deva.**

# Training result

```
Total Parameter of the Model: 1,497,742
Starting Training for 5 Iterations...

--- Iteration 1 ---
Self-Playing 100 games...
Data Collected: 1359 samples
Avg Loss: 2.9339

--- Iteration 2 ---
Self-Playing 100 games...
Data Collected: 1644 samples
Avg Loss: 2.6747

--- Iteration 3 ---
Self-Playing 100 games...
Data Collected: 1739 samples
Avg Loss: 2.4139

--- Iteration 4 ---
Self-Playing 100 games...
Data Collected: 1678 samples
Avg Loss: 2.3377

--- Iteration 5 ---
Self-Playing 100 games...
Data Collected: 2370 samples
Avg Loss: 2.1712

Model Saved!
```
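
The training log above can be summarized with a few lines of standard-library Python. The sketch below is not part of the original training code: the `parse_log` helper and the `LOG` constant are hypothetical names introduced here, and the regular expression assumes the log keeps exactly the line format shown above. It reports samples per self-play game and the relative drop in average loss between the first and last iteration.

```python
import re

# Per-iteration lines copied from the training log above.
LOG = """\
--- Iteration 1 ---
Self-Playing 100 games...
Data Collected: 1359 samples
Avg Loss: 2.9339
--- Iteration 2 ---
Self-Playing 100 games...
Data Collected: 1644 samples
Avg Loss: 2.6747
--- Iteration 3 ---
Self-Playing 100 games...
Data Collected: 1739 samples
Avg Loss: 2.4139
--- Iteration 4 ---
Self-Playing 100 games...
Data Collected: 1678 samples
Avg Loss: 2.3377
--- Iteration 5 ---
Self-Playing 100 games...
Data Collected: 2370 samples
Avg Loss: 2.1712
"""


def parse_log(text):
    """Hypothetical helper: return one dict per iteration found in the log."""
    pattern = re.compile(
        r"--- Iteration (\d+) ---\s+"
        r"Self-Playing (\d+) games\.\.\.\s+"
        r"Data Collected: (\d+) samples\s+"
        r"Avg Loss: ([0-9.]+)"
    )
    return [
        {
            "iteration": int(it),
            "games": int(games),
            "samples": int(samples),
            "avg_loss": float(loss),
        }
        for it, games, samples, loss in pattern.findall(text)
    ]


rows = parse_log(LOG)

# Samples per game and average loss for each iteration.
for row in rows:
    per_game = row["samples"] / row["games"]
    print(f"iter {row['iteration']}: {per_game:.2f} samples/game, "
          f"avg loss {row['avg_loss']:.4f}")

total_samples = sum(row["samples"] for row in rows)
total_games = sum(row["games"] for row in rows)

# Relative loss reduction from the first to the last iteration.
first_loss = rows[0]["avg_loss"]
last_loss = rows[-1]["avg_loss"]
loss_drop = (first_loss - last_loss) / first_loss

print(f"total: {total_samples} samples over {total_games} games")
print(f"avg loss reduced by {loss_drop:.1%}")
```

Across the five iterations the log adds up to 8,790 samples from 500 self-play games, and the average loss falls by roughly 26% (2.9339 to 2.1712). Whether one sample corresponds to exactly one board position is an assumption, since the log does not say if positions are augmented (for example by mirroring), so samples per game should be read as a rough proxy for game length at best.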