---
license: mit
tags:
  - reinforcement-learning
  - poker
  - stable-baselines3
  - ppo
  - texas-holdem
library_name: stable-baselines3
---

# PokerForge PPO Poker Bots

This repository contains runtime artifacts for PokerForge, a full-stack AI poker
platform built around heads-up No-Limit Texas Hold'em abstractions.

## Files

- `models/medium/ppo_medium_final.zip` - medium bot (PPO), trained for 1M timesteps against the easy bot.
- `models/hard/ppo_hard_final.zip` - hard bot (PPO), trained for 5M timesteps against the frozen medium bot.
- `models/*/best_model.zip` - best checkpoints saved by the training/evaluation callbacks. In the current manifest, `models/hard/best_model.zip` is byte-identical to `models/hard/ppo_hard_final.zip` (same size and SHA-256).
- `reports/evaluation_report.*` - latest reproducible bot-vs-bot evaluation report (`.json` and `.md`).
- `reports/representative_hands.json` - replay-ready sample hand logs for the frontend dashboard.

## Runtime Contract

- Framework: Stable-Baselines3 PPO
- Observation space: `Box(18,)`
- Action space: `Discrete(3)` where `0=fold`, `1=check/call`, `2=raise`
- Expected local paths inside PokerForge:
  - `backend/data/models/medium/ppo_medium_final.zip`
  - `backend/data/models/hard/ppo_hard_final.zip`
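
The contract above can be exercised with a minimal Python sketch like the one below. The helper names (`decode_action`, `check_observation`, `choose_action`) are hypothetical and not part of PokerForge, and the `float32` observation dtype is an assumption. The meaning of the 18 observation features is not documented here, so the sketch only checks the length.

```python
from typing import Sequence

OBS_DIM = 18  # from the contract: Box(18,)
ACTION_NAMES = {0: "fold", 1: "check/call", 2: "raise"}  # from the contract: Discrete(3)


def decode_action(action) -> str:
    """Map a Discrete(3) action index to its name."""
    action = int(action)  # also accepts a 0-d NumPy array
    if action not in ACTION_NAMES:
        raise ValueError(f"action must be 0, 1, or 2, got {action}")
    return ACTION_NAMES[action]


def check_observation(obs: Sequence[float]) -> None:
    """Raise if the observation does not have the 18 entries the models expect."""
    if len(obs) != OBS_DIM:
        raise ValueError(f"expected {OBS_DIM} features, got {len(obs)}")


def choose_action(model_path: str, obs: Sequence[float]) -> str:
    """Load a PPO checkpoint and return the name of the action it picks."""
    # Imported lazily so the helpers above work without these packages installed.
    import numpy as np
    from stable_baselines3 import PPO

    check_observation(obs)
    model = PPO.load(model_path)
    # Assumption: observations are passed as float32.
    action, _state = model.predict(
        np.asarray(obs, dtype=np.float32), deterministic=True
    )
    return decode_action(action)
```

For example, `choose_action("backend/data/models/medium/ppo_medium_final.zip", obs)` would return `"fold"`, `"check/call"`, or `"raise"` for an 18-element `obs` produced by the PokerForge environment. Loading the model on every call is only for brevity; a server would normally load it once and reuse it.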

## Evaluation Summary

The latest evaluation report is included under `reports/`. Its finding is that
medium and hard both beat easy, while hard shows only a marginal, statistically
weak edge over medium. This is attributed mainly to the 3-action abstraction,
which limits how far the two policies' behavior can differ.

## Reproduce In PokerForge

```bash
cd backend
python tools/download_models.py --repo-id Rushisagar221/pokerforge-bots --if-missing
python server.py
```

## Manifest

```json
{
  "repo_id": "Rushisagar221/pokerforge-bots",
  "generated_at": "2026-04-23T14:08:03.040902",
  "artifacts": [
    {
      "path": "models/medium/ppo_medium_final.zip",
      "bytes": 162131,
      "sha256": "b8ed8a7217de2bc790af71a0dbdc6a5a9fd695fcf541351bb965549d3c20c126"
    },
    {
      "path": "models/medium/best_model.zip",
      "bytes": 162116,
      "sha256": "31d26001f967b7d221af016ec1a4c5b1a33f32b71630cb2eea3bf9c8a2e59956"
    },
    {
      "path": "models/hard/ppo_hard_final.zip",
      "bytes": 165087,
      "sha256": "ac3b23fd8188713cd25bcbd1585cfc213d1a05b6254c56ba59ee7119de5896e1"
    },
    {
      "path": "models/hard/best_model.zip",
      "bytes": 165087,
      "sha256": "ac3b23fd8188713cd25bcbd1585cfc213d1a05b6254c56ba59ee7119de5896e1"
    },
    {
      "path": "reports/evaluation_report.json",
      "bytes": 51143,
      "sha256": "1ce452e2e57e67965f13337cb12a736cf658467e3db70ea20b52eaeddb67532a"
    },
    {
      "path": "reports/evaluation_report.md",
      "bytes": 2486,
      "sha256": "2ed28899b86090b97bdacbae1273c4366202344e42ab8b93013cdc260db378bb"
    },
    {
      "path": "reports/representative_hands.json",
      "bytes": 64864,
      "sha256": "33054b06fbfe819dbfb98faafa32534ff149bbd254f9250fb405786fbd2ecaf3"
    }
  ]
}
```
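
Downloaded files can be checked against the `bytes` and `sha256` fields of the manifest. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and it assumes the manifest JSON above has been saved to a local file and that artifact paths are relative to a chosen root directory.

```python
import hashlib
import json
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest: dict, root: Path) -> List[str]:
    """Return one message per artifact that is missing or does not match."""
    problems = []
    for entry in manifest["artifacts"]:
        target = root / entry["path"]
        if not target.is_file():
            problems.append(f"{entry['path']}: missing")
        elif target.stat().st_size != entry["bytes"]:
            problems.append(f"{entry['path']}: size mismatch")
        elif sha256_of(target) != entry["sha256"]:
            problems.append(f"{entry['path']}: sha256 mismatch")
    return problems
```

Assuming the manifest is stored as `manifest.json` (a hypothetical filename) next to the `models/` and `reports/` directories, `verify_manifest(json.loads(Path("manifest.json").read_text()), Path("."))` returns an empty list when every artifact matches.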