	---
	license: mit
	tags:
	- reinforcement-learning
	- poker
	- stable-baselines3
	- ppo
	- texas-holdem
	library_name: stable-baselines3
	---

	# PokerForge PPO Poker Bots

	This repository contains runtime artifacts for PokerForge, a full-stack AI poker
	platform built around heads-up No-Limit Texas Hold'em abstractions.

	## Files

	- `models/medium/ppo_medium_final.zip` - PPO medium bot, trained for 1M timesteps against the easy bot.
	- `models/hard/ppo_hard_final.zip` - PPO hard bot, trained for 5M timesteps against the frozen medium bot.
	- `models/*/best_model.zip` - best checkpoints saved by the training/evaluation callbacks.
	- `reports/evaluation_report.*` - latest reproducible bot-vs-bot evaluation report.
	- `reports/representative_hands.json` - replay-ready sample hand logs for the frontend dashboard.

	## Runtime Contract

	- Framework: Stable-Baselines3 PPO
	- Observation space: `Box(18,)`
	- Action space: `Discrete(3)` where `0=fold`, `1=check/call`, `2=raise`
	- Expected local paths inside PokerForge:
	  - `backend/data/models/medium/ppo_medium_final.zip`
	  - `backend/data/models/hard/ppo_hard_final.zip`
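
The contract above can be exercised with a short sketch like the following. It assumes `stable-baselines3` and NumPy are installed and that an observation is a flat vector of 18 numbers passed as `float32` (the dtype is an assumption). The helper names (`check_observation`, `decode_action`, `choose_action`) are illustrative and not part of PokerForge. How the 18 features are constructed is not documented here, so the sketch checks only the length.

```python
from typing import Sequence

# Action indices as stated in the runtime contract.
ACTION_NAMES = {0: "fold", 1: "check/call", 2: "raise"}
OBS_SIZE = 18  # Box(18,)


def check_observation(obs: Sequence[float]) -> None:
    """Raise ValueError unless obs is a flat vector of OBS_SIZE numbers."""
    if len(obs) != OBS_SIZE:
        raise ValueError(f"expected {OBS_SIZE} features, got {len(obs)}")


def decode_action(action) -> str:
    """Map a Discrete(3) index to its name; reject anything else."""
    try:
        return ACTION_NAMES[int(action)]
    except KeyError:
        raise ValueError(f"action {action!r} is outside Discrete(3)") from None


def choose_action(model_path: str, obs: Sequence[float]) -> str:
    """Load a PPO bot and return the name of its deterministic action."""
    # Imported lazily so the helpers above work without these packages.
    import numpy as np
    from stable_baselines3 import PPO

    check_observation(obs)
    model = PPO.load(model_path)
    # float32 is an assumed dtype; the README only states the shape.
    action, _state = model.predict(
        np.asarray(obs, dtype=np.float32), deterministic=True
    )
    return decode_action(action)
```

Inside PokerForge, a call might look like `choose_action("backend/data/models/hard/ppo_hard_final.zip", obs)`, where `obs` comes from whatever encoder the backend uses.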

	## Evaluation Summary

	The latest evaluation report is included under `reports/`. Its main finding is
	that the medium and hard bots both beat the easy bot, while the hard bot shows
	only a marginal, statistically weak edge over the medium bot. This is attributed
	mainly to the 3-action abstraction, which limits how far the two policies'
	behavior can diverge.
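
One way to read "statistically weak" is that a confidence interval on the mean per-hand result still contains zero. The sketch below computes a normal-approximation 95% interval. It is an illustration only: the statistic and test used in the evaluation report are not described in this README, the function names are hypothetical, and the sample values are made up.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def mean_confidence_interval(
    samples: Sequence[float], z: float = 1.96
) -> Tuple[float, float, float]:
    """Return (mean, low, high) using a normal approximation."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    m = mean(samples)
    half_width = z * stdev(samples) / math.sqrt(len(samples))
    return m, m - half_width, m + half_width


def edge_is_weak(samples: Sequence[float]) -> bool:
    """True when the interval contains zero, so the sign of the edge is unclear."""
    _, low, high = mean_confidence_interval(samples)
    return low <= 0.0 <= high


# Hypothetical per-hand chip results for hard vs medium (not real data).
hypothetical_results = [4.0, -3.0, 2.0, -1.0, 5.0, -4.0, 1.0, -2.0]
print(edge_is_weak(hypothetical_results))
```

With few hands or high variance the interval is wide, which is one reason a small true edge can be hard to confirm.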

	## Reproduce In PokerForge

	```bash
	cd backend
	python tools/download_models.py --repo-id Rushisagar221/pokerforge-bots --if-missing
	python server.py
	```
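
The internals of `tools/download_models.py` are not shown in this README. As a rough sketch of what an "if missing" download could look like, the following uses `hf_hub_download` from `huggingface_hub` and assumes that repository paths map directly onto a local root such as `backend/data/`. The function `fetch_if_missing` and its `downloader` parameter are hypothetical.

```python
from pathlib import Path
from typing import Callable, Optional

REPO_ID = "Rushisagar221/pokerforge-bots"


def fetch_if_missing(
    filename: str,
    dest_root: Path,
    downloader: Optional[Callable[..., str]] = None,
) -> bool:
    """Download repo file `filename` under dest_root unless it already exists.

    Returns True if a download was attempted, False if the file was present.
    """
    target = dest_root / filename
    if target.exists():
        return False
    if downloader is None:
        # Imported lazily so the existence check works without the package.
        from huggingface_hub import hf_hub_download

        downloader = hf_hub_download
    # With local_dir set, the file is placed at local_dir/filename.
    downloader(repo_id=REPO_ID, filename=filename, local_dir=str(dest_root))
    return True
```

For example, `fetch_if_missing("models/hard/ppo_hard_final.zip", Path("backend/data"))` would target `backend/data/models/hard/ppo_hard_final.zip`, matching the expected local path.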

	## Manifest

	```json
	{
	  "repo_id": "Rushisagar221/pokerforge-bots",
	  "generated_at": "2026-04-23T14:08:03.040902",
	  "artifacts": [
	    {
	      "path": "models/medium/ppo_medium_final.zip",
	      "bytes": 162131,
	      "sha256": "b8ed8a7217de2bc790af71a0dbdc6a5a9fd695fcf541351bb965549d3c20c126"
	    },
	    {
	      "path": "models/medium/best_model.zip",
	      "bytes": 162116,
	      "sha256": "31d26001f967b7d221af016ec1a4c5b1a33f32b71630cb2eea3bf9c8a2e59956"
	    },
	    {
	      "path": "models/hard/ppo_hard_final.zip",
	      "bytes": 165087,
	      "sha256": "ac3b23fd8188713cd25bcbd1585cfc213d1a05b6254c56ba59ee7119de5896e1"
	    },
	    {
	      "path": "models/hard/best_model.zip",
	      "bytes": 165087,
	      "sha256": "ac3b23fd8188713cd25bcbd1585cfc213d1a05b6254c56ba59ee7119de5896e1"
	    },
	    {
	      "path": "reports/evaluation_report.json",
	      "bytes": 51143,
	      "sha256": "1ce452e2e57e67965f13337cb12a736cf658467e3db70ea20b52eaeddb67532a"
	    },
	    {
	      "path": "reports/evaluation_report.md",
	      "bytes": 2486,
	      "sha256": "2ed28899b86090b97bdacbae1273c4366202344e42ab8b93013cdc260db378bb"
	    },
	    {
	      "path": "reports/representative_hands.json",
	      "bytes": 64864,
	      "sha256": "33054b06fbfe819dbfb98faafa32534ff149bbd254f9250fb405786fbd2ecaf3"
	    }
	  ]
	}
	```
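
The `bytes` and `sha256` fields make it possible to check downloaded artifacts against the manifest. A minimal sketch using only the standard library follows. The function names are illustrative, and it assumes the artifacts sit under a local root with the same relative paths as in the manifest.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest: dict, root: Path) -> List[str]:
    """Return a list of problems; an empty list means every artifact matches."""
    problems = []
    for entry in manifest["artifacts"]:
        path = root / entry["path"]
        if not path.is_file():
            problems.append(f"missing: {entry['path']}")
        elif path.stat().st_size != entry["bytes"]:
            problems.append(f"size mismatch: {entry['path']}")
        elif sha256_of(path) != entry["sha256"]:
            problems.append(f"sha256 mismatch: {entry['path']}")
    return problems
```

The manifest above can be saved to a file, parsed with `json.loads`, and passed in together with the directory that holds `models/` and `reports/`. The cheap size check runs before the hash so that truncated downloads are reported without reading the whole file.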