Balatro OCR

Fine-tuned PaddleOCR recognition model for extracting game state text from downscaled Balatro gameplay (360p).

The model is trained on UI text cropped from frames downscaled from 1920×1080 gameplay. It is intended for pipelines that reconstruct structured game state from video, typically for imitation learning or behavior cloning.

The repository includes:

  • a trained PaddleOCR inference model
  • predefined UI bounding boxes (text_boxes.json)
  • a reference inference script
  • an example gameplay frame

Repository

balatro-ocr
├── example.png
├── inference
│   ├── inference.pdiparams
│   ├── inference.pdmodel
│   └── inference.yml
├── inference.py
├── text_boxes.json
└── README.md

Example Frame

Example 360p gameplay frame used for OCR.



How It Works

  1. Gameplay frames are downscaled to 360p.
  2. text_boxes.json defines fixed UI regions.
  3. Each region is cropped and passed through the OCR model.
  4. The per-region predictions are assembled into structured game state.

video frame
   ↓
downscale to 360p
   ↓
crop regions (text_boxes.json)
   ↓
Balatro OCR
   ↓
structured game state
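The crop step above can be sketched in plain NumPy. The box format used here (name mapped to `[x, y, w, h]` in 360p pixel coordinates) and the region names are illustrative assumptions, not the actual text_boxes.json schema:

```python
import numpy as np

# Hypothetical box format: name -> [x, y, w, h] in 360p pixel coordinates.
# The real schema and coordinates in text_boxes.json may differ.
EXAMPLE_BOXES = {
    "round_score": [210, 90, 80, 20],
    "dollars": [20, 250, 60, 22],
}

def crop_regions(frame, boxes):
    """Cut each fixed UI region out of a 360p frame (H x W x C array)."""
    crops = {}
    for name, (x, y, w, h) in boxes.items():
        crops[name] = frame[y:y + h, x:x + w]
    return crops

# Each crop would then be passed to the PaddleOCR recognition model.
frame = np.zeros((360, 640, 3), dtype=np.uint8)  # stand-in for a real frame
crops = crop_regions(frame, EXAMPLE_BOXES)
print({name: crop.shape for name, crop in crops.items()})
```

Because the UI layout is fixed, no text detection stage is needed; the same boxes apply to every frame.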

The bounding boxes correspond to important UI elements such as:

  • reroll price
  • round score
  • dollars
  • hand size
  • pack type
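A plausible shape for text_boxes.json is shown below purely as an illustration; the key names and coordinates are assumptions, not the actual file contents:

```json
{
  "reroll_price": [430, 150, 40, 18],
  "round_score": [210, 90, 80, 20],
  "dollars": [20, 250, 60, 22]
}
```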

Installation

pip install paddlepaddle paddleocr opencv-python numpy

The inference script uses internal PaddleOCR utilities, so clone the PaddleOCR repository:

git clone https://github.com/PaddlePaddle/PaddleOCR

Run the script from inside the PaddleOCR directory, or make sure the cloned repository is on your PYTHONPATH.
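If you prefer not to run from inside the clone, one option is to extend PYTHONPATH; the path below assumes PaddleOCR was cloned into the current directory:

```shell
# Assumes the PaddleOCR repo was cloned into the current directory
# (adjust the path if you cloned it elsewhere).
export PYTHONPATH="$PWD/PaddleOCR:$PYTHONPATH"
```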


Usage

Run OCR on the example frame:

python inference.py