# Balatro OCR
Fine-tuned PaddleOCR recognition model for extracting game state text from downscaled Balatro gameplay (360p).
The model is trained on UI text cropped from frames downscaled from 1920×1080 gameplay. It is intended for pipelines that reconstruct structured game state from video, typically for imitation learning or behavior cloning.
The repository includes:
- a trained PaddleOCR inference model
- predefined UI bounding boxes (`text_boxes.json`)
- a reference inference script
- an example gameplay frame
## Repository

```
balatro-ocr
├── example.png
├── inference
│   ├── inference.pdiparams
│   ├── inference.pdmodel
│   └── inference.yml
├── inference.py
├── text_boxes.json
└── README.md
```
## Example Frame

Example 360p gameplay frame used for OCR.
## How It Works

- gameplay frames are downscaled to 360p
- `text_boxes.json` defines fixed UI regions
- each region is cropped and passed through the OCR model
- predictions reconstruct the game state
```
video frame
    ↓
downscale to 360p
    ↓
crop regions (text_boxes.json)
    ↓
Balatro OCR
    ↓
structured game state
```
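The downscale and crop stages of this pipeline can be sketched with plain NumPy. The region name and coordinates below are made up for illustration; the real boxes live in `text_boxes.json`, and a real pipeline would resize with OpenCV rather than pixel striding.

```python
import numpy as np

def downscale_to_360p(frame):
    """Crude 1080p -> 360p downscale by taking every third pixel.
    (A production pipeline would use cv2.resize with INTER_AREA.)"""
    return frame[::3, ::3]

def crop_regions(frame, boxes):
    """Return {region_name: crop} for boxes given as [x, y, w, h]."""
    return {name: frame[y:y + h, x:x + w]
            for name, (x, y, w, h) in boxes.items()}

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)  # full-HD gameplay frame
small = downscale_to_360p(frame)                   # 360p working frame
boxes = {"round_score": [100, 50, 80, 20]}         # hypothetical coordinates
crops = crop_regions(small, boxes)
print(small.shape, crops["round_score"].shape)  # (360, 640, 3) (20, 80, 3)
```

Each crop is then passed to the recognition model, and the per-region predictions are assembled into the game state.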
The bounding boxes correspond to important UI elements such as:
- reroll price
- round score
- dollars
- hand size
- pack type
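The exact schema of `text_boxes.json` is not reproduced here; one plausible layout, sketched with made-up keys and coordinates, maps each UI element to an `[x, y, w, h]` box in the 360p frame:

```python
import json

# Hypothetical contents -- the real file may use a different schema.
example = """
{
  "reroll_price": [452, 210, 60, 18],
  "round_score": [90, 120, 110, 24],
  "dollars": [60, 260, 70, 22]
}
"""
boxes = json.loads(example)
print(sorted(boxes))  # ['dollars', 'reroll_price', 'round_score']
```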
## Installation

```shell
pip install paddlepaddle paddleocr opencv-python numpy
```
The inference script uses internal PaddleOCR utilities, so clone the PaddleOCR repository:

```shell
git clone https://github.com/PaddlePaddle/PaddleOCR
```

Run the script from inside the PaddleOCR directory, or make sure the clone is on your `PYTHONPATH`.
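If you would rather not set the environment variable, one option is to prepend the clone to the import path at the top of `inference.py` (the path below is a placeholder for wherever you cloned PaddleOCR):

```python
import sys

# Placeholder path -- point this at your local PaddleOCR checkout.
sys.path.insert(0, "/path/to/PaddleOCR")
```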
## Usage

Run OCR on the example frame:

```shell
python inference.py
```
