---
license: mit
datasets:
- reeeemo/jigsaw_puzzle
pipeline_tag: reinforcement-learning
library_name: stable-baselines3
---
Reinforcement-learning model *(puzzler-v0)* that uses MaskablePPO to guide the assembly of a jigsaw puzzle (puzzle pieces with irregular, convex boundaries).

Code to train the model and create this custom environment can be found in the "How Puzzling!" [GitHub repo](https://github.com/reeeeemo/how-puzzling).

**Initialization Parameters:**
```python
def __init__(self, images, seg_model_path, max_steps=100, device="cpu"): ...
```
- *images*: **list[np.ndarray]**
  - List of images that **MUST BE THE SAME SIZE AND SAME ORIENTATION (3x3, 4x4 puzzles, etc.)**.
  - Each is randomly initialized on every `.reset()` of the environment.
- *seg_model_path*: **string**
  - Path to the [image segmentation weight folder](https://huggingface.co/reeeemo/puzzle-segment-model).
  - Any image segmentation model that segments puzzle pieces will work; however, they must be axis-aligned.
  - The segmentation model is used to initialize another custom model found in the same GitHub repo, [here](https://github.com/reeeeemo/how-puzzling/blob/main/model/model.py).
- *max_steps*: **int**
  - Maximum number of steps allowed.
  - Using MaskablePPO solves the issue of infinite actions; if you decide to use plain PPO instead, ensure `max_steps` is set to the maximum number of puzzle pieces.
- *device*: **string**
  - CPU or GPU usage.
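
Because `images` must all match, a quick check before building the environment can catch mismatches early. The helper below is a minimal sketch and not part of the repo: `check_same_shape` and `HasShape` are hypothetical names, and it assumes "same size" means identical array shape. It does not verify grid layout or orientation.

```python
from __future__ import annotations

from typing import Protocol, Sequence


class HasShape(Protocol):
    """Anything exposing a NumPy-style ``shape`` tuple (e.g. ``np.ndarray``)."""

    shape: tuple[int, ...]


def check_same_shape(images: Sequence[HasShape]) -> tuple[int, ...]:
    """Return the shape shared by all images; raise ValueError if any differ.

    Hypothetical helper: only array shapes are compared, so puzzle grid size
    (3x3, 4x4, ...) and orientation still have to be checked by hand.
    """
    if not images:
        raise ValueError("expected at least one image")
    expected = tuple(images[0].shape)
    for index, image in enumerate(images[1:], start=1):
        actual = tuple(image.shape)
        if actual != expected:
            raise ValueError(
                f"image {index} has shape {actual}, expected {expected}"
            )
    return expected
```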
Training logs can be found in [events.out.tfevents](./events.out.tfevents.0.500k_5_images_norm_rew); view them with `tensorboard --logdir .` after downloading the repo.

The environment was wrapped in `VecNormalize` from the `stable_baselines3` library during training; the saved normalization statistics can be loaded from [vec_normalize.pkl](./vec_normalize.pkl).
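
Restoring the normalization statistics for inference might look like the sketch below. It is written under assumptions: `make_env` (a zero-argument callable that builds one puzzle environment) and `model_path` are hypothetical, since this card does not name the environment class or the checkpoint file, and the environment is assumed to expose the `action_masks()` method that MaskablePPO expects.

```python
def load_for_inference(make_env, model_path, stats_path="vec_normalize.pkl"):
    """Rebuild the normalized environment and load a saved MaskablePPO policy.

    make_env:   hypothetical zero-argument callable returning one environment.
    model_path: hypothetical path to the saved policy archive.
    """
    # Imported lazily so the sketch can be imported without the RL stack.
    from sb3_contrib import MaskablePPO
    from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

    venv = DummyVecEnv([make_env])
    venv = VecNormalize.load(stats_path, venv)  # restore running statistics
    venv.training = False     # do not update the statistics at inference time
    venv.norm_reward = False  # report raw, unnormalized rewards
    model = MaskablePPO.load(model_path, env=venv)
    return model, venv


def run_episode(model, venv, max_steps=100):
    """Roll out one episode with action masking; return the summed reward."""
    from sb3_contrib.common.maskable.utils import get_action_masks

    obs = venv.reset()
    total_reward = 0.0
    for _step in range(max_steps):
        # Assumes the wrapped environment implements action_masks().
        masks = get_action_masks(venv)
        action, _state = model.predict(
            obs, action_masks=masks, deterministic=True
        )
        obs, rewards, dones, _infos = venv.step(action)
        total_reward += float(rewards[0])
        if dones[0]:
            break
    return total_reward
```

Setting `training = False` keeps the loaded statistics fixed, so evaluation observations are scaled the same way they were during training.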