Bernoulli/MasyuLLMAgent

Text Generation

Bernoulli committed on Oct 2, 2025

Commit af2155d (verified) · 1 Parent(s): 39c0f1e

Update README.md

Files changed (1)

README.md +5 -1

@@ -8,4 +8,8 @@ pipeline_tag: text-generation
 tags:
 - agent
 ---
-**LLM agent** that learns to solve the logic puzzle **Masyu** (Necklace) using **Reinforcement Learning** with the **GRPO algorithm**.
+This repository contains the weights of an **LLM agent** trained to solve the logic puzzle **Masyu** (Necklace) using **Reinforcement Learning** with the **GRPO algorithm**.
+![alt text](./masyu_bars.png)
+These are my results from training Qwen/Qwen2-1.5B-Instruct. Because of limited compute, performance improved significantly mainly on the first four difficulty levels. More extensive training (more steps, a larger base model, or a higher `num_generations`) would likely be needed to improve results on more complex puzzles.
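
For context on why `num_generations` matters: GRPO samples several completions per prompt and scores each one relative to the others in its group, so a larger group tends to give a less noisy learning signal. The following is a minimal sketch of that group-relative normalization, not the training code from this repository. The function name, the `eps` value, the use of the population standard deviation, and the reward values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group of completions for the same prompt.

    Each completion's advantage is its reward minus the group mean,
    divided by the group standard deviation (plus eps for stability).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for num_generations = 4 attempts at one puzzle:
# 1.0 for a solved puzzle, 0.0 for a failed one.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

If every completion in a group gets the same reward (for example, all attempts fail on a hard puzzle), all advantages are zero and that prompt contributes no gradient, which is one plausible reason harder difficulty levels improve slowly with small groups.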