Spaces: VortexedSquirrel/tetris-env

OutOfMystic Claude Opus 4.6 committed on Mar 9

Commit 3a5b76e

1 Parent(s): 8251fe9

v0.5.0: reduce game_over to -50, disable height breach penalty

Reward signal was dominated by constant penalties (-500 game over + ~-350 height breach),
leaving only ~5% learnable signal. Now learnable components are ~80% of total reward.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
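
The percentages in the message can be read as a ratio of reward magnitudes. The sketch below is only a back-of-the-envelope check, assuming "learnable share" means learnable magnitude divided by learnable plus constant magnitude; the helper names `learnable_share` and `implied_learnable_abs` are hypothetical and do not exist in `game_engine.py`.

```python
def learnable_share(learnable_abs: float, constant_abs: float) -> float:
    """Fraction of total reward magnitude that the agent can influence."""
    total = learnable_abs + constant_abs
    return learnable_abs / total if total else 0.0


def implied_learnable_abs(share: float, constant_abs: float) -> float:
    """Invert learnable_share: the learnable magnitude a given share implies."""
    return share * constant_abs / (1.0 - share)


# Constant per-episode penalties quoted in the commit message (magnitudes).
old_constant = 500 + 350   # game over + approximate height breach total
new_constant = 50 + 0      # game over reduced, height breach disabled

# Learnable magnitude implied by each quoted share (assumed definition).
old_learnable = implied_learnable_abs(0.05, old_constant)  # roughly 45
new_learnable = implied_learnable_abs(0.80, new_constant)  # roughly 200

print(round(old_learnable, 1), round(new_learnable, 1))
print(round(learnable_share(old_learnable, new_constant), 2))
```

Under this definition the two quoted shares imply different learnable magnitudes, so the ~5% and ~80% figures presumably come from separately measured episodes rather than from one fixed episode re-scored with new constants.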

Files changed (1)

src/tetris_env/server/game_engine.py +3 -3

@@ -2,7 +2,7 @@
 Tetris Environment for OpenEnv.
 Full game logic with combo scoring reward system.
 """
-__version__ = "0.4.0"  # decaying height breach penalty, hole penalty only new, step -0.1
+__version__ = "0.5.0"  # game_over -50, height breach OFF, LR 1e-4
 import random
 import copy
@@ -40,9 +40,9 @@ LINE_REWARDS = {
 STEP_PENALTY = -0.1
 HOLE_PENALTY_MULT = -5
-GAME_OVER_PENALTY = -500
+GAME_OVER_PENALTY = -50
 HEIGHT_BREACH_THRESHOLD = 4
-HEIGHT_BREACH_PENALTY = -50
+HEIGHT_BREACH_PENALTY = 0  # disabled for initial training
 def rotate_cw(piece: list[list[int]]) -> list[list[int]]:
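
To illustrate why setting `HEIGHT_BREACH_PENALTY = 0` switches the term off without removing any code, here is a minimal sketch of how such constants might combine into a per-step penalty. The function `shaping_reward` and its parameters are assumptions made for illustration; the diff does not show the engine's actual reward formula, and line-clear rewards (`LINE_REWARDS`) are left out.

```python
# Constants as set in v0.5.0 of game_engine.py.
STEP_PENALTY = -0.1
HOLE_PENALTY_MULT = -5
GAME_OVER_PENALTY = -50
HEIGHT_BREACH_PENALTY = 0  # disabled for initial training


def shaping_reward(new_holes: int, breach_rows: int, game_over: bool) -> float:
    """Hypothetical penalty terms for one step; not the engine's real formula.

    new_holes:   holes created by this placement (assumed input)
    breach_rows: rows counted as a height breach (assumed input)
    """
    reward = STEP_PENALTY
    reward += HOLE_PENALTY_MULT * new_holes        # penalise only new holes
    reward += HEIGHT_BREACH_PENALTY * breach_rows  # contributes 0 while disabled
    if game_over:
        reward += GAME_OVER_PENALTY
    return reward


# With the breach penalty at 0, the breach count no longer affects the result.
print(shaping_reward(new_holes=1, breach_rows=0, game_over=False))
print(shaping_reward(new_holes=1, breach_rows=6, game_over=False))
```

Keeping the constant and zeroing it, rather than deleting the term, means the penalty can be restored later by changing one value.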