refactor: normalize reward range to [0.01, 0.99] and standardize episode scoring key to score across environment and inference logic da09194
Gamucopia-Creatives committed on
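
The change described in the commit message could look roughly like the sketch below. The diff itself is not shown here, so this is an illustration under assumptions, not the committed code: it assumes "normalize" means clamping (a linear rescale is equally possible), and the names REWARD_MIN, REWARD_MAX, normalize_reward, and episode_result are hypothetical.

```python
# Hypothetical sketch; names and the clamping behaviour are assumptions,
# not taken from the actual commit.
REWARD_MIN = 0.01
REWARD_MAX = 0.99


def normalize_reward(raw: float) -> float:
    """Clamp a raw reward into the closed range [REWARD_MIN, REWARD_MAX]."""
    return max(REWARD_MIN, min(REWARD_MAX, float(raw)))


def episode_result(raw_reward: float) -> dict:
    """Report the episode outcome under the single key "score"."""
    return {"score": normalize_reward(raw_reward)}
```

Using one helper for both the environment and the inference code would keep the two sides agreeing on the range and on the "score" key.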