# Learnings - Architecture
- Keep behavior-shaping reward logic inside `SQLEnvTRL` as additive, trajectory-level state (`reward`, `_repeat_count`). Tool method signatures and TRL environment interfaces then stay stable while the internal reward semantics evolve. *(F015)*
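
A minimal sketch of this pattern, not the project's actual implementation. Only the class name `SQLEnvTRL` and the attributes `reward` and `_repeat_count` come from the note above. The `query` tool method, the `reset` method, the `_last_sql` helper, and the penalty value are hypothetical and exist only to illustrate the idea.

```python
class SQLEnvTRL:
    """Illustrative stand-in: reward shaping lives in instance state only."""

    REPEAT_PENALTY = 0.1  # hypothetical value, chosen for the example

    def __init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        # Trajectory-level state, cleared at the start of each episode.
        self.reward = 0.0
        self._repeat_count = 0
        self._last_sql = None  # hypothetical helper for detecting repeats

    def query(self, sql: str) -> str:
        """Hypothetical tool method. Its signature carries no reward arguments."""
        if sql == self._last_sql:
            # Consecutive repeats accumulate a growing penalty internally.
            self._repeat_count += 1
            self.reward -= self.REPEAT_PENALTY * self._repeat_count
        else:
            self._repeat_count = 0
        self._last_sql = sql
        # A real environment would execute the SQL; this sketch just echoes it.
        return f"executed: {sql}"


env = SQLEnvTRL()
env.query("SELECT 1")
env.query("SELECT 1")  # repeated call is penalized inside the environment
print(env.reward, env._repeat_count)
```

Because the penalty is applied to `self.reward` inside the method body, the shaping rule can change (for example, a different penalty schedule) without touching what callers of `query` see.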