ppo rnn with last-value bootstrap on episode end and 1k window length: 1) active return, 3e-4 pi lr; 2) active return with turnover penalty, 3e-4 pi lr; 3) sharpe-based reward 0a12bae bobotsalos committed on Mar 16
finish with ppo rnn with sharpe ratio reward base/cidl features 9ac6e20 bobotsalos committed on Mar 12
progress (1/5) train ppo rnn with sharpe ratio reward base/cidl features f166bce bobotsalos committed on Mar 12