"""
RL BTC v4: Offline Implicit Q-Learning for Bitcoin Trading.

Based on: "Offline Reinforcement Learning with Implicit Q-Learning"
(Kostrikov et al., 2021)
https://hf.co/papers/2110.06169

Key innovation: learns from logged historical data without environment
interaction. Uses an upper-expectile value function to estimate the value of
the best actions without ever explicitly querying out-of-distribution actions.
"""

from .constants import (
    DEFAULT_DATA_PATH,
    MARKET_FEATURE_COLUMNS,
    PORTFOLIO_FEATURE_COLUMNS,
    ACTIONS,
    N_ACTIONS,
    ACTION_INDEX_BY_NAME,
    STARTING_CASH,
    DRAWDOWN_LIMIT,
)
from .env import BTCTradingEnv
from .dataset import build_offline_rl_dataset, OfflineRLDataset
from .iql_trainer import IQLTrainer, IQLConfig