| # ML-3m-trader: XAUUSDc 3-Minute Timeframe ML Trading System | |
| End-to-end machine learning pipeline for trading XAUUSDc (Gold) on the 3-minute timeframe. Uses MetaTrader 5 for data acquisition, LightGBM for classification, and a vectorized backtesting engine with realistic execution modeling. | |
| ## User Review Required | |
| > [!IMPORTANT] | |
| > **ML Framework Choice: LightGBM.** LightGBM is well suited to this task because: | |
| > - Tabular classification (Buy/Sell/Hold/DoNothing) is LightGBM's strongest domain | |
| > - Extremely fast training, even on CPU (i5-7200U will handle it fine) | |
| > - Low memory footprint (well within 12 GB RAM) | |
| > - No GPU required (your MX110 is not needed) | |
| > - Gradient-boosted trees typically match or outperform deep learning on structured/tabular data in published benchmarks | |
| > | |
| > **This will run entirely on your local machine. No Google Colab needed.** | |
| > [!WARNING] | |
| > **MetaTrader 5 Python API** only works on Windows (which you have). MT5 must be open and logged in when running the data fetch script. The `MetaTrader5` pip package handles communication. | |
| > [!NOTE] | |
| > **VIX Feature**: The CBOE VIX index is not directly available from MT5, so the system will compute a **synthetic VIX proxy**: a rolling standard deviation of returns (realized volatility). This is a common stand-in when no implied-volatility index is available, but it measures realized rather than implied volatility. If you want the actual VIX, we would need a separate data source. | |
| --- | |
| ## Proposed Changes | |
| ### Project Structure | |
| ``` | |
| ML-3m-trader/ | |
| ├── config.py          # All configuration constants | |
| ├── data_fetcher.py    # MT5 data acquisition | |
| ├── features.py        # Technical indicator computation | |
| ├── labeler.py         # Trade label generation (Buy/Sell/Hold/DoNothing) | |
| ├── model.py           # LightGBM training, prediction, persistence | |
| ├── backtester.py      # Vectorized backtesting engine | |
| ├── metrics.py         # Performance evaluation | |
| ├── main.py            # CLI entry point | |
| ├── requirements.txt | |
| ├── LICENSE | |
| ├── README.md | |
| ├── GUIDE.md           # Step-by-step usage guide with tables | |
| └── .gitignore | |
| ``` | |
| --- | |
| ### Configuration | |
| #### [NEW] [config.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/config.py) | |
| Central configuration file containing all tunable parameters: | |
| - `SYMBOL = "XAUUSDc"`, `TIMEFRAME = mt5.TIMEFRAME_M3` | |
| - Feature list, lookback periods for SMA (14, 50), VROC (14), ADX (14), Momentum SI (10) | |
| - Risk/reward ratio = 1.0, default bet percentage logic | |
| - Slippage range (0 to 2 units), spread filter (`stoploss_size >= spread * 10`) | |
| - Train/test split ratio, model hyperparameters | |
| - Starting equity/balance | |
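As a rough illustration of how these settings might be laid out, here is a minimal sketch of `config.py`. The constant names other than `SYMBOL`, and the values of `BET_PCT` and `STARTING_BALANCE`, are placeholders that this plan does not specify. The timeframe is stored by name so the sketch runs without MT5 installed; the plan itself uses `mt5.TIMEFRAME_M3` directly.

```python
# config.py (illustrative sketch; names and placeholder values are assumptions)
SYMBOL = "XAUUSDc"
TIMEFRAME_NAME = "TIMEFRAME_M3"  # could be resolved as getattr(mt5, TIMEFRAME_NAME)

# Indicator lookback periods, as listed in the plan
SMA_FAST, SMA_SLOW = 14, 50
VROC_PERIOD = 14
ADX_PERIOD = 14
MOMENTUM_PERIOD = 10

# Execution model
RISK_REWARD = 1.0        # TP distance equals SL distance
MAX_SLIPPAGE = 2.0       # upper bound of the uniform slippage range
SPREAD_MULTIPLIER = 10   # a trade needs stoploss_size >= spread * 10
TRAIN_RATIO = 0.8        # chronological train/validation split

# Placeholders: the plan does not state these values
BET_PCT = 0.01
STARTING_BALANCE = 10_000.0


def passes_spread_filter(stoploss_size: float, spread: float) -> bool:
    """Return True when the stop-loss is wide enough relative to the spread."""
    return stoploss_size >= spread * SPREAD_MULTIPLIER
```

Keeping the spread rule in one helper would let the labeler and the backtester apply an identical filter.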
| --- | |
| ### Data Acquisition | |
| #### [NEW] [data_fetcher.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/data_fetcher.py) | |
| - Connects to MT5 terminal via `MetaTrader5` Python package | |
| - Fetches one year of 3-minute OHLCV bars for XAUUSDc | |
| - Returns a `pandas.DataFrame` with columns: `time, open, high, low, close, volume, spread` | |
| - Saves raw data to `data/raw_xauusdc_3m.csv` for reproducibility | |
| - Handles MT5 connection errors gracefully | |
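The fetch step could look roughly like the sketch below. It uses `initialize`, `copy_rates_range`, `last_error`, and `shutdown` from the `MetaTrader5` package; the function name `fetch_rates` and its parameters are hypothetical. The MT5 module is passed in as an argument so that the function can be exercised without a running terminal.

```python
from datetime import datetime, timedelta, timezone


def fetch_rates(mt5, symbol="XAUUSDc", days=365, now=None):
    """Fetch `days` of 3-minute bars through the given MetaTrader5 module.

    `mt5` is expected to be the imported MetaTrader5 package (hypothetical
    design choice: passing it in keeps this function testable with a stub).
    """
    if not mt5.initialize():
        # Terminal not running or not logged in
        raise ConnectionError(f"MT5 initialize() failed: {mt5.last_error()}")
    try:
        end = now or datetime.now(timezone.utc)
        start = end - timedelta(days=days)
        rates = mt5.copy_rates_range(symbol, mt5.TIMEFRAME_M3, start, end)
        if rates is None or len(rates) == 0:
            raise RuntimeError(f"No bars returned for {symbol}: {mt5.last_error()}")
        return rates
    finally:
        # Always release the terminal connection once it has been opened
        mt5.shutdown()
```

With the real package, the result is a NumPy structured array that can be wrapped with `pandas.DataFrame(rates)`. MT5 reports `time` in seconds since the epoch and names its volume field `tick_volume`, so a conversion and rename would be needed to match the column list above.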
| --- | |
| ### Feature Engineering | |
| #### [NEW] [features.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/features.py) | |
| Computes all required technical indicators using pure NumPy/Pandas (no TA-Lib dependency): | |
| | Feature | Method | | |
| |---------|--------| | |
| | SMA | Simple Moving Average (14-period) | | |
| | Double Moving Average | SMA(14) and SMA(50), plus crossover signal | | |
| | VROC | Volume Rate of Change (14-period) | | |
| | Synthetic VIX | Rolling std of log-returns (20-period) as volatility proxy | | |
| | Momentum Strength Index | Custom momentum oscillator (10-period, 0 to 100 scale) | |
| | ADX | Average Directional Index (14-period) via Wilder's smoothing | | |
| | Time features | Hour-of-day, minute-of-hour, day-of-week (cyclical encoded) | | |
| All computations are vectorized with NumPy for maximum speed. NaN rows from lookback periods are dropped. | |
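A few of the simpler features in the table can be sketched with NumPy alone. The function names below are hypothetical, the inputs are assumed to be at least as long as the lookback period, and ADX and the custom momentum oscillator are left out for brevity.

```python
import numpy as np


def sma(x, n):
    """Simple moving average; the first n-1 values are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)
    out[n - 1:] = np.convolve(x, np.ones(n) / n, mode="valid")
    return out


def vroc(volume, n=14):
    """Volume rate of change in percent over n bars (zero volume gives inf)."""
    v = np.asarray(volume, dtype=float)
    out = np.full(len(v), np.nan)
    out[n:] = (v[n:] - v[:-n]) / v[:-n] * 100.0
    return out


def rolling_vol(close, n=20):
    """Rolling sample std of log-returns, used as the synthetic VIX proxy."""
    close = np.asarray(close, dtype=float)
    returns = np.diff(np.log(close))
    out = np.full(len(close), np.nan)
    windows = np.lib.stride_tricks.sliding_window_view(returns, n)
    out[n:] = windows.std(axis=1, ddof=1)
    return out


def cyclical(values, period):
    """Encode a periodic value (e.g. hour with period 24) as sine and cosine."""
    angle = 2 * np.pi * np.asarray(values, dtype=float) / period
    return np.sin(angle), np.cos(angle)
```

The cyclical encoding places hour 23 next to hour 0, which a raw integer hour would not. The leading NaN values are the lookback rows that the text above says are dropped.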
| --- | |
| ### Labeling Engine | |
| #### [NEW] [labeler.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/labeler.py) | |
| Generates ground-truth labels for supervised learning: | |
| 1. For each bar, compute a potential **Buy** and **Sell** trade: | |
| - **Buy**: entry at `close`, SL below recent swing low (ATR-based), TP = entry + (entry - SL) (1:1 RR) | |
| - **Sell**: entry at `close`, SL above recent swing high (ATR-based), TP = entry - (SL - entry) (1:1 RR) | |
| 2. Walk forward through subsequent bars to determine outcome (TP hit, SL hit, or neither within N bars) | |
| 3. Apply **spread filter**: if `SL_distance < spread * 10`, label = `DO_NOTHING` | |
| 4. Intermediate outcomes are `BUY_WIN`, `BUY_LOSS`, `SELL_WIN`, `SELL_LOSS`, `HOLD`, and `DO_NOTHING`; these are simplified to four classes: `DO_NOTHING (0)`, `BUY (1)`, `SELL (2)`, `HOLD (3)` | |
| 5. Only winning setups are labeled as BUY/SELL; losing setups become HOLD | |
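The labeling steps above can be sketched in plain Python as follows. This is a simplified illustration, not the planned implementation: it assumes one symmetric stop distance `sl_dist` for both directions (the plan derives separate swing-based stops), a hypothetical horizon of 20 bars, a spread already expressed in price units, and a conservative rule that the stop-loss wins when both levels are touched in the same bar.

```python
DO_NOTHING, BUY, SELL, HOLD = 0, 1, 2, 3


def first_hit(highs, lows, start, tp, sl, horizon, long):
    """Return 'tp', 'sl', or None for the first level touched after bar `start`."""
    for j in range(start + 1, min(start + 1 + horizon, len(highs))):
        if long:
            hit_sl, hit_tp = lows[j] <= sl, highs[j] >= tp
        else:
            hit_sl, hit_tp = highs[j] >= sl, lows[j] <= tp
        if hit_sl:  # assumption: SL takes priority within a single bar
            return "sl"
        if hit_tp:
            return "tp"
    return None


def label_bar(i, closes, highs, lows, sl_dist, spread, horizon=20, spread_mult=10):
    """Assign one of the four classes to bar `i`."""
    if sl_dist < spread * spread_mult:
        return DO_NOTHING  # spread filter: stop too tight relative to spread
    entry = closes[i]
    # 1:1 risk-reward: TP distance equals SL distance
    if first_hit(highs, lows, i, entry + sl_dist, entry - sl_dist, horizon, True) == "tp":
        return BUY
    if first_hit(highs, lows, i, entry - sl_dist, entry + sl_dist, horizon, False) == "tp":
        return SELL
    return HOLD  # losing or unresolved setups
```

Because labels depend on bars after `i`, they are only usable as training targets and must never be fed back in as features.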
| --- | |
| ### ML Model | |
| #### [NEW] [model.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/model.py) | |
| - **LightGBM** multi-class classifier (4 classes) | |
| - Hyperparameters tuned for tabular financial data: | |
| - `num_leaves=63`, `max_depth=8`, `learning_rate=0.05`, `n_estimators=500` | |
| - `subsample=0.8` (LightGBM applies row subsampling only when `subsample_freq` is set above its default of 0), `colsample_bytree=0.8`, `min_child_samples=20` | |
| - `class_weight='balanced'` to handle label imbalance | |
| - Train/validation split: 80/20 chronological (no shuffling, because the data is a time series) | |
| - Feature importance output | |
| - Model persistence via `joblib` (save/load `.pkl`) | |
| - Early stopping on validation set | |
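A minimal sketch of the training step under these settings is shown below. The function names, the `models/lgbm.pkl` path, the `stopping_rounds=50` value, and `subsample_freq=1` are assumptions that do not come from this plan. `lightgbm` and `joblib` are imported inside `train` so that the split helper can be used on its own.

```python
from pathlib import Path

PARAMS = dict(
    num_leaves=63,
    max_depth=8,
    learning_rate=0.05,
    n_estimators=500,
    subsample=0.8,
    subsample_freq=1,  # assumption: not in the plan; enables row subsampling
    colsample_bytree=0.8,
    min_child_samples=20,
    class_weight="balanced",
)


def chronological_split(rows, train_ratio=0.8):
    """Split without shuffling: the first 80% trains, the last 20% validates."""
    cut = int(len(rows) * train_ratio)
    return rows[:cut], rows[cut:]


def train(X, y, model_path="models/lgbm.pkl"):
    """Fit a LightGBM classifier with early stopping and persist it."""
    import joblib
    import lightgbm as lgb

    X_tr, X_va = chronological_split(X)
    y_tr, y_va = chronological_split(y)
    model = lgb.LGBMClassifier(**PARAMS)
    model.fit(
        X_tr,
        y_tr,
        eval_set=[(X_va, y_va)],
        callbacks=[lgb.early_stopping(stopping_rounds=50)],
    )
    Path(model_path).parent.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, model_path)
    return model
```

Splitting by position rather than at random keeps every validation bar later in time than every training bar, which avoids leaking future information into training.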
| --- | |
| ### Backtesting Engine | |
| #### [NEW] [backtester.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/backtester.py) | |
| Vectorized backtesting with realistic execution: | |
| - Takes model predictions and raw price data | |
| - **Position sizing**: bet % of current balance, accounting for full SL distance | |
| - `lot_value = balance * bet_pct / sl_distance` | |
| - **Random slippage**: uniform 0 to 2 XAUUSDc units applied to entry price | |
| - **Spread filter**: skip trade if `sl_distance < spread * 10` | |
| - **1:1 Risk-Reward**: TP distance = SL distance | |
| - Walk forward bar-by-bar on test set, track equity curve | |
| - No trade limit: every valid signal is taken | |
| - Records all trades with entry/exit prices, PnL, timestamps | |
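The sizing and slippage rules above can be illustrated with a small sketch. It assumes that the SL and TP levels are anchored to the signal price and that slippage always worsens the fill; the plan does not settle either point, and the function names are hypothetical.

```python
import random


def position_size(balance, bet_pct, sl_distance):
    """Units sized so that a full stop-loss move costs bet_pct of balance."""
    return balance * bet_pct / sl_distance


def settle_trade(balance, sl_distance, won, bet_pct=0.01, max_slip=2.0, rng=random):
    """Return the balance after one 1:1 risk-reward trade with adverse slippage."""
    size = position_size(balance, bet_pct, sl_distance)
    slip = rng.uniform(0.0, max_slip)  # uniform slippage on the entry fill
    # A worse fill shrinks the win and enlarges the loss by the same amount
    move = (sl_distance - slip) if won else -(sl_distance + slip)
    return balance + size * move
```

For example, with a balance of 10,000, a 1% bet, and a 5-unit stop, the position is 20 units, so a win without slippage adds 100. Under these assumptions a losing trade can cost more than the nominal bet percentage, because slippage is added to the stop distance.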
| --- | |
| ### Metrics & Evaluation | |
| #### [NEW] [metrics.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/metrics.py) | |
| | Metric | Description | | |
| |--------|-------------| | |
| | Win Rate | % of trades closed at TP | | |
| | Average Win % | Mean profit per winning trade as % of balance | | |
| | Average Loss % | Mean loss per losing trade as % of balance | | |
| | Sharpe Ratio | Annualized risk-adjusted return | | |
| | Sortino Ratio | Downside-risk-adjusted return | | |
| | Max Drawdown | Largest peak-to-trough equity decline | | |
| | Profit Factor | Gross profit / Gross loss | | |
| | Start Equity | Initial balance | | |
| | End Equity | Final balance after all trades | | |
| | Total Trades | Number of executed trades | | |
| | Avg Trade Duration | Mean holding time in bars/minutes | | |
| | Daily PnL Stats | Mean, std, min, and max of daily PnL | |
| | Calmar Ratio | Annualized return / Max Drawdown | | |
| | Expectancy | Average PnL per trade | | |
| Outputs a formatted console report and saves to `results/report.txt`. | |
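Several of the metrics in the table reduce to a few lines each. The sketch below is one possible reading: it treats any trade with positive PnL as a win (the table defines a win as a close at TP), and it annualizes the Sharpe ratio over daily returns with 252 periods per year, which is an assumption.

```python
import math
from statistics import mean, stdev


def win_rate(pnls):
    """Fraction of trades with positive PnL."""
    return sum(p > 0 for p in pnls) / len(pnls)


def profit_factor(pnls):
    """Gross profit divided by gross loss (infinite when there are no losses)."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    return math.inf if gross_loss == 0 else gross_profit / gross_loss


def expectancy(pnls):
    """Average PnL per trade."""
    return mean(pnls)


def max_drawdown(equity):
    """Largest peak-to-trough decline as a fraction of the running peak."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def sharpe(daily_returns, periods_per_year=252):
    """Annualized mean over std of daily returns (risk-free rate taken as 0)."""
    sd = stdev(daily_returns)
    return 0.0 if sd == 0 else mean(daily_returns) / sd * math.sqrt(periods_per_year)
```

For example, the trade PnLs `[100, -50, 100, -50]` give a win rate of 0.5, a profit factor of 2.0, and an expectancy of 25.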
| --- | |
| ### CLI Entry Point | |
| #### [NEW] [main.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/main.py) | |
| Unified CLI with subcommands: | |
| ``` | |
| python main.py fetch # Fetch 1-year data from MT5 | |
| python main.py train # Engineer features, label, train model | |
| python main.py backtest # Run backtest on test set | |
| python main.py evaluate # Print metrics report | |
| python main.py run # Full pipeline: fetch → train → backtest → evaluate | |
| ``` | |
| Uses `argparse` with clear help text. | |
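One way to wire these subcommands with `argparse` is sketched below. The help strings follow the command list above; the `build_parser` and `main` names are hypothetical, and the dispatch is reduced to returning the chosen command rather than calling the pipeline modules.

```python
import argparse

COMMANDS = {
    "fetch": "Fetch 1-year data from MT5",
    "train": "Engineer features, label, train model",
    "backtest": "Run backtest on test set",
    "evaluate": "Print metrics report",
    "run": "Full pipeline: fetch, train, backtest, evaluate",
}


def build_parser():
    """Create the top-level parser with one subparser per pipeline stage."""
    parser = argparse.ArgumentParser(
        prog="main.py", description="XAUUSDc 3-minute ML trading pipeline"
    )
    sub = parser.add_subparsers(dest="command", required=True)
    for name, help_text in COMMANDS.items():
        sub.add_parser(name, help=help_text)
    return parser


def main(argv=None):
    """Parse arguments and return the selected subcommand name."""
    args = build_parser().parse_args(argv)
    return args.command
```

Setting `required=True` makes `python main.py` without a subcommand exit with a usage message instead of silently doing nothing.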
| --- | |
| ### Project Files | |
| #### [NEW] [requirements.txt](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/requirements.txt) | |
| ``` | |
| MetaTrader5>=5.0.45 | |
| lightgbm>=4.0.0 | |
| pandas>=2.0.0 | |
| numpy>=1.24.0 | |
| scikit-learn>=1.3.0 | |
| joblib>=1.3.0 | |
| ``` | |
| #### [NEW] [LICENSE](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/LICENSE) | |
| MIT License, author: Rembrant Oyangoren Albeos, year: 2026. | |
| #### [NEW] [README.md](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/README.md) | |
| Professional README with badges (Python, License, LightGBM), project description, features list, quick start, architecture overview, and configuration reference. No emojis. | |
| #### [NEW] [GUIDE.md](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/GUIDE.md) | |
| Step-by-step usage guide with tables for all commands, parameters, and expected outputs. | |
| #### [NEW] [.gitignore](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/.gitignore) | |
| Standard Python gitignore plus `data/`, `results/`, `models/`, `*.pkl`. | |
| --- | |
| ## Verification Plan | |
| ### Automated Tests | |
| 1. **Syntax validation**: run `python -m py_compile <file>` on every `.py` file to confirm no syntax errors | |
| 2. **Import validation**: run `python -c "import config; import features; import labeler; import model; import backtester; import metrics"` to confirm all modules load correctly | |
| 3. **Dry-run test**: run `python main.py --help` to confirm CLI is functional | |
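The syntax check in step 1 could be automated across the whole project with a short helper built on the standard `py_compile` module. The function name `check_syntax` is hypothetical.

```python
import py_compile
from pathlib import Path


def check_syntax(root="."):
    """Return {path: error message} for every .py file under root that fails to compile."""
    failures = {}
    for path in sorted(Path(root).rglob("*.py")):
        try:
            # doraise=True turns compile errors into exceptions instead of stderr output
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as exc:
            failures[str(path)] = exc.msg
    return failures
```

An empty result means every file compiled; any entries list the offending file together with the compiler's message.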
| ### Manual Verification | |
| 1. **User runs `python main.py fetch`** with MT5 open and logged in, confirms data CSV is created in `data/` | |
| 2. **User runs `python main.py run`** for the full pipeline, reviews the metrics report output | |
| 3. **User inspects `results/report.txt`** for the performance metrics | |