# ML-3m-trader: XAUUSDc 3-Minute Timeframe ML Trading System

End-to-end machine learning pipeline for trading XAUUSDc (Gold) on the 3-minute timeframe. Uses MetaTrader 5 for data acquisition, LightGBM for classification, and a vectorized backtesting engine with realistic execution modeling.

## User Review Required

> [!IMPORTANT]
> **ML Framework Choice: LightGBM.** LightGBM is well suited to this task because:
> - Tabular classification (Buy/Sell/Hold/DoNothing) is LightGBM's strongest domain
> - Training is fast even on CPU (the i5-7200U will handle it fine)
> - Low memory footprint (well within 12 GB RAM)
> - No GPU required (your MX110 is not needed)
> - Gradient-boosted trees typically match or outperform deep learning on structured/tabular data
>
> **This will run entirely on your local machine. No Google Colab needed.**

> [!WARNING]
> **MetaTrader 5 Python API** only works on Windows (which you have). MT5 must be open and logged in when running the data fetch script. The `MetaTrader5` pip package handles communication.

> [!NOTE]
> **VIX Feature**: Since the CBOE VIX index is not directly available from MT5, the system will compute a **synthetic VIX proxy** using a rolling standard deviation of returns (realized volatility), a common stand-in when no implied-volatility index is available. If you want the actual VIX, we would need a separate data source.

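
The synthetic VIX proxy described in the note above can be sketched as follows. This is a minimal illustration, not the final `features.py` code: the function name `synthetic_vix` is hypothetical, the 20-bar window is the value listed for this feature in the plan, and no annualization or scaling is assumed. The first `window` values are NaN because the first log-return is undefined and the rolling window needs a full set of observations.

```python
import numpy as np
import pandas as pd


def synthetic_vix(close: pd.Series, window: int = 20) -> pd.Series:
    """Rolling standard deviation of log-returns (realized volatility).

    Hypothetical helper: the name and default window are assumptions
    made for illustration only.
    """
    # Log-return of each bar relative to the previous bar; first value is NaN.
    log_returns = np.log(close / close.shift(1))
    # Sample standard deviation over the trailing `window` returns.
    return log_returns.rolling(window).std()


if __name__ == "__main__":
    # Tiny usage example on made-up prices.
    prices = pd.Series(np.linspace(2000.0, 2010.0, 40))
    print(synthetic_vix(prices).tail())
```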

---

## Proposed Changes

### Project Structure

```
ML-3m-trader/
├── config.py          # All configuration constants
├── data_fetcher.py    # MT5 data acquisition
├── features.py        # Technical indicator computation
├── labeler.py         # Trade label generation (Buy/Sell/Hold/DoNothing)
├── model.py           # LightGBM training, prediction, persistence
├── backtester.py      # Vectorized backtesting engine
├── metrics.py         # Performance evaluation
├── main.py            # CLI entry point
├── requirements.txt
├── LICENSE
├── README.md
├── GUIDE.md           # Step-by-step usage guide with tables
└── .gitignore
```

---

### Configuration

#### [NEW] [config.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/config.py)

Central configuration file containing all tunable parameters:

- `SYMBOL = "XAUUSDc"`, `TIMEFRAME = mt5.TIMEFRAME_M3`
- Feature list, lookback periods for SMA (14, 50), VROC (14), ADX (14), Momentum SI (10)
- Risk/reward ratio = 1.0, default bet percentage logic
- Slippage range (0–2 units), spread filter (`stoploss_size >= spread * 10`)
- Train/test split ratio, model hyperparameters
- Starting equity/balance

---

### Data Acquisition

#### [NEW] [data_fetcher.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/data_fetcher.py)

- Connects to the MT5 terminal via the `MetaTrader5` Python package
- Fetches one year of 3-minute OHLCV bars for XAUUSDc
- Returns a `pandas.DataFrame` with columns: `time, open, high, low, close, volume, spread`
- Saves raw data to `data/raw_xauusdc_3m.csv` for reproducibility
- Handles MT5 connection errors gracefully

---

### Feature Engineering

#### [NEW] [features.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/features.py)

Computes all required technical indicators using pure NumPy/Pandas (no TA-Lib dependency):

| Feature | Method |
|---------|--------|
| SMA | Simple Moving Average (14-period) |
| Double Moving Average | SMA(14) and SMA(50), plus crossover signal |
| VROC | Volume Rate of Change (14-period) |
| Synthetic VIX | Rolling std of log-returns (20-period) as volatility proxy |
| Momentum Strength Index | Custom momentum oscillator (10-period, 0–100 scale) |
| ADX | Average Directional Index (14-period) via Wilder's smoothing |
| Time features | Hour-of-day, minute-of-hour, day-of-week (cyclically encoded) |

All computations are vectorized with NumPy for speed. Rows with NaN values from the lookback warm-up periods are dropped.

---

### Labeling Engine

#### [NEW] [labeler.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/labeler.py)

Generates ground-truth labels for supervised learning:

1. For each bar, compute a potential **Buy** and **Sell** trade:
   - **Buy**: entry at `close`, SL below recent swing low (ATR-based), TP = entry + (entry - SL) (1:1 RR)
   - **Sell**: entry at `close`, SL above recent swing high (ATR-based), TP = entry - (SL - entry) (1:1 RR)
2. Walk forward through subsequent bars to determine the outcome (TP hit, SL hit, or neither within N bars)
3. Apply the **spread filter**: if `SL_distance < spread * 10`, label = `DO_NOTHING`
4. Final labels: `BUY_WIN`, `BUY_LOSS`, `SELL_WIN`, `SELL_LOSS`, `HOLD`, `DO_NOTHING` → simplified to 4-class: `BUY (1)`, `SELL (2)`, `HOLD (3)`, `DO_NOTHING (0)`
5. Only winning setups are labeled as BUY/SELL; losing setups become HOLD

---

### ML Model

#### [NEW] [model.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/model.py)

- **LightGBM** multi-class classifier (4 classes)
- Hyperparameters tuned for tabular financial data:
  - `num_leaves=63`, `max_depth=8`, `learning_rate=0.05`, `n_estimators=500`
  - `subsample=0.8`, `colsample_bytree=0.8`, `min_child_samples=20` (`subsample` only takes effect when `subsample_freq` is greater than 0)
  - `class_weight='balanced'` to handle label imbalance
- Train/validation split: 80/20 chronological (no shuffling, because the data is a time series)
- Feature importance output
- Model persistence via `joblib` (save/load `.pkl`)
- Early stopping on the validation set

---

### Backtesting Engine

#### [NEW] [backtester.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/backtester.py)

Vectorized backtesting with realistic execution:

- Takes model predictions and raw price data
- **Position sizing**: bet % of current balance, accounting for full SL distance
  - `lot_value = balance * bet_pct / sl_distance`
- **Random slippage**: uniform 0–2 XAUUSDc units applied to entry price
- **Spread filter**: skip trade if `sl_distance < spread * 10`
- **1:1 Risk-Reward**: TP distance = SL distance
- Walk forward bar-by-bar on the test set, tracking the equity curve
- No trade limit: takes every valid signal
- Records all trades with entry/exit prices, PnL, timestamps

---

### Metrics & Evaluation

#### [NEW] [metrics.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/metrics.py)

| Metric | Description |
|--------|-------------|
| Win Rate | % of trades closed at TP |
| Average Win % | Mean profit per winning trade as % of balance |
| Average Loss % | Mean loss per losing trade as % of balance |
| Sharpe Ratio | Annualized risk-adjusted return |
| Sortino Ratio | Downside-risk-adjusted return |
| Max Drawdown | Largest peak-to-trough equity decline |
| Profit Factor | Gross profit / Gross loss |
| Start Equity | Initial balance |
| End Equity | Final balance after all trades |
| Total Trades | Number of executed trades |
| Avg Trade Duration | Mean holding time in bars and minutes |
| Daily PnL Stats | Mean, std, min, max of daily PnL |
| Calmar Ratio | Annualized return / Max Drawdown |
| Expectancy | Average PnL per trade |

Outputs a formatted console report and saves it to `results/report.txt`.

---

### CLI Entry Point

#### [NEW] [main.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/main.py)

Unified CLI with subcommands:

```
python main.py fetch       # Fetch 1-year data from MT5
python main.py train       # Engineer features, label, train model
python main.py backtest    # Run backtest on test set
python main.py evaluate    # Print metrics report
python main.py run         # Full pipeline: fetch → train → backtest → evaluate
```

Uses `argparse` with clear help text.

---

### Project Files

#### [NEW] [requirements.txt](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/requirements.txt)

```
MetaTrader5>=5.0.45
lightgbm>=4.0.0
pandas>=2.0.0
numpy>=1.24.0
scikit-learn>=1.3.0
joblib>=1.3.0
```

#### [NEW] [LICENSE](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/LICENSE)

MIT License, author: Rembrant Oyangoren Albeos, year: 2026.

#### [NEW] [README.md](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/README.md)

Professional README with badges (Python, License, LightGBM), project description, features list, quick start, architecture overview, and configuration reference. No emojis.

#### [NEW] [GUIDE.md](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/GUIDE.md)

Step-by-step usage guide with tables for all commands, parameters, and expected outputs.

#### [NEW] [.gitignore](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/.gitignore)

Standard Python gitignore plus `data/`, `results/`, `models/`, `*.pkl`.

---

## Verification Plan

### Automated Tests

1. **Syntax validation**: run `python -m py_compile` on every `.py` file to confirm there are no syntax errors
2. **Import validation**: run `python -c "import config; import features; import labeler; import model; import backtester; import metrics"` to confirm all modules load correctly
3. **Dry-run test**: run `python main.py --help` to confirm the CLI is functional

### Manual Verification

1. **User runs `python main.py fetch`** with MT5 open and logged in, and confirms the data CSV is created in `data/`
2. **User runs `python main.py run`** for the full pipeline and reviews the metrics report output
3. **User inspects `results/report.txt`** for the performance metrics
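
The syntax-validation step listed under Automated Tests could be scripted roughly as shown here. This is a minimal sketch using only the standard-library `py_compile` module; the helper name `find_syntax_errors` is hypothetical and is not one of the planned project modules. It compiles every `.py` file under a directory and collects the error message for each file that fails.

```python
import py_compile
from pathlib import Path


def find_syntax_errors(project_dir: str) -> dict[str, str]:
    """Compile every .py file under project_dir.

    Returns a mapping of file path to error message for files that fail
    to compile. Hypothetical helper, shown for illustration only.
    """
    failures: dict[str, str] = {}
    for path in sorted(Path(project_dir).rglob("*.py")):
        try:
            # doraise=True raises PyCompileError instead of printing to stderr.
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as exc:
            failures[str(path)] = exc.msg
    return failures
```

Calling `find_syntax_errors(".")` from the project root returns an empty dictionary when every file compiles cleanly.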