# ML-3m-trader: XAUUSDc 3-Minute Timeframe ML Trading System
End-to-end machine learning pipeline for trading XAUUSDc (Gold) on the 3-minute timeframe. Uses MetaTrader 5 for data acquisition, LightGBM for classification, and a vectorized backtesting engine with realistic execution modeling.
## User Review Required
> [!IMPORTANT]
> **ML Framework Choice: LightGBM.** LightGBM is the best-suited framework for this task because:
> - Tabular classification (Buy/Sell/Hold/DoNothing) is LightGBM's strongest domain
> - Extremely fast training, even on CPU (i5-7200U will handle it fine)
> - Low memory footprint (well within 12 GB RAM)
> - No GPU required (your MX110 is not needed)
> - Gradient-boosted trees typically match or outperform deep learning on structured/tabular data
>
> **This will run entirely on your local machine. No Google Colab needed.**
> [!WARNING]
> **MetaTrader 5 Python API** only works on Windows (which you have). MT5 must be open and logged in when running the data fetch script. The `MetaTrader5` pip package handles communication.
> [!NOTE]
> **VIX Feature**: The CBOE VIX index is not directly available from MT5, so the system will compute a **synthetic VIX proxy**: a rolling standard deviation of returns (realized volatility). This is a common substitute when an implied-volatility index is unavailable, but it measures the realized volatility of the traded symbol rather than the implied volatility of S&P 500 options that the actual VIX tracks. If you want the actual VIX, we would need a separate data source.
---
## Proposed Changes
### Project Structure
```
ML-3m-trader/
├── config.py          # All configuration constants
├── data_fetcher.py    # MT5 data acquisition
├── features.py        # Technical indicator computation
├── labeler.py         # Trade label generation (Buy/Sell/Hold/DoNothing)
├── model.py           # LightGBM training, prediction, persistence
├── backtester.py      # Vectorized backtesting engine
├── metrics.py         # Performance evaluation
├── main.py            # CLI entry point
├── requirements.txt
├── LICENSE
├── README.md
├── GUIDE.md           # Step-by-step usage guide with tables
└── .gitignore
```
---
### Configuration
#### [NEW] [config.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/config.py)
Central configuration file containing all tunable parameters:
- `SYMBOL = "XAUUSDc"`, `TIMEFRAME = mt5.TIMEFRAME_M3`
- Feature list, lookback periods for SMA (14, 50), VROC (14), ADX (14), Momentum SI (10)
- Risk/reward ratio = 1.0, default bet percentage logic
- Slippage range (0–2 units), spread filter (`stoploss_size >= spread * 10`)
- Train/test split ratio, model hyperparameters
- Starting equity/balance
---
### Data Acquisition
#### [NEW] [data_fetcher.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/data_fetcher.py)
- Connects to MT5 terminal via `MetaTrader5` Python package
- Fetches one year of 3-minute OHLCV bars for XAUUSDc
- Returns a `pandas.DataFrame` with columns: `time, open, high, low, close, volume, spread` (MT5 reports tick counts as `tick_volume`, which is renamed to `volume`)
- Saves raw data to `data/raw_xauusdc_3m.csv` for reproducibility
- Handles MT5 connection errors gracefully
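The fetch step above could be sketched roughly as follows. This is a minimal illustration, not the final `data_fetcher.py`: the function names `rates_to_frame` and `fetch_bars` are hypothetical, and the sketch assumes that mapping MT5's `tick_volume` field onto the planned `volume` column is acceptable. The MT5 import is deferred so the conversion helper can be exercised without a terminal.

```python
from datetime import datetime, timedelta, timezone

import numpy as np
import pandas as pd

COLUMNS = ["time", "open", "high", "low", "close", "volume", "spread"]


def rates_to_frame(rates) -> pd.DataFrame:
    """Convert the structured array returned by MT5 into the planned column layout."""
    df = pd.DataFrame(rates)
    # MT5 reports the bar open time as seconds since the Unix epoch.
    df["time"] = pd.to_datetime(df["time"], unit="s")
    # Assumption: the plan's `volume` column corresponds to MT5's `tick_volume`.
    df = df.rename(columns={"tick_volume": "volume"})
    return df[COLUMNS]


def fetch_bars(symbol: str = "XAUUSDc", days: int = 365) -> pd.DataFrame:
    """Fetch `days` of M3 bars; needs a running, logged-in MT5 terminal."""
    import MetaTrader5 as mt5  # Windows-only package, imported lazily

    if not mt5.initialize():
        raise RuntimeError(f"MT5 initialize() failed: {mt5.last_error()}")
    try:
        end = datetime.now(timezone.utc)
        start = end - timedelta(days=days)
        rates = mt5.copy_rates_range(symbol, mt5.TIMEFRAME_M3, start, end)
        if rates is None or len(rates) == 0:
            raise RuntimeError(f"No bars returned for {symbol}: {mt5.last_error()}")
        return rates_to_frame(rates)
    finally:
        # Always release the terminal connection, even on failure.
        mt5.shutdown()
```

Raising `RuntimeError` with `mt5.last_error()` is one way to surface connection problems; the actual module may prefer logging and a clean exit instead.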
---
### Feature Engineering
#### [NEW] [features.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/features.py)
Computes all required technical indicators using pure NumPy/Pandas (no TA-Lib dependency):
| Feature | Method |
|---------|--------|
| SMA | Simple Moving Average (14-period) |
| Double Moving Average | SMA(14) and SMA(50), plus crossover signal |
| VROC | Volume Rate of Change (14-period) |
| Synthetic VIX | Rolling std of log-returns (20-period) as volatility proxy |
| Momentum Strength Index | Custom momentum oscillator (10-period, 0–100 scale) |
| ADX | Average Directional Index (14-period) via Wilder's smoothing |
| Time features | Hour-of-day, minute-of-hour, day-of-week (cyclical encoded) |
All computations are vectorized with NumPy for maximum speed. NaN rows from lookback periods are dropped.
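As a rough sketch of how some of these features might be computed with pandas and NumPy, consider the following. It covers only the SMAs, the crossover signal, VROC, the synthetic VIX, and an hour-of-day cyclical encoding; ADX and the custom Momentum Strength Index are left out because their exact formulas are not specified here. The function name `add_features` and the output column names are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add a subset of the planned features and drop the lookback warm-up rows."""
    out = df.copy()
    close, volume = out["close"], out["volume"]

    # Simple moving averages and their crossover state (+1 fast above slow, -1 below).
    out["sma_14"] = close.rolling(14).mean()
    out["sma_50"] = close.rolling(50).mean()
    out["sma_cross"] = np.sign(out["sma_14"] - out["sma_50"])

    # VROC: percentage change in volume relative to 14 bars earlier.
    out["vroc_14"] = (volume / volume.shift(14) - 1.0) * 100.0

    # Synthetic VIX: rolling standard deviation of log-returns over 20 bars.
    out["synthetic_vix"] = np.log(close).diff().rolling(20).std()

    # Cyclical encoding keeps 23:00 and 00:00 close together in feature space.
    hour = out["time"].dt.hour
    out["hour_sin"] = np.sin(2 * np.pi * hour / 24)
    out["hour_cos"] = np.cos(2 * np.pi * hour / 24)

    # Rows inside the longest lookback window (50 bars) contain NaN and are removed.
    return out.dropna().reset_index(drop=True)
```

Bars with zero volume 14 bars earlier would produce infinite VROC values, so a real implementation would likely need to handle that case explicitly.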
---
### Labeling Engine
#### [NEW] [labeler.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/labeler.py)
Generates ground-truth labels for supervised learning:
1. For each bar, compute a potential **Buy** and **Sell** trade:
   - **Buy**: entry at `close`, SL below recent swing low (ATR-based), TP = entry + (entry - SL) (1:1 RR)
   - **Sell**: entry at `close`, SL above recent swing high (ATR-based), TP = entry - (SL - entry) (1:1 RR)
2. Walk forward through subsequent bars to determine outcome (TP hit, SL hit, or neither within N bars)
3. Apply **spread filter**: if `SL_distance < spread * 10`, label = `DO_NOTHING`
4. Intermediate outcomes are `BUY_WIN`, `BUY_LOSS`, `SELL_WIN`, `SELL_LOSS`, `HOLD`, and `DO_NOTHING`. These collapse to four final classes: `BUY (1)`, `SELL (2)`, `HOLD (3)`, `DO_NOTHING (0)`
5. Only winning setups are labeled as BUY/SELL; losing setups become HOLD
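The labeling steps above can be sketched as below. Several simplifications are assumed and are not part of the plan: a single precomputed `sl_dist` per bar is used for both directions (instead of separate swing-low and swing-high stops), `spread` is assumed to be in the same price units as `sl_dist`, the horizon of 20 bars is a placeholder for N, and a bar that touches both levels is counted as a loss. The names `trade_wins` and `make_labels` are hypothetical.

```python
import numpy as np

DO_NOTHING, BUY, SELL, HOLD = 0, 1, 2, 3


def trade_wins(high, low, start, horizon, tp, sl, is_long):
    """Return True if TP is reached before SL within `horizon` bars after `start`."""
    end = min(start + 1 + horizon, len(high))
    for j in range(start + 1, end):
        hit_tp = high[j] >= tp if is_long else low[j] <= tp
        hit_sl = low[j] <= sl if is_long else high[j] >= sl
        if hit_sl:
            # Conservative assumption: if both levels sit inside one bar, count a loss.
            return False
        if hit_tp:
            return True
    return False  # neither level reached within the horizon


def make_labels(close, high, low, sl_dist, spread, horizon=20):
    """Assign one of the four classes to every bar."""
    labels = np.full(len(close), HOLD, dtype=np.int8)
    for i in range(len(close)):
        d = sl_dist[i]
        if d < spread[i] * 10:
            labels[i] = DO_NOTHING  # spread filter: stop is too tight to trade
        elif trade_wins(high, low, i, horizon, close[i] + d, close[i] - d, True):
            labels[i] = BUY
        elif trade_wins(high, low, i, horizon, close[i] - d, close[i] + d, False):
            labels[i] = SELL
        # Otherwise the bar keeps the default HOLD label (losing or unresolved setup).
    return labels
```

With a shared stop distance and 1:1 risk-reward, a winning buy implies a losing sell from the same bar, so the order of the two checks does not matter in this simplified version.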
---
### ML Model
#### [NEW] [model.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/model.py)
- **LightGBM** multi-class classifier (4 classes)
- Hyperparameters tuned for tabular financial data:
  - `num_leaves=63`, `max_depth=8`, `learning_rate=0.05`, `n_estimators=500`
  - `subsample=0.8`, `colsample_bytree=0.8`, `min_child_samples=20` (in LightGBM, `subsample` only takes effect when `subsample_freq` is greater than 0)
  - `class_weight='balanced'` to handle label imbalance
- Train/validation split: 80/20 chronological (no shuffle, since this is time-series data)
- Feature importance output
- Model persistence via `joblib` (save/load `.pkl`)
- Early stopping on validation set
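A minimal sketch of the training step under these settings might look like the following. The helper names `chronological_split` and `train_model` are hypothetical, and two values are assumptions rather than part of the plan: `subsample_freq=1` (added so that `subsample=0.8` actually applies) and `stopping_rounds=50` for early stopping.

```python
import numpy as np


def chronological_split(X, y, train_frac=0.8):
    """Split without shuffling so validation rows are strictly later than training rows."""
    cut = int(len(X) * train_frac)
    return X[:cut], X[cut:], y[:cut], y[cut:]


def train_model(X, y):
    """Fit the multi-class LightGBM classifier with early stopping on the validation tail."""
    import lightgbm as lgb  # imported here so the split helper works without LightGBM

    X_tr, X_va, y_tr, y_va = chronological_split(X, y)
    model = lgb.LGBMClassifier(
        objective="multiclass",
        num_leaves=63,
        max_depth=8,
        learning_rate=0.05,
        n_estimators=500,
        subsample=0.8,
        subsample_freq=1,  # assumption: enables row subsampling on every iteration
        colsample_bytree=0.8,
        min_child_samples=20,
        class_weight="balanced",
    )
    model.fit(
        X_tr,
        y_tr,
        eval_set=[(X_va, y_va)],
        callbacks=[lgb.early_stopping(stopping_rounds=50)],  # assumed patience
    )
    return model
```

The fitted model exposes `feature_importances_` for the importance report and can be saved with `joblib.dump`.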
---
### Backtesting Engine
#### [NEW] [backtester.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/backtester.py)
Vectorized backtesting with realistic execution:
- Takes model predictions and raw price data
- **Position sizing**: bet % of current balance, accounting for full SL distance
  - `lot_value = balance * bet_pct / sl_distance`
- **Random slippage**: uniform 0–2 XAUUSDc units applied to entry price
- **Spread filter**: skip trade if `sl_distance < spread * 10`
- **1:1 Risk-Reward**: TP distance = SL distance
- Walk forward bar-by-bar on test set, track equity curve
- No trade limit: takes every valid signal
- Records all trades with entry/exit prices, PnL, timestamps
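To make the sizing and slippage rules concrete, here is a small sketch of a single simulated trade. It assumes that SL and TP stay anchored to the signal price while slippage moves only the fill against the trader, and that `spread` is in price units; the plan does not pin down either point. The function names are hypothetical.

```python
import random


def position_size(balance, bet_pct, sl_distance):
    """Units to trade so that a full stop-out loses `bet_pct` of the balance."""
    return balance * bet_pct / sl_distance


def passes_spread_filter(sl_distance, spread):
    """Trade only when the stop is at least ten spreads wide."""
    return sl_distance >= spread * 10


def simulate_trade(balance, bet_pct, signal_price, sl_distance, spread,
                   is_long, hit_tp, rng):
    """Return the balance after one trade; unchanged if the spread filter rejects it."""
    if not passes_spread_filter(sl_distance, spread):
        return balance
    direction = 1.0 if is_long else -1.0
    slippage = rng.uniform(0.0, 2.0)  # uniform 0-2 units, always adverse
    entry = signal_price + direction * slippage
    # Assumption: SL/TP levels are anchored to the signal price (1:1 risk-reward).
    outcome = 1.0 if hit_tp else -1.0
    exit_price = signal_price + direction * outcome * sl_distance
    size = position_size(balance, bet_pct, sl_distance)
    return balance + size * direction * (exit_price - entry)


# Example with a seeded generator for reproducibility.
new_balance = simulate_trade(1000.0, 0.01, 2000.0, 2.0, 0.1, True, True, random.Random(0))
```

For a 1,000 balance, a 1% bet, and a 2-unit stop, the size is 5 units, so a stop-out without slippage costs 10. Because slippage of up to 2 units is comparable to such a stop, realized wins shrink and losses grow noticeably under this model.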
---
### Metrics & Evaluation
#### [NEW] [metrics.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/metrics.py)
| Metric | Description |
|--------|-------------|
| Win Rate | % of trades closed at TP |
| Average Win % | Mean profit per winning trade as % of balance |
| Average Loss % | Mean loss per losing trade as % of balance |
| Sharpe Ratio | Annualized risk-adjusted return |
| Sortino Ratio | Downside-risk-adjusted return |
| Max Drawdown | Largest peak-to-trough equity decline |
| Profit Factor | Gross profit / Gross loss |
| Start Equity | Initial balance |
| End Equity | Final balance after all trades |
| Total Trades | Number of executed trades |
| Avg Trade Duration | Mean holding time in bars/minutes |
| Daily PnL Stats | Intraday mean, std, min, max PnL |
| Calmar Ratio | Annualized return / Max Drawdown |
| Expectancy | Average PnL per trade |
Outputs a formatted console report and saves to `results/report.txt`.
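A few of the metrics in the table can be sketched from a plain list of per-trade PnL values, as below. The function name `summarize` is hypothetical, and a trade with positive PnL is treated as a win, which only approximates "closed at TP". Sharpe, Sortino, and Calmar are omitted because the annualization convention is not specified here.

```python
import math


def summarize(trade_pnls, start_equity):
    """Compute basic trade statistics from per-trade PnL in account currency."""
    wins = [p for p in trade_pnls if p > 0]
    losses = [p for p in trade_pnls if p < 0]

    # Track the equity curve to find the largest peak-to-trough decline.
    equity, peak, max_dd = start_equity, start_equity, 0.0
    for p in trade_pnls:
        equity += p
        peak = max(peak, equity)
        max_dd = max(max_dd, (peak - equity) / peak)

    n = len(trade_pnls)
    gross_loss = -sum(losses)
    return {
        "total_trades": n,
        "win_rate": len(wins) / n if n else 0.0,
        "profit_factor": sum(wins) / gross_loss if gross_loss else math.inf,
        "expectancy": sum(trade_pnls) / n if n else 0.0,
        "max_drawdown": max_dd,  # fraction of the running peak
        "start_equity": start_equity,
        "end_equity": equity,
    }
```

For example, `summarize([10, -5, 20, -10], 1000)` gives a win rate of 0.5, a profit factor of 2.0, and an expectancy of 3.75 per trade.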
---
### CLI Entry Point
#### [NEW] [main.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/main.py)
Unified CLI with subcommands:
```
python main.py fetch # Fetch 1-year data from MT5
python main.py train # Engineer features, label, train model
python main.py backtest # Run backtest on test set
python main.py evaluate # Print metrics report
python main.py run # Full pipeline: fetch → train → backtest → evaluate
```
Uses `argparse` with clear help text.
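One possible shape for the `argparse` setup is sketched below. The `build_parser` helper and the `COMMANDS` mapping are illustrative names; dispatching each subcommand to the real pipeline functions is left out.

```python
import argparse

# Subcommand names and help strings taken from the command list above.
COMMANDS = {
    "fetch": "Fetch 1-year data from MT5",
    "train": "Engineer features, label, train model",
    "backtest": "Run backtest on test set",
    "evaluate": "Print metrics report",
    "run": "Full pipeline: fetch, train, backtest, evaluate",
}


def build_parser() -> argparse.ArgumentParser:
    """Create the top-level parser with one subparser per pipeline stage."""
    parser = argparse.ArgumentParser(
        prog="main.py", description="ML-3m-trader pipeline"
    )
    sub = parser.add_subparsers(dest="command", required=True)
    for name, help_text in COMMANDS.items():
        sub.add_parser(name, help=help_text)
    return parser


# Parsing an explicit argument list, as the tests or a wrapper script might do.
args = build_parser().parse_args(["train"])
```

With `required=True`, running the script without a subcommand prints a usage error instead of silently doing nothing.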
---
### Project Files
#### [NEW] [requirements.txt](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/requirements.txt)
```
MetaTrader5>=5.0.45
lightgbm>=4.0.0
pandas>=2.0.0
numpy>=1.24.0
scikit-learn>=1.3.0
joblib>=1.3.0
```
#### [NEW] [LICENSE](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/LICENSE)
MIT License, author: Rembrant Oyangoren Albeos, year: 2026.
#### [NEW] [README.md](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/README.md)
Professional README with badges (Python, License, LightGBM), project description, features list, quick start, architecture overview, and configuration reference. No emojis.
#### [NEW] [GUIDE.md](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/GUIDE.md)
Step-by-step usage guide with tables for all commands, parameters, and expected outputs.
#### [NEW] [.gitignore](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/.gitignore)
Standard Python gitignore plus `data/`, `results/`, `models/`, `*.pkl`.
---
## Verification Plan
### Automated Tests
1. **Syntax validation**: run `python -m py_compile <file>` on every `.py` file to confirm no syntax errors
2. **Import validation**: run `python -c "import config; import features; import labeler; import model; import backtester; import metrics"` to confirm all modules load correctly
3. **Dry-run test**: run `python main.py --help` to confirm CLI is functional
### Manual Verification
1. **User runs `python main.py fetch`** with MT5 open and logged in, confirms data CSV is created in `data/`
2. **User runs `python main.py run`** for the full pipeline, reviews the metrics report output
3. **User inspects `results/report.txt`** for the performance metrics