# ML-3m-trader: XAUUSDc 3-Minute Timeframe ML Trading System

End-to-end machine learning pipeline for trading XAUUSDc (Gold) on the 3-minute timeframe. Uses MetaTrader 5 for data acquisition, LightGBM for classification, and a vectorized backtesting engine with realistic execution modeling.

## User Review Required

> [!IMPORTANT]
> **ML Framework Choice: LightGBM** — LightGBM is the best-suited framework for this task because:
> - Tabular classification (Buy/Sell/Hold/DoNothing) is LightGBM's strongest domain
> - Extremely fast training, even on CPU (i5-7200U will handle it fine)
> - Low memory footprint (well within 12 GB RAM)
> - No GPU required (your MX110 is not needed)
> - Gradient-boosted trees such as LightGBM are generally competitive with, and often better than, deep learning on structured/tabular data in published benchmarks
>
> **This will run entirely on your local machine. No Google Colab needed.**

> [!WARNING]
> **MetaTrader 5 Python API** only works on Windows (which you have). MT5 must be open and logged in when running the data fetch script. The `MetaTrader5` pip package handles communication.

> [!NOTE]
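> **VIX Feature**: Since the CBOE VIX index is not directly available from MT5, the system will compute a **synthetic VIX proxy** using a rolling standard deviation of log-returns (realized volatility). Note that this measures the backward-looking realized volatility of gold itself, whereas the VIX measures option-implied volatility of the S&P 500, so the proxy tracks the instrument's volatility regime rather than reproducing the index. If you want the actual VIX, we would need a separate data source.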

---

## Proposed Changes

### Project Structure

```
ML-3m-trader/
├── config.py             # All configuration constants
├── data_fetcher.py       # MT5 data acquisition
├── features.py           # Technical indicator computation
├── labeler.py            # Trade label generation (Buy/Sell/Hold/DoNothing)
├── model.py              # LightGBM training, prediction, persistence
├── backtester.py         # Vectorized backtesting engine
├── metrics.py            # Performance evaluation
├── main.py               # CLI entry point
├── requirements.txt
├── LICENSE
├── README.md
├── GUIDE.md              # Step-by-step usage guide with tables
└── .gitignore
```

---

### Configuration

#### [NEW] [config.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/config.py)

Central configuration file containing all tunable parameters:
- `SYMBOL = "XAUUSDc"`, `TIMEFRAME = mt5.TIMEFRAME_M3`
- Feature list, lookback periods for SMA (14, 50), VROC (14), ADX (14), Momentum SI (10)
- Risk/reward ratio = 1.0, default bet percentage logic
- Slippage range (0–2 units), spread filter (`stoploss_size >= spread * 10`)
- Train/test split ratio, model hyperparameters
- Starting equity/balance
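
As an illustration, the parameters above could be grouped in a frozen dataclass. This is only a sketch: the field names are hypothetical, `bet_pct` and `start_balance` are placeholder values not given in this plan, and the timeframe is stored as a plain string so the module imports without MT5 (the plan itself uses `mt5.TIMEFRAME_M3`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical layout for config.py; names are illustrative only."""

    symbol: str = "XAUUSDc"
    timeframe_name: str = "M3"      # stands in for mt5.TIMEFRAME_M3
    sma_fast: int = 14
    sma_slow: int = 50
    vroc_period: int = 14
    adx_period: int = 14
    momentum_period: int = 10
    risk_reward: float = 1.0
    slippage_min: float = 0.0
    slippage_max: float = 2.0
    spread_multiple: float = 10.0   # require sl_distance >= spread * 10
    train_ratio: float = 0.8        # 80/20 chronological split
    bet_pct: float = 0.01           # placeholder, not specified in the plan
    start_balance: float = 1_000.0  # placeholder, not specified in the plan

    def passes_spread_filter(self, sl_distance: float, spread: float) -> bool:
        """True when the stop is at least `spread_multiple` spreads wide."""
        return sl_distance >= spread * self.spread_multiple


CONFIG = Config()
```

Freezing the dataclass keeps the constants read-only at runtime, so a backtest cannot silently mutate a parameter that the training step also relied on.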

---

### Data Acquisition

#### [NEW] [data_fetcher.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/data_fetcher.py)

- Connects to MT5 terminal via `MetaTrader5` Python package
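- Fetches one year of 3-minute OHLCV bars for XAUUSDc
- Returns a `pandas.DataFrame` with columns: `time, open, high, low, close, volume, spread` (MT5 reports `tick_volume` and `real_volume` rather than a single `volume` field; `volume` is assumed here to be taken from `tick_volume`)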
- Saves raw data to `data/raw_xauusdc_3m.csv` for reproducibility
- Handles MT5 connection errors gracefully
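
The steps above might look roughly like the sketch below. It uses the real `initialize`, `copy_rates_range`, `last_error`, and `shutdown` calls from the `MetaTrader5` package, but the function name `fetch_rates` is hypothetical, and the MT5 module is passed in as an argument (normally `import MetaTrader5 as mt5`) so the logic can be exercised with a stub on machines without a terminal. Mapping `tick_volume` to `volume` is an assumption.

```python
import pandas as pd


def fetch_rates(mt5, symbol, timeframe, start, end):
    """Fetch bars through an MT5-like module and return a tidy DataFrame."""
    if not mt5.initialize():
        raise ConnectionError(f"MT5 initialize() failed: {mt5.last_error()}")
    try:
        rates = mt5.copy_rates_range(symbol, timeframe, start, end)
        if rates is None or len(rates) == 0:
            raise RuntimeError(f"No bars for {symbol}: {mt5.last_error()}")
    finally:
        mt5.shutdown()  # always release the terminal connection

    df = pd.DataFrame(rates)
    # MT5 bar times are seconds since the Unix epoch.
    df["time"] = pd.to_datetime(df["time"], unit="s")
    # Assumption: the plan's `volume` column is MT5's tick_volume.
    df = df.rename(columns={"tick_volume": "volume"})
    return df[["time", "open", "high", "low", "close", "volume", "spread"]]
```

A caller would typically pass `mt5.TIMEFRAME_M3` and two `datetime` objects one year apart, then write the result with `df.to_csv("data/raw_xauusdc_3m.csv", index=False)`.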

---

### Feature Engineering

#### [NEW] [features.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/features.py)

Computes all required technical indicators using pure NumPy/Pandas (no TA-Lib dependency):

| Feature | Method |
|---------|--------|
| SMA | Simple Moving Average (14-period) |
| Double Moving Average | SMA(14) and SMA(50), plus crossover signal |
| VROC | Volume Rate of Change (14-period) |
| Synthetic VIX | Rolling std of log-returns (20-period) as volatility proxy |
| Momentum Strength Index | Custom momentum oscillator (10-period, 0–100 scale) |
| ADX | Average Directional Index (14-period) via Wilder's smoothing |
| Time features | Hour-of-day, minute-of-hour, day-of-week (cyclical encoded) |

All computations are vectorized with NumPy for maximum speed. NaN rows from lookback periods are dropped.
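
A minimal sketch of a few of the indicators in the table, assuming an input frame with a datetime `time` column plus `close` and `volume`. The function and column names are hypothetical, and ADX and the custom Momentum Strength Index are left out because their exact definitions are not given here.

```python
import numpy as np
import pandas as pd


def add_basic_features(df):
    """Add SMA, crossover, VROC, synthetic VIX and cyclical time features."""
    out = df.copy()
    close, volume = out["close"], out["volume"]

    out["sma_14"] = close.rolling(14).mean()
    out["sma_50"] = close.rolling(50).mean()
    # +1 when the fast SMA is above the slow SMA, -1 when below.
    out["sma_cross"] = np.sign(out["sma_14"] - out["sma_50"])

    # Volume Rate of Change: percent change versus 14 bars earlier.
    out["vroc_14"] = (volume / volume.shift(14) - 1.0) * 100.0

    # Synthetic VIX proxy: rolling std of log-returns over 20 bars.
    log_ret = np.log(close / close.shift(1))
    out["synthetic_vix"] = log_ret.rolling(20).std()

    # Cyclical encoding so 23:57 and 00:00 end up close together.
    hour = out["time"].dt.hour + out["time"].dt.minute / 60.0
    out["hour_sin"] = np.sin(2 * np.pi * hour / 24.0)
    out["hour_cos"] = np.cos(2 * np.pi * hour / 24.0)
    dow = out["time"].dt.dayofweek
    out["dow_sin"] = np.sin(2 * np.pi * dow / 7.0)
    out["dow_cos"] = np.cos(2 * np.pi * dow / 7.0)

    # Drop the warm-up rows where the longest lookback is still undefined.
    return out.dropna().reset_index(drop=True)
```

With a 50-bar slow SMA as the longest lookback, the first 49 rows are removed by the final `dropna()`.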

---

### Labeling Engine

#### [NEW] [labeler.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/labeler.py)

Generates ground-truth labels for supervised learning:

1. For each bar, compute a potential **Buy** and **Sell** trade:
   - **Buy**: entry at `close`, SL below recent swing low (ATR-based), TP = entry + (entry - SL) (1:1 RR)
   - **Sell**: entry at `close`, SL above recent swing high (ATR-based), TP = entry - (SL - entry) (1:1 RR)
2. Walk forward through subsequent bars to determine outcome (TP hit, SL hit, or neither within N bars)
3. Apply **spread filter**: if `SL_distance < spread * 10`, label = `DO_NOTHING`
4. Final labels: `BUY_WIN`, `BUY_LOSS`, `SELL_WIN`, `SELL_LOSS`, `HOLD`, `DO_NOTHING` → simplified to 4-class: `BUY (1)`, `SELL (2)`, `HOLD (3)`, `DO_NOTHING (0)`
5. Only winning setups are labeled as BUY/SELL; losing setups become HOLD
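
The labeling rules can be sketched for a single bar as follows. This is a simplification under stated assumptions: one symmetric `sl_distance` is used for both directions (the plan derives separate stops from the swing low and swing high), `spread` is expressed in price units, and a bar that touches both SL and TP is counted as a loss. All function names are hypothetical.

```python
DO_NOTHING, BUY, SELL, HOLD = 0, 1, 2, 3


def trade_outcome(highs, lows, sl, tp, is_buy):
    """Walk forward through future bars; return 'win', 'loss' or 'none'.

    Assumption: if one bar touches both SL and TP, the loss is taken,
    which is the conservative tie-break.
    """
    for high, low in zip(highs, lows):
        if is_buy:
            hit_sl, hit_tp = low <= sl, high >= tp
        else:
            hit_sl, hit_tp = high >= sl, low <= tp
        if hit_sl:
            return "loss"
        if hit_tp:
            return "win"
    return "none"


def label_bar(close, sl_distance, spread, future_highs, future_lows):
    """Return the 4-class label for one bar with a 1:1 risk-reward ratio."""
    # Spread filter: stops tighter than 10 spreads are not tradable.
    if sl_distance < spread * 10:
        return DO_NOTHING
    buy = trade_outcome(future_highs, future_lows,
                        close - sl_distance, close + sl_distance, True)
    if buy == "win":
        return BUY
    sell = trade_outcome(future_highs, future_lows,
                         close + sl_distance, close - sl_distance, False)
    if sell == "win":
        return SELL
    # Losing or unresolved setups collapse to HOLD.
    return HOLD
```

The `future_highs` and `future_lows` arguments would hold the next N bars after the signal bar, where N is the lookahead horizon from the configuration.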

---

### ML Model

#### [NEW] [model.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/model.py)

- **LightGBM** multi-class classifier (4 classes)
- Hyperparameters tuned for tabular financial data:
  - `num_leaves=63`, `max_depth=8`, `learning_rate=0.05`, `n_estimators=500`
  - `subsample=0.8`, `colsample_bytree=0.8`, `min_child_samples=20` (in LightGBM, `subsample` only takes effect when `subsample_freq` is set to a positive value)
  - `class_weight='balanced'` to handle label imbalance
- Train/validation split: 80/20 chronological (no shuffling, since shuffling a time series would leak future bars into training)
- Feature importance output
- Model persistence via `joblib` (save/load `.pkl`)
- Early stopping on validation set
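
A hedged sketch of the training step with the hyperparameters listed above. The function names, the `stopping_rounds=50` patience, and `subsample_freq=1` are assumptions added for illustration; LightGBM and joblib are imported inside `train_model` so the split helper can be used on its own.

```python
def chronological_split(X, y, train_ratio=0.8):
    """Split without shuffling: earliest rows train, latest rows validate."""
    cut = int(len(X) * train_ratio)
    return X[:cut], X[cut:], y[:cut], y[cut:]


def train_model(X, y, model_path=None):
    """Fit a multi-class LightGBM classifier with early stopping."""
    # Lazy imports: only needed when a model is actually trained.
    import joblib
    import lightgbm as lgb

    X_tr, X_va, y_tr, y_va = chronological_split(X, y)
    model = lgb.LGBMClassifier(
        objective="multiclass",
        num_leaves=63,
        max_depth=8,
        learning_rate=0.05,
        n_estimators=500,
        subsample=0.8,
        subsample_freq=1,  # assumption: enables the 0.8 row subsampling
        colsample_bytree=0.8,
        min_child_samples=20,
        class_weight="balanced",
    )
    model.fit(
        X_tr,
        y_tr,
        eval_set=[(X_va, y_va)],
        # Patience of 50 rounds is a placeholder, not a value from the plan.
        callbacks=[lgb.early_stopping(stopping_rounds=50)],
    )
    if model_path is not None:
        joblib.dump(model, model_path)  # reload later with joblib.load
    return model
```

After fitting, `model.feature_importances_` gives the per-feature importance values mentioned above.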

---

### Backtesting Engine

#### [NEW] [backtester.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/backtester.py)

Vectorized backtesting with realistic execution:

- Takes model predictions and raw price data
- **Position sizing**: bet % of current balance, accounting for full SL distance
  - `lot_value = balance * bet_pct / sl_distance`
- **Random slippage**: uniform 0–2 XAUUSDc units applied to entry price
- **Spread filter**: skip trade if `sl_distance < spread * 10`
- **1:1 Risk-Reward**: TP distance = SL distance
- Walk forward bar-by-bar on test set, track equity curve
- No trade limit — takes every valid signal
- Records all trades with entry/exit prices, PnL, timestamps
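
To make the sizing formula concrete, here is a small sketch of one trade's PnL. It assumes slippage always worsens the fill and that SL and TP stay anchored to the signal price rather than the slipped fill; both are assumptions, and the function names are hypothetical.

```python
def position_size(balance, bet_pct, sl_distance):
    """Units sized so that a full stop-out loses bet_pct of the balance."""
    return balance * bet_pct / sl_distance


def trade_pnl(balance, bet_pct, entry, sl_distance, is_buy, hit_tp, slippage):
    """PnL of one 1:1 trade; `slippage` is in price units (0 to 2)."""
    units = position_size(balance, bet_pct, sl_distance)
    direction = 1.0 if is_buy else -1.0
    # Assumption: slippage moves the fill against the trader.
    fill = entry + direction * slippage
    # 1:1 risk-reward: TP and SL sit sl_distance away from the signal price.
    exit_offset = sl_distance if hit_tp else -sl_distance
    exit_price = entry + direction * exit_offset
    return direction * (exit_price - fill) * units
```

For example, with a balance of 1,000, a bet of 1%, and a 5-unit stop, the size is 2 units: a stop-out with no slippage loses 10, while the same stop-out with 1 unit of slippage loses 12. In the backtest loop the slippage would be drawn per trade, for instance with `random.uniform(0.0, 2.0)`.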

---

### Metrics & Evaluation

#### [NEW] [metrics.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/metrics.py)

| Metric | Description |
|--------|-------------|
| Win Rate | % of trades closed at TP |
| Average Win % | Mean profit per winning trade as % of balance |
| Average Loss % | Mean loss per losing trade as % of balance |
| Sharpe Ratio | Annualized risk-adjusted return |
| Sortino Ratio | Downside-risk-adjusted return |
| Max Drawdown | Largest peak-to-trough equity decline |
| Profit Factor | Gross profit / Gross loss |
| Start Equity | Initial balance |
| End Equity | Final balance after all trades |
| Total Trades | Number of executed trades |
| Avg Trade Duration | Mean holding time in bars/minutes |
| Daily PnL Stats | Intraday mean, std, min, max PnL |
| Calmar Ratio | Annualized return / Max Drawdown |
| Expectancy | Average PnL per trade |

Outputs a formatted console report and saves to `results/report.txt`.
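
A few of the metrics in the table can be computed as in the sketch below. The function names are hypothetical, and win rate is approximated here as the share of trades with positive PnL, which matches "closed at TP" only when slippage is smaller than the stop distance.

```python
def win_rate(pnls):
    """Fraction of trades with positive PnL."""
    return sum(1 for p in pnls if p > 0) / len(pnls)


def profit_factor(pnls):
    """Gross profit divided by gross loss (infinite if there are no losses)."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    return gross_profit / gross_loss if gross_loss > 0 else float("inf")


def expectancy(pnls):
    """Average PnL per trade."""
    return sum(pnls) / len(pnls)


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst
```

For trades of `[10, -5, 20, -10]`, gross profit is 30 and gross loss is 15, so the profit factor is 2.0 and the expectancy is 3.75 per trade.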

---

### CLI Entry Point

#### [NEW] [main.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/main.py)

Unified CLI with subcommands:

```
python main.py fetch       # Fetch 1-year data from MT5
python main.py train       # Engineer features, label, train model
python main.py backtest    # Run backtest on test set
python main.py evaluate    # Print metrics report
python main.py run         # Full pipeline: fetch → train → backtest → evaluate
```

Uses `argparse` with clear help text.
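
One way the subcommands could be wired with `argparse` is sketched below. The `COMMANDS` mapping and `build_parser` name are hypothetical, and dispatching each command to its module is omitted.

```python
import argparse

# Subcommand names and help text taken from the command list above.
COMMANDS = {
    "fetch": "Fetch 1-year data from MT5",
    "train": "Engineer features, label, train model",
    "backtest": "Run backtest on test set",
    "evaluate": "Print metrics report",
    "run": "Full pipeline: fetch, train, backtest, evaluate",
}


def build_parser():
    """Create the top-level parser with one subparser per command."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="XAUUSDc 3-minute ML trading pipeline",
    )
    sub = parser.add_subparsers(dest="command", required=True)
    for name, help_text in COMMANDS.items():
        sub.add_parser(name, help=help_text)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Selected command: {args.command}")
```

Setting `required=True` makes `python main.py` without a subcommand exit with a usage message instead of silently doing nothing.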

---

### Project Files

#### [NEW] [requirements.txt](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/requirements.txt)

```
MetaTrader5>=5.0.45
lightgbm>=4.0.0
pandas>=2.0.0
numpy>=1.24.0
scikit-learn>=1.3.0
joblib>=1.3.0
```

#### [NEW] [LICENSE](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/LICENSE)

MIT License, author: Rembrant Oyangoren Albeos, year: 2026.

#### [NEW] [README.md](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/README.md)

Professional README with badges (Python, License, LightGBM), project description, features list, quick start, architecture overview, and configuration reference. No emojis.

#### [NEW] [GUIDE.md](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/GUIDE.md)

Step-by-step usage guide with tables for all commands, parameters, and expected outputs.

#### [NEW] [.gitignore](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/.gitignore)

Standard Python gitignore plus `data/`, `results/`, `models/`, `*.pkl`.

---

## Verification Plan

### Automated Tests

1. **Syntax validation** — run `python -m py_compile <file>` on every `.py` file to confirm no syntax errors
2. **Import validation** — run `python -c "import config; import features; import labeler; import model; import backtester; import metrics"` to confirm all modules load correctly
3. **Dry-run test** — run `python main.py --help` to confirm CLI is functional

### Manual Verification

1. **User runs `python main.py fetch`** with MT5 open and logged in, confirms data CSV is created in `data/`
2. **User runs `python main.py run`** for the full pipeline, reviews the metrics report output
3. **User inspects `results/report.txt`** for the performance metrics