	# ML-3m-trader: XAUUSDc 3-Minute Timeframe ML Trading System

	End-to-end machine learning pipeline for trading XAUUSDc (Gold) on the 3-minute timeframe. Uses MetaTrader 5 for data acquisition, LightGBM for classification, and a vectorized backtesting engine with realistic execution modeling.

	## User Review Required

	> [!IMPORTANT]
	> ML framework choice: LightGBM. It is well suited to this task because:
	> - Tabular multi-class classification (Buy/Sell/Hold/DoNothing) is a core use case for gradient-boosted trees
	> - Training is fast on CPU; an i5-7200U should handle this dataset comfortably
	> - Low memory footprint (well within 12 GB RAM)
	> - No GPU required (your MX110 is not needed)
	> - Gradient-boosted trees are generally competitive with, and often better than, deep learning on structured/tabular data
	>
	> This will run entirely on your local machine. No Google Colab needed.

	> [!WARNING]
	> MetaTrader 5 Python API only works on Windows (which you have). MT5 must be open and logged in when running the data fetch script. The `MetaTrader5` pip package handles communication.

	> [!NOTE]
	> VIX feature: The CBOE VIX index is not directly available from MT5, so the system will compute a synthetic VIX proxy from the rolling standard deviation of log-returns (realized volatility). This is a common substitute when implied-volatility data is unavailable, but it measures past price movement rather than the market's expected volatility. Using the actual VIX would require a separate data source.

	---

	## Proposed Changes

	### Project Structure

	```
	ML-3m-trader/
	├── config.py # All configuration constants
	├── data_fetcher.py # MT5 data acquisition
	├── features.py # Technical indicator computation
	├── labeler.py # Trade label generation (Buy/Sell/Hold/DoNothing)
	├── model.py # LightGBM training, prediction, persistence
	├── backtester.py # Vectorized backtesting engine
	├── metrics.py # Performance evaluation
	├── main.py # CLI entry point
	├── requirements.txt
	├── LICENSE
	├── README.md
	├── GUIDE.md # Step-by-step usage guide with tables
	└── .gitignore
	```

	---

	### Configuration

	#### [NEW] [config.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/config.py)

	Central configuration file containing all tunable parameters:
	- `SYMBOL = "XAUUSDc"`, `TIMEFRAME = mt5.TIMEFRAME_M3`
	- Feature list, lookback periods for SMA (14, 50), VROC (14), ADX (14), Momentum SI (10)
	- Risk/reward ratio = 1.0, default bet percentage logic
	- Slippage range (0–2 units), spread filter (`stoploss_size >= spread * 10`)
	- Train/test split ratio, model hyperparameters
	- Starting equity/balance
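A minimal sketch of how these constants might be laid out. The constant names and the `passes_spread_filter` helper are assumptions, and the values marked as placeholders (`BET_PCT`, `START_BALANCE`) are not specified in this plan and are illustrative only.

```python
# config.py sketch: names are illustrative, not the final module layout.
SYMBOL = "XAUUSDc"

try:
    import MetaTrader5 as mt5  # Windows-only package
    TIMEFRAME = mt5.TIMEFRAME_M3
except ImportError:
    TIMEFRAME = None  # lets the module import on machines without MT5

# Indicator lookback periods listed in this plan
SMA_FAST, SMA_SLOW = 14, 50
VROC_PERIOD = 14
ADX_PERIOD = 14
MOMENTUM_SI_PERIOD = 10

# Execution and risk settings listed in this plan
RISK_REWARD = 1.0
SLIPPAGE_RANGE = (0.0, 2.0)  # price units
SPREAD_MULTIPLIER = 10
TRAIN_RATIO = 0.8            # 80/20 chronological split

# Placeholders: NOT specified in the plan, chosen only for illustration
BET_PCT = 0.01
START_BALANCE = 10_000.0


def passes_spread_filter(stoploss_size, spread):
    """Return True when the stop-loss is at least 10x the spread."""
    return stoploss_size >= spread * SPREAD_MULTIPLIER
```

Both arguments to `passes_spread_filter` are assumed to be expressed in the same price units.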

	---

	### Data Acquisition

	#### [NEW] [data_fetcher.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/data_fetcher.py)

	- Connects to MT5 terminal via `MetaTrader5` Python package
	- Fetches one year of 3-minute OHLCV bars for XAUUSDc
	- Returns a `pandas.DataFrame` with columns: `time, open, high, low, close, volume, spread` (MT5 reports bar volume as `tick_volume`, so that field is renamed to `volume`)
	- Saves raw data to `data/raw_xauusdc_3m.csv` for reproducibility
	- Handles MT5 connection errors gracefully
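The fetch step could look roughly like the sketch below. The function names `rates_to_frame` and `fetch_bars` are hypothetical; the MT5 calls used (`initialize`, `copy_rates_range`, `last_error`, `shutdown`) are part of the `MetaTrader5` package. The import is placed inside `fetch_bars` so the conversion helper can be exercised on machines without MT5.

```python
from datetime import datetime, timedelta, timezone

import pandas as pd

COLUMNS = ["time", "open", "high", "low", "close", "volume", "spread"]


def rates_to_frame(rates):
    """Convert the structured array returned by MT5 into a tidy DataFrame."""
    df = pd.DataFrame(rates)
    df["time"] = pd.to_datetime(df["time"], unit="s")  # MT5 times are epoch seconds
    df = df.rename(columns={"tick_volume": "volume"})
    return df[COLUMNS]


def fetch_bars(symbol="XAUUSDc", days=365):
    """Fetch roughly one year of M3 bars; requires a running, logged-in terminal."""
    import MetaTrader5 as mt5  # Windows only

    if not mt5.initialize():
        raise RuntimeError(f"MT5 initialize failed: {mt5.last_error()}")
    try:
        end = datetime.now(timezone.utc)
        start = end - timedelta(days=days)
        rates = mt5.copy_rates_range(symbol, mt5.TIMEFRAME_M3, start, end)
        if rates is None or len(rates) == 0:
            raise RuntimeError(f"No bars returned: {mt5.last_error()}")
        return rates_to_frame(rates)
    finally:
        mt5.shutdown()  # always release the terminal connection
```

How much history is actually returned depends on the broker and on the terminal's bar-history settings, so the row count should be checked after the first fetch.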

	---

	### Feature Engineering

	#### [NEW] [features.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/features.py)

	Computes all required technical indicators using pure NumPy/Pandas (no TA-Lib dependency):

	| Feature | Method |
	|---------|--------|
	| SMA | Simple Moving Average (14-period) |
	| Double Moving Average | SMA(14) and SMA(50), plus crossover signal |
	| VROC | Volume Rate of Change (14-period) |
	| Synthetic VIX | Rolling std of log-returns (20-period) as volatility proxy |
	| Momentum Strength Index | Custom momentum oscillator (10-period, 0–100 scale) |
	| ADX | Average Directional Index (14-period) via Wilder's smoothing |
	| Time features | Hour-of-day, minute-of-hour, day-of-week (cyclical encoded) |

	All computations are vectorized with NumPy/Pandas for speed. Rows left with NaN values by the indicator lookback periods are dropped.
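As an illustration, a subset of the features in the table can be computed as follows. This is a sketch under assumptions: the function name `add_features` and the output column names are hypothetical, the crossover is encoded as the sign of SMA(14) minus SMA(50) (one possible encoding), and the Momentum Strength Index and ADX are omitted for brevity.

```python
import numpy as np
import pandas as pd


def add_features(df):
    """Add SMA, VROC, synthetic VIX and cyclical time features to OHLCV bars."""
    out = df.copy()

    # Double moving average and a simple crossover state (+1 fast above slow)
    out["sma_14"] = out["close"].rolling(14).mean()
    out["sma_50"] = out["close"].rolling(50).mean()
    out["sma_cross"] = np.sign(out["sma_14"] - out["sma_50"])

    # Volume Rate of Change: percent change versus 14 bars ago
    out["vroc_14"] = (out["volume"] / out["volume"].shift(14) - 1.0) * 100.0

    # Synthetic VIX: rolling std of log-returns (realized volatility)
    log_ret = np.log(out["close"]).diff()
    out["synthetic_vix"] = log_ret.rolling(20).std()

    # Cyclical encoding keeps 23:57 and 00:00 close together
    hour = out["time"].dt.hour + out["time"].dt.minute / 60.0
    out["hour_sin"] = np.sin(2 * np.pi * hour / 24.0)
    out["hour_cos"] = np.cos(2 * np.pi * hour / 24.0)
    dow = out["time"].dt.dayofweek
    out["dow_sin"] = np.sin(2 * np.pi * dow / 7.0)
    out["dow_cos"] = np.cos(2 * np.pi * dow / 7.0)

    # Zero-volume bars would give infinite VROC; treat those as missing
    out = out.replace([np.inf, -np.inf], np.nan)
    return out.dropna().reset_index(drop=True)
```

With a 50-period SMA as the longest lookback, the first 49 rows are dropped.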

	---

	### Labeling Engine

	#### [NEW] [labeler.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/labeler.py)

	Generates ground-truth labels for supervised learning:

	1. For each bar, compute a potential Buy and Sell trade:
	   - Buy: entry at `close`, SL below recent swing low (ATR-based), TP = entry + (entry - SL) (1:1 RR)
	   - Sell: entry at `close`, SL above recent swing high (ATR-based), TP = entry - (SL - entry) (1:1 RR)
	2. Walk forward through subsequent bars to determine the outcome (TP hit, SL hit, or neither within N bars)
	3. Apply the spread filter: if `SL_distance < spread * 10`, label = `DO_NOTHING`
	4. Intermediate labels `BUY_WIN`, `BUY_LOSS`, `SELL_WIN`, `SELL_LOSS`, `HOLD`, `DO_NOTHING` are simplified to 4 classes: `DO_NOTHING (0)`, `BUY (1)`, `SELL (2)`, `HOLD (3)`
	5. Only winning setups are labeled BUY/SELL; losing setups become HOLD
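The steps above can be sketched as a simple forward scan. Several simplifications here are assumptions, not part of the plan: a single SL distance per bar is shared by the Buy and Sell setups, `spread` is already expressed in price units, a bar that touches both levels is treated as a loss for both sides, and the function name `label_bars` and the default `horizon` are hypothetical.

```python
import numpy as np

DO_NOTHING, BUY, SELL, HOLD = 0, 1, 2, 3


def label_bars(high, low, close, sl_dist, spread, horizon=20):
    """Label each bar by which 1:1 target is reached first within `horizon` bars."""
    high = np.asarray(high, dtype=float)
    low = np.asarray(low, dtype=float)
    close = np.asarray(close, dtype=float)
    n = len(close)
    sl_dist = np.broadcast_to(np.asarray(sl_dist, dtype=float), (n,))
    spread = np.broadcast_to(np.asarray(spread, dtype=float), (n,))

    labels = np.full(n, HOLD, dtype=np.int8)
    for i in range(n):
        d = sl_dist[i]
        if d < spread[i] * 10:          # spread filter
            labels[i] = DO_NOTHING
            continue
        upper, lower = close[i] + d, close[i] - d  # Buy TP / Sell TP
        for j in range(i + 1, min(i + 1 + horizon, n)):
            hit_up = high[j] >= upper
            hit_dn = low[j] <= lower
            if hit_up and hit_dn:       # ambiguous bar: count as a loss
                break
            if hit_up:                  # Buy TP reached before Buy SL
                labels[i] = BUY
                break
            if hit_dn:                  # Sell TP reached before Sell SL
                labels[i] = SELL
                break
    return labels
```

Because the two setups share one distance in this sketch, the Buy take-profit coincides with the Sell stop-loss, so at most one side can win. With separate swing-based distances, a tie-breaking rule would be needed when both setups win.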

	---

	### ML Model

	#### [NEW] [model.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/model.py)

	- LightGBM multi-class classifier (4 classes)
	- Hyperparameters tuned for tabular financial data:
	  - `num_leaves=63`, `max_depth=8`, `learning_rate=0.05`, `n_estimators=500`
	  - `subsample=0.8`, `colsample_bytree=0.8`, `min_child_samples=20` (in LightGBM, `subsample` only takes effect when `subsample_freq` is set above 0)
	  - `class_weight='balanced'` to handle label imbalance
	- Train/validation split: 80/20 chronological (no shuffling, since this is time-series data)
	- Feature importance output
	- Model persistence via `joblib` (save/load `.pkl`)
	- Early stopping on validation set
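A minimal training sketch using LightGBM's scikit-learn interface. The function names, the `stopping_rounds=50` value, and `subsample_freq=1` are assumptions added for illustration; `subsample_freq` is set because LightGBM ignores `subsample` when bagging frequency is 0. The `lightgbm` and `joblib` imports sit inside `train_model` so the split helper can be used on its own.

```python
def chronological_split(X, y, train_frac=0.8):
    """Split without shuffling: the first 80% trains, the last 20% validates."""
    cut = int(len(X) * train_frac)
    return X[:cut], X[cut:], y[:cut], y[cut:]


def train_model(X, y, model_path=None):
    """Fit a 4-class LGBMClassifier with early stopping on the validation tail."""
    import joblib
    import lightgbm as lgb

    X_tr, X_val, y_tr, y_val = chronological_split(X, y)
    clf = lgb.LGBMClassifier(
        objective="multiclass",
        num_leaves=63,
        max_depth=8,
        learning_rate=0.05,
        n_estimators=500,
        subsample=0.8,
        subsample_freq=1,      # assumption: enables row subsampling
        colsample_bytree=0.8,
        min_child_samples=20,
        class_weight="balanced",
    )
    clf.fit(
        X_tr,
        y_tr,
        eval_set=[(X_val, y_val)],
        callbacks=[
            lgb.early_stopping(stopping_rounds=50),  # placeholder patience
            lgb.log_evaluation(period=0),            # silence per-round logs
        ],
    )
    if model_path is not None:
        joblib.dump(clf, model_path)  # reload later with joblib.load
    return clf
```

After fitting, `clf.feature_importances_` gives the per-feature importance scores mentioned above.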

	---

	### Backtesting Engine

	#### [NEW] [backtester.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/backtester.py)

	Vectorized backtesting with realistic execution:

	- Takes model predictions and raw price data
	- Position sizing: bet % of current balance, accounting for full SL distance
	  - `lot_value = balance * bet_pct / sl_distance`
	- Random slippage: uniform 0–2 XAUUSDc units applied to entry price
	- Spread filter: skip trade if `sl_distance < spread * 10`
	- 1:1 Risk-Reward: TP distance = SL distance
	- Walk forward bar-by-bar on test set, track equity curve
	- No trade limit — takes every valid signal
	- Records all trades with entry/exit prices, PnL, timestamps
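The sizing and slippage rules can be illustrated with a few small helpers. This is a sketch: the helper names are hypothetical, and applying slippage against the trader (higher fill for a buy, lower for a sell) is an assumption, since the plan only states that slippage is applied to the entry price.

```python
import numpy as np


def lot_value(balance, bet_pct, sl_distance):
    """Size the position so that a full stop-out loses bet_pct of balance."""
    return balance * bet_pct / sl_distance


def slipped_entry(price, side, rng, max_slip=2.0):
    """Apply uniform random slippage of 0 to max_slip price units, adversely."""
    slip = rng.uniform(0.0, max_slip)
    return price + slip if side == "buy" else price - slip


def trade_pnl(entry, exit_price, side, size):
    """PnL in account currency for a closed trade."""
    direction = 1.0 if side == "buy" else -1.0
    return direction * (exit_price - entry) * size


# Worked example: 1% risk on a 10,000 balance with a 5-unit stop
size = lot_value(10_000.0, 0.01, sl_distance=5.0)   # 20 units of exposure
loss = trade_pnl(2000.0, 1995.0, "buy", size)       # stopped out: loses 100
win = trade_pnl(2000.0, 2005.0, "buy", size)        # 1:1 target: gains 100
```

Because sizing depends on the current balance, trades have to be processed in chronological order even if signal generation is vectorized.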

	---

	### Metrics & Evaluation

	#### [NEW] [metrics.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/metrics.py)

	| Metric | Description |
	|--------|-------------|
	| Win Rate | % of trades closed at TP |
	| Average Win % | Mean profit per winning trade as % of balance |
	| Average Loss % | Mean loss per losing trade as % of balance |
	| Sharpe Ratio | Annualized risk-adjusted return |
	| Sortino Ratio | Downside-risk-adjusted return |
	| Max Drawdown | Largest peak-to-trough equity decline |
	| Profit Factor | Gross profit / Gross loss |
	| Start Equity | Initial balance |
	| End Equity | Final balance after all trades |
	| Total Trades | Number of executed trades |
	| Avg Trade Duration | Mean holding time in bars/minutes |
	| Daily PnL Stats | Intraday mean, std, min, max PnL |
	| Calmar Ratio | Annualized return / Max Drawdown |
	| Expectancy | Average PnL per trade |

	Outputs a formatted console report and saves to `results/report.txt`.
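Several of the metrics in the table reduce to a few lines of NumPy. The sketch below uses hypothetical function names and makes two assumptions: win rate is approximated as the share of trades with positive PnL (rather than checking the exit reason), and the Sharpe ratio is annualized from daily returns with 252 periods per year and a zero risk-free rate.

```python
import numpy as np


def win_rate(pnl):
    """Fraction of trades with positive PnL."""
    pnl = np.asarray(pnl, dtype=float)
    return float((pnl > 0).mean())


def profit_factor(pnl):
    """Gross profit divided by gross loss (inf when there are no losses)."""
    pnl = np.asarray(pnl, dtype=float)
    gross_profit = pnl[pnl > 0].sum()
    gross_loss = -pnl[pnl < 0].sum()
    return float("inf") if gross_loss == 0 else float(gross_profit / gross_loss)


def expectancy(pnl):
    """Average PnL per trade."""
    return float(np.mean(np.asarray(pnl, dtype=float)))


def max_drawdown(equity):
    """Largest peak-to-trough decline as a fraction of the running peak."""
    equity = np.asarray(equity, dtype=float)
    peak = np.maximum.accumulate(equity)
    return float(((peak - equity) / peak).max())


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized mean/std of periodic returns, risk-free rate assumed zero."""
    returns = np.asarray(returns, dtype=float)
    std = returns.std(ddof=1)
    return 0.0 if std == 0 else float(returns.mean() / std * np.sqrt(periods_per_year))
```

For an equity curve of `[100, 110, 99, 120]`, the running peak before the dip is 110, so the maximum drawdown is 11 / 110 = 10%.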

	---

	### CLI Entry Point

	#### [NEW] [main.py](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/main.py)

	Unified CLI with subcommands:

	```
	python main.py fetch # Fetch 1-year data from MT5
	python main.py train # Engineer features, label, train model
	python main.py backtest # Run backtest on test set
	python main.py evaluate # Print metrics report
	python main.py run # Full pipeline: fetch → train → backtest → evaluate
	```

	Uses `argparse` with clear help text.
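A minimal sketch of the subcommand layout with `argparse`. The `build_parser` function, the `COMMANDS` mapping, and the `command` attribute name are assumptions; the subcommand names come from the list above.

```python
import argparse

# Subcommand name -> help text shown by `python main.py --help`
COMMANDS = {
    "fetch": "Fetch 1-year data from MT5",
    "train": "Engineer features, label, train model",
    "backtest": "Run backtest on test set",
    "evaluate": "Print metrics report",
    "run": "Full pipeline: fetch, train, backtest, evaluate",
}


def build_parser():
    """Create the top-level parser with one subparser per pipeline stage."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="XAUUSDc 3-minute ML trading pipeline",
    )
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name, help_text in COMMANDS.items():
        subparsers.add_parser(name, help=help_text)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Selected command: {args.command}")  # dispatch to the stage here
```

Setting `required=True` makes `python main.py` with no subcommand exit with a usage error instead of silently doing nothing.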

	---

	### Project Files

	#### [NEW] [requirements.txt](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/requirements.txt)

	```
	MetaTrader5>=5.0.45
	lightgbm>=4.0.0
	pandas>=2.0.0
	numpy>=1.24.0
	scikit-learn>=1.3.0
	joblib>=1.3.0
	```

	#### [NEW] [LICENSE](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/LICENSE)

	MIT License, author: Rembrant Oyangoren Albeos, year: 2026.

	#### [NEW] [README.md](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/README.md)

	Professional README with badges (Python, License, LightGBM), project description, features list, quick start, architecture overview, and configuration reference. No emojis.

	#### [NEW] [GUIDE.md](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/GUIDE.md)

	Step-by-step usage guide with tables for all commands, parameters, and expected outputs.

	#### [NEW] [.gitignore](file:///c:/Users/User/Desktop/debugrem/ML-3m-trader/.gitignore)

	Standard Python gitignore plus `data/`, `results/`, `models/`, `*.pkl`.

	---

	## Verification Plan

	### Automated Tests

	1. Syntax validation — run `python -m py_compile <file>` on every `.py` file to confirm no syntax errors
	2. Import validation — run `python -c "import config; import features; import labeler; import model; import backtester; import metrics"` to confirm all modules load correctly
	3. Dry-run test — run `python main.py --help` to confirm CLI is functional
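The syntax-validation step could be scripted rather than run file by file. The helper below is a sketch with a hypothetical name (`compile_all`); it relies only on the standard-library `py_compile` module and checks the `.py` files directly inside the given directory.

```python
import pathlib
import py_compile


def compile_all(root="."):
    """Byte-compile every .py file in `root`; return (filename, error) pairs."""
    failures = []
    for path in sorted(pathlib.Path(root).glob("*.py")):
        try:
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as exc:
            failures.append((path.name, str(exc)))
    return failures


if __name__ == "__main__":
    problems = compile_all(".")
    for name, message in problems:
        print(f"{name}: {message}")
    raise SystemExit(1 if problems else 0)
```

An empty result means every file compiled; a non-zero exit code makes the script usable as a pre-commit or CI check.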

	### Manual Verification

	1. User runs `python main.py fetch` with MT5 open and logged in, confirms data CSV is created in `data/`
	2. User runs `python main.py run` for the full pipeline, reviews the metrics report output
	3. User inspects `results/report.txt` for the performance metrics