Upload README.md with huggingface_hub

README.md +142 -26

@@ -1,40 +1,156 @@
-# Romeo (V5) — Model artifacts
-This folder contains the Romeo (V5) ensemble model artifacts for XAUUSD.
-Files
-- `trading_model_romeo_daily.pkl` — joblib artifact with tree models, weights, and canonical `features` list (artifact['features']).
-- `romeo_keras_daily.keras` — optional Keras model (if included during training).
-- `MODEL_CARD.md` — human-readable model card (detailed evaluation and transparency notes).
-- `metadata.json` — machine-readable metadata for the artifact (owner, tags, metrics, usage).
-Quick start
-1. Install dependencies: `pip install joblib tensorflow scikit-learn` (and `huggingface_hub` if using the Hub).
-2. Load the artifact:
 ```python
 import joblib
 artifact = joblib.load('trading_model_romeo_daily.pkl')
-features = artifact['features']
-# Prepare your data X with the same features order
 ```
-3. Predict (trees):
 ```python
-clf = artifact['models']['ensemble']  # adjust key name depending on training output
-proba = clf.predict_proba(X)
 ```
-4. Predict (keras):
-```python
-from tensorflow import keras
-model = keras.models.load_model('romeo_keras_daily.keras', compile=False)
-pred = model.predict(X_keras)
-```
-Evaluation
-- See `metadata.json` and the repo-level `README.md` for the most recent M2M metrics and robustness summary.
-Notes
-- The canonical feature list is embedded in the `.pkl` artifact. When using unseen data, align columns to that list and fill missing features with zeros to avoid shape mismatches.
-- The backtester that produced the evaluation uses per-bar M2M equity; position sizing is currently simple and may not reflect margin rules — exercise caution.

+---
+language: en
+license: mit
+library_name: sklearn
+tags:
+  - trading
+  - finance
+  - gold
+  - xauusd
+  - forex
+  - algorithmic-trading
+  - smart-money-concepts
+  - smc
+  - xgboost
+  - lightgbm
+  - machine-learning
+  - backtesting
+  - technical-analysis
+  - multi-timeframe
+  - intraday-trading
+  - high-frequency-trading
+  - ensemble-model
+  - keras
+  - tensorflow
+datasets:
+  - yahoo-finance-gc-f
+metrics:
+  - accuracy
+  - precision
+  - recall
+  - f1
+  - sharpe
+  - max_drawdown
+  - cagr
+  - win_rate
+model-index:
+  - name: romeo-v5-daily
+    results:
+      - task:
+          type: binary-classification
+          name: Daily Price Direction Prediction
+        dataset:
+          type: yahoo-finance-gc-f
+          name: Gold Futures (GC=F)
+        metrics:
+          - type: win_rate
+            value: 49.47
+            name: Win Rate
+          - type: sharpe
+            value: 0.3119
+            name: Sharpe Ratio
+          - type: max_drawdown
+            value: -47.66
+            name: Max Drawdown (%)
+          - type: cagr
+            value: 0.0444
+            name: CAGR
+---
+# Romeo V5 — Ensemble Trading Model for XAUUSD
+## Model Details
+### Model Description
+Romeo V5 is an ensemble machine learning model for predicting the price direction of gold (XAUUSD), trained on Gold Futures (GC=F) data. It combines tree-based models (XGBoost and LightGBM) with an optional Keras neural network head. The model outputs a probability for a long (up) trade; the backtester turns that probability into trades and handles entry/exit logic, position sizing, and risk management.
+- **Model Type**: Ensemble Classifier (XGBoost + LightGBM + optional Keras NN)
+- **Asset**: XAUUSD (Gold Futures)
+- **Strategy**: Smart Money Concepts (SMC) with technical indicators
+- **Prediction Horizon**: Daily timeframe (5-day ahead direction)
+- **Framework**: Scikit-learn, XGBoost, LightGBM, TensorFlow/Keras
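The "5-day ahead direction" target can be read as a binary label on daily closes. A minimal sketch, assuming the label is 1 when the close five bars later is higher than the current close; the function name `make_direction_labels` and the labeling rule itself are illustrative assumptions, not taken from the training code:

```python
from typing import List, Optional


def make_direction_labels(closes: List[float], horizon: int = 5) -> List[Optional[int]]:
    """Label each bar 1 if the close `horizon` bars later is higher, else 0.

    The last `horizon` bars have no future close, so they get None.
    (Assumed labeling rule, for illustration only.)
    """
    labels: List[Optional[int]] = []
    for i, price in enumerate(closes):
        j = i + horizon
        if j < len(closes):
            labels.append(1 if closes[j] > price else 0)
        else:
            labels.append(None)
    return labels


# Toy daily closes; real inputs would be the GC=F close series.
closes = [100.0, 101.0, 99.5, 102.0, 103.0, 104.0, 98.0, 105.0]
labels = make_direction_labels(closes, horizon=5)
print(labels)
```

Bars without a label would normally be dropped before training so that no row depends on data past the end of the series.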
+### Model Architecture
+- **Ensemble Components**:
+  - XGBoost Classifier: Gradient boosting on decision trees.
+  - LightGBM Classifier: Efficient gradient boosting with leaf-wise growth.
+  - Optional Keras Neural Network: Dense layers with a custom `SumAxis1Layer`, which replaces an anonymous Lambda layer so that the model serializes cleanly.
+- **Features**: 31 canonical features including technical indicators (SMA, EMA, RSI, Bollinger Bands) and SMC elements (order blocks, volume profiles).
+- **Serialization**: Tree models saved in joblib `.pkl` format; Keras model in native `.keras` format.
+- **Weights**: Ensemble weights stored in artifact for weighted probability averaging.
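The weighted probability averaging can be written as follows, assuming each of the $K$ ensemble members outputs a long-side probability $p_k(x)$ and the artifact stores one non-negative weight $w_k$ per member (the symbols are illustrative):

```latex
\hat{p}(x) = \frac{\sum_{k=1}^{K} w_k \, p_k(x)}{\sum_{k=1}^{K} w_k}
```

A long signal is then emitted when $\hat{p}(x)$ exceeds the chosen threshold.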
+### Intended Use
+- **Primary Use**: Research, backtesting, and evaluation on historical XAUUSD data.
+- **Secondary Use**: Educational purposes for understanding ensemble trading models.
+- **Out-of-Scope**: Not financial advice. Do not use for live trading without proper validation, risk controls, and regulatory compliance.
+### Factors
+- **Relevant Factors**: Market volatility, economic indicators affecting gold prices (e.g., USD strength, inflation data).
+- **Evaluation Factors**: Tested on unseen data; robustness scanned across slippage, commission, and threshold parameters.
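A robustness scan of this kind can be sketched as a coarse grid sweep. The grid values below span the ranges quoted under Quantitative Analyses, and `run_backtest` is a hypothetical stub; a real scan would call the project's backtester and rank its actual metrics:

```python
from itertools import product
from typing import Dict, List, Tuple


def run_backtest(slippage: float, commission: float, threshold: float) -> float:
    """Hypothetical stand-in for one backtest run; returns a toy score.

    The score simply falls as friction and threshold rise, so the sweep
    has something to rank. It is not a real performance metric.
    """
    return 1.0 - slippage - 1000.0 * commission - (threshold - 0.5)


# Illustrative grid points (assumed; only the ranges come from the card).
slippages = [0.0, 0.5, 1.0]
commissions = [0.0, 0.00025, 0.0005]
thresholds = [0.5, 0.55, 0.6]

results: List[Tuple[float, Dict[str, float]]] = []
for slip, comm, thr in product(slippages, commissions, thresholds):
    score = run_backtest(slip, comm, thr)
    results.append((score, {"slippage": slip, "commission": comm, "threshold": thr}))

# Rank by score only; the parameter dicts themselves are not orderable.
results.sort(key=lambda item: item[0], reverse=True)
best_score, best_params = results[0]
worst_score, worst_params = results[-1]
print("best:", best_params)
print("worst:", worst_params)
```

Keeping every scenario in `results`, rather than only the best one, makes it easy to see how quickly performance degrades as costs rise.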
+### Metrics
+- **Evaluation Data**: Unseen daily data (out-of-sample).
+- **Metrics**:
+  - Initial Capital: 100
+  - Final Capital: 484.82
+  - CAGR: 0.0444
+  - Annual Volatility: 0.4118
+  - Sharpe Ratio: 0.3119
+  - Max Drawdown: -47.66%
+  - Total Trades: 3610
+  - Win Rate: 49.47%
+  - Avg PnL per Trade: 0.1066
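The reported figures can be cross-checked with simple arithmetic. A minimal sketch using only the numbers listed above; it assumes "Avg PnL per Trade" is the net profit divided by the number of trades, in the same units as the capital figures:

```python
initial_capital = 100.0
final_capital = 484.82
total_trades = 3610
win_rate = 0.4947

total_pnl = final_capital - initial_capital       # net profit over the test period
avg_pnl_per_trade = total_pnl / total_trades      # compare with the reported 0.1066
growth_multiple = final_capital / initial_capital
winning_trades = round(win_rate * total_trades)   # approximate number of winners

print(f"total PnL: {total_pnl:.2f}")
print(f"avg PnL per trade: {avg_pnl_per_trade:.4f}")
print(f"growth multiple: {growth_multiple:.2f}x")
print(f"winning trades (approx.): {winning_trades}")
```

Under that assumption the average works out to about 0.1066, which agrees with the reported value.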
+### Training Data
+- **Source**: Yahoo Finance (GC=F) historical data.
+- **Preprocessing**: Feature engineering with technical indicators and SMC concepts.
+- **Split**: Trained on historical data; evaluated on a separate, previously unseen dataset.
+### Quantitative Analyses
+- **Robustness Scan**: Coarse grid sweep (slippage: 0-1 pips, commission: 0-0.0005, threshold: 0.5-0.6). The best scenarios had low friction and a threshold near 0.5; the worst had high commission and a high threshold.
+- **M2M Equity**: Per-bar mark-to-market equity calculation for accurate risk metrics.
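Per-bar mark-to-market equity, and the drawdown derived from it, can be sketched as below. This is a simplified illustration with a hypothetical unit position size and no trading costs; the function names and toy inputs are assumptions, not the backtester's actual code:

```python
from typing import List


def mark_to_market_equity(closes: List[float], positions: List[float],
                          initial_capital: float = 100.0) -> List[float]:
    """Return account equity after each bar.

    positions[i] is the number of units held from bar i to bar i+1
    (positive = long, 0 = flat). Equity is revalued at every close,
    including while a trade is still open.
    """
    equity = [initial_capital]
    for i in range(1, len(closes)):
        pnl = positions[i - 1] * (closes[i] - closes[i - 1])
        equity.append(equity[-1] + pnl)
    return equity


def max_drawdown(equity: List[float]) -> float:
    """Largest peak-to-trough decline as a negative fraction (e.g. -0.25)."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


closes = [100.0, 110.0, 90.0, 95.0, 120.0]
positions = [1.0, 1.0, 0.0, 1.0, 0.0]  # units held over the following bar
equity = mark_to_market_equity(closes, positions)
mdd = max_drawdown(equity)
print(equity)
print(f"max drawdown: {mdd:.2%}")
```

Revaluing at every bar matters because a drawdown measured only at trade exits would miss losses that occur while a position is still open.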
+### Ethical Considerations
+- **Bias**: Model trained on historical data; may not account for future market changes or black swan events.
+- **Risk**: Gold and forex markets are highly volatile; significant losses are possible.
+- **Transparency**: Full disclosure of assumptions, limitations, and evaluation.
+### Caveats and Recommendations
+- **Limitations**: Position sizing is simplified, so small-account behavior may differ once margin rules apply. Historical backtests are not indicative of future results.
+- **Recommendations**: Use with stop-loss, diversify, and consult financial advisors. Validate on your own data before use.
+## Usage
+### Loading the Model
 ```python
 import joblib
 artifact = joblib.load('trading_model_romeo_daily.pkl')
+features = artifact['features']  # Canonical feature list
+models = artifact['models']      # Dict of XGBoost/LightGBM models
+weights = artifact['weights']    # Ensemble weights
 ```
+### Making Predictions
 ```python
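+import pandas as pd
+# df: a DataFrame of engineered features prepared by the caller
+# reindex enforces the canonical column order and adds any missing feature as NaN
+X = df.reindex(columns=features).fillna(0)  # Fill missing features with 0
+# Weighted average of each model's long-side probability
+# (assumes `models` and `weights` list their entries in the same order)
+weighted = sum(weight * model.predict_proba(X)[:, 1] for model, weight in zip(models.values(), weights.values()))
+probabilities = weighted / sum(weights.values())
+threshold = 0.5  # example value; the robustness scan covered 0.5-0.6
+signals = (probabilities > threshold).astype(int)  # 1 = long signal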
 ```
+### Backtesting
+Run `v5/backtest_v5.py` with `--data <path>` to backtest on custom data. The script aligns the input columns to the canonical feature list automatically.
+### Requirements
+- Python 3.8+
+- scikit-learn, xgboost, lightgbm, joblib, pandas
+- tensorflow (only needed for the optional Keras model)
+## Files
+- `trading_model_romeo_daily.pkl`: Main artifact.
+- `romeo_keras_daily.keras`: Optional Keras model.
+- `README.md`: This model card.
+- `metadata.json`: Structured metadata.
+## Contact
+For issues or contributions: https://github.com/JonusNattapong/AITradings-samsam