zanegraper
/

lgbm-crypto-ev-entry


	---
	language: en
	license: mit
	tags:
	- finance
	- trading
	- cryptocurrency
	- lightgbm
	- tabular
	- time-series
	- quantitative-finance
	---

	# LGBM Crypto Expected-Value Entry Classifier

	## Overview

	This model is a LightGBM-based binary classifier trained to identify high-probability long entry points in cryptocurrency markets based on engineered OHLCV features.

	The model outputs the estimated probability that a long trade entered under current market conditions has positive expected value over a fixed future horizon.

	It is designed as an entry signal component, not a full trading system.

	---

	## Intended Use

	- Identifying high-confidence trade entry points
	- Research into ML-driven alpha signals
	- Use as a signal input for rule-based or reinforcement-learning trading systems
	- Educational and experimental quantitative finance projects

	Not intended for:
	- Direct execution without risk management
	- Standalone portfolio management
	- Live trading without additional validation

	---

	## Data

	- Assets: BTC_USDT, ETH_USDT (Binance spot)
	- Frequency: 1-minute OHLCV bars
	- Time period: Historical Binance data (multi-year)
	- Source: Public Binance data via CryptoDataDownload

	---

	## Features (high-level)

	The model uses engineered, asset-agnostic features including:

	- Log returns over multiple horizons
	- Rolling volatility estimates
	- Moving averages and trend slopes
	- ATR-based volatility
	- Volume and trade-count z-scores

	All features are computed using only past information (no leakage).
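
As a rough illustration of the feature families listed above, the following sketch computes a few of them with pandas. The column names (`close`, `high`, `low`, `volume`), the window lengths, and the `make_features` helper are assumptions made for this example, not the model's actual feature set. The ATR here uses a simple rolling mean of the true range rather than Wilder smoothing.

```python
import numpy as np
import pandas as pd


def make_features(bars: pd.DataFrame, window: int = 30) -> pd.DataFrame:
    """Hypothetical feature builder; the value at row t uses bars up to t only."""
    out = pd.DataFrame(index=bars.index)
    log_close = np.log(bars["close"])

    # Log returns over several look-back horizons (measured in bars).
    for h in (1, 5, 15):
        out[f"logret_{h}"] = log_close.diff(h)

    # Rolling volatility of 1-bar log returns.
    out["vol"] = out["logret_1"].rolling(window).std()

    # Price relative to its moving average, and the average's 5-bar slope.
    ma = bars["close"].rolling(window).mean()
    out["ma_gap"] = bars["close"] / ma - 1.0
    out["ma_slope"] = ma / ma.shift(5) - 1.0

    # ATR-style volatility: rolling mean of the true range, scaled by price.
    prev_close = bars["close"].shift(1)
    true_range = pd.concat(
        [
            bars["high"] - bars["low"],
            (bars["high"] - prev_close).abs(),
            (bars["low"] - prev_close).abs(),
        ],
        axis=1,
    ).max(axis=1)
    out["atr_pct"] = true_range.rolling(window).mean() / bars["close"]

    # Volume z-score against its own trailing window.
    vol_mean = bars["volume"].rolling(window).mean()
    vol_std = bars["volume"].rolling(window).std()
    out["volume_z"] = (bars["volume"] - vol_mean) / vol_std
    return out
```

Every quantity is expressed as a ratio or a z-score, so the same code applies to BTC and ETH without rescaling. Because only trailing windows and backward shifts are used, changing a future bar cannot alter an earlier feature row.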

	---

	## Labels

	The target label represents whether a hypothetical long trade achieves positive expected value over a fixed future horizon, accounting for transaction costs.

	This is not a directional price prediction.
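
One way such a label could be built is sketched below: take the forward log return over a fixed horizon, subtract a round-trip cost, and mark the bar positive when the net result is above zero. The horizon, the cost figure, and the `make_labels` helper are illustrative assumptions; the exact labeling rule is not specified in this card.

```python
import numpy as np
import pandas as pd


def make_labels(
    close: pd.Series, horizon: int = 60, round_trip_cost: float = 0.002
) -> pd.Series:
    """Hypothetical label: 1.0 if a long held `horizon` bars nets > 0 after costs."""
    # Forward log return from bar t to bar t + horizon.
    fwd_ret = np.log(close.shift(-horizon) / close)
    net = fwd_ret - round_trip_cost
    # The last `horizon` bars have no future price yet, so they stay unlabeled (NaN).
    return (net > 0).astype(float).where(fwd_ret.notna())
```

The label looks into the future by construction, so it belongs only in the target column. Rows with a NaN label would be dropped before training.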

	---

	## Model Details

	- Model type: LightGBM Gradient Boosted Trees
	- Objective: Binary classification (expected value > 0)
	- Loss: Binary log loss
	- Training style: Time-based train/validation split
	- Evaluation: AUC, log loss, walk-forward backtests
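
The time-based split and walk-forward evaluation mentioned above can be sketched as a generator of row ranges in which validation rows always come after training rows. The `walk_forward_splits` helper, its window sizes, and the optional `gap` (rows skipped between the two sets so that forward-looking labels at the end of training do not overlap validation) are assumptions for illustration, not the author's exact procedure.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_rows: int, train_size: int, valid_size: int, gap: int = 0
) -> Iterator[Tuple[range, range]]:
    """Yield (train, valid) row ranges in time order; valid always follows train."""
    start = 0
    while start + train_size + gap + valid_size <= n_rows:
        train = range(start, start + train_size)
        valid_start = start + train_size + gap
        valid = range(valid_start, valid_start + valid_size)
        yield train, valid
        # Roll the whole window forward by one validation block.
        start += valid_size
```

With rows sorted by timestamp, each pair can be used as `df.iloc[list(train)]` and `df.iloc[list(valid)]`. Setting `gap` to at least the label horizon is a common precaution against overlap between the two sets.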

	---

	## Performance Summary

	Typical validation metrics (vary by window):

	- AUC: ~0.55
	- Log loss: ~0.68

	Despite the modest AUC, the model shows positive expectancy when trades are taken only above a probability threshold, which is consistent with real-world trading signals, where small statistical edges are the norm.
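
To make the thresholding idea concrete, the sketch below reports the trade count and mean net return for entries whose predicted probability clears each threshold. The `expectancy_by_threshold` helper, the threshold values, and the `net_return` input (per-trade return after costs) are assumptions for illustration; no results from the actual model are implied.

```python
import numpy as np


def expectancy_by_threshold(prob, net_return, thresholds=(0.5, 0.55, 0.6)):
    """Return (threshold, trade count, mean net return) for prob >= threshold."""
    prob = np.asarray(prob, dtype=float)
    net_return = np.asarray(net_return, dtype=float)
    rows = []
    for t in thresholds:
        taken = prob >= t
        n = int(taken.sum())
        # With no trades above the threshold, expectancy is undefined.
        mean_ret = float(net_return[taken].mean()) if n else float("nan")
        rows.append((t, n, mean_ret))
    return rows
```

Raising the threshold usually trades fewer signals for a higher mean return per trade, so both columns matter when choosing a cutoff. For context, a constant prediction of 0.5 scores a log loss of ln 2 ≈ 0.693, so a value near 0.68 sits only slightly below that baseline.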

	---

	## Usage Example

	```python
	import joblib
	import pandas as pd

	bundle = joblib.load("lgbm_ev_classifier.joblib")
	model = bundle["model"]
	feature_cols = bundle["feature_cols"]

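	# df must already contain the engineered features listed in feature_cols.
	# predict() returns probabilities if the bundled model is a lightgbm.Booster;
	# for an LGBMClassifier, use model.predict_proba(df[feature_cols])[:, 1] instead.
	df["prob"] = model.predict(df[feature_cols])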