Crypto 15-Minute Direction Classifier
A time-series classification model that predicts whether Bitcoin (BTC/USDT) price will move up or down over the next 15-minute interval using multivariate historical market data.
Model Overview
| Attribute | Value |
|---|---|
| Task | Binary time-series classification |
| Target | BTC price direction in next 15 minutes (up=1, down=0) |
| Input | 60 minutes of multivariate OHLCV + technical indicators |
| Assets | BTC/USDT + ETH/USDT (cross-asset features) |
| Best Model | Logistic Regression on flattened windows |
| Dataset | 300K rows of 1-minute candles from WinkingFace CryptoLM datasets |
Performance
| Metric | Value |
|---|---|
| Test Accuracy | 53.1% |
| Test F1 | 0.574 |
| Test AUC | 0.540 |
Note: Predicting 15-minute crypto price direction is extremely hard because markets are close to efficient at short timeframes. On the held-out test set the model scores slightly above random chance (53.1% accuracy against a 50% baseline, AUC 0.540), which suggests a small but non-trivial signal. The pipeline is mainly valuable as a complete data engineering and feature extraction system for further research.
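For reference, the three metrics in the table can be computed from true labels and predicted probabilities roughly as follows. This is a NumPy-only sketch, not the evaluation script used for this card; the function names `average_ranks` and `direction_metrics` and the 0.5 decision threshold are assumptions made for illustration.

```python
import numpy as np

def average_ranks(scores):
    """Return 1-based ranks of `scores`; tied values share their mean rank."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    mean_rank = np.bincount(inverse, weights=ranks) / counts
    return mean_rank[inverse]

def direction_metrics(y_true, probs, threshold=0.5):
    """Accuracy, F1 (positive class = up) and ROC AUC for binary labels."""
    y_true = np.asarray(y_true).astype(int)
    probs = np.asarray(probs, dtype=float)
    preds = (probs >= threshold).astype(int)  # assumed threshold

    tp = int(np.sum((preds == 1) & (y_true == 1)))
    fp = int(np.sum((preds == 1) & (y_true == 0)))
    fn = int(np.sum((preds == 0) & (y_true == 1)))
    accuracy = float(np.mean(preds == y_true))
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom > 0 else 0.0

    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one sample of each class")
    # Rank-sum (Mann-Whitney U) form of the area under the ROC curve.
    rank_sum = average_ranks(probs)[y_true == 1].sum()
    auc = float((rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg))
    return {"accuracy": accuracy, "f1": f1, "auc": auc}
```

Comparing `accuracy` with the share of the majority class in the test labels, rather than with a flat 50%, gives a stricter baseline when up and down moves are not perfectly balanced.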
Data Sources
- WinkingFace/CryptoLM-Bitcoin-BTC-USDT - BTC 1-min OHLCV + 15 technical indicators
- WinkingFace/CryptoLM-Ethereum-ETH-USDT - ETH 1-min OHLCV + 15 technical indicators
Features (49 per timestep)
BTC & ETH (separately)
- Price: open, high, low, close
- Volume: volume
- Moving Averages: MA_20, MA_50, MA_200
- Momentum: RSI, %K, %D, ADX, ATR
- Trend: MACD, Signal, Histogram, Trendline
- Volatility: BL_Upper, BL_Lower, MN_Upper, MN_Lower
Cross-Asset Engineered
- eth_btc_ratio - ETH/BTC price ratio
- btc_ret_1m, eth_ret_1m - 1-minute returns
- btc_vol_ma20, eth_vol_ma20 - 20-period volume MA
- btc_range, eth_range - Normalized price range
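The seven cross-asset features listed above could be derived from a merged BTC/ETH frame along these lines. This is a minimal pandas sketch under stated assumptions: the input column names (`btc_close`, `eth_volume`, and so on) are hypothetical, and "normalized price range" is taken to mean `(high - low) / close`, which the card does not spell out.

```python
import numpy as np
import pandas as pd

def add_cross_asset_features(df):
    """Append the seven engineered columns to a merged 1-minute frame.

    Assumes hypothetical columns <asset>_close, <asset>_high, <asset>_low
    and <asset>_volume for asset in {"btc", "eth"}.
    """
    out = df.copy()
    out["eth_btc_ratio"] = out["eth_close"] / out["btc_close"]
    for asset in ("btc", "eth"):
        close = out[f"{asset}_close"]
        # 1-minute simple return; the first row has no predecessor and is NaN.
        out[f"{asset}_ret_1m"] = close / close.shift(1) - 1.0
        # 20-period volume moving average; the first 19 rows are NaN.
        out[f"{asset}_vol_ma20"] = out[f"{asset}_volume"].rolling(20).mean()
        # Assumed definition of the normalized range.
        out[f"{asset}_range"] = (out[f"{asset}_high"] - out[f"{asset}_low"]) / close
    return out
```

The NaN rows produced by the shift and the rolling window are the kind of values the cleaning step of the pipeline would need to drop.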
Pipeline
- Load & Merge BTC and ETH 1-minute datasets on timestamp
- Engineer Features - Add returns, ratios, ranges, volume MAs
- Create Windows - 60-minute lookback → predict next 15-minute direction
- Clean - Drop NaN/Inf, standardize per-feature
- Split - 70/15/15 temporal train/val/test (no data leakage)
- Train - Logistic Regression + Random Forest baselines
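The windowing and splitting steps above can be sketched as follows. This is an illustration rather than the repository's actual code: the helper names `make_windows` and `temporal_split` are hypothetical, and the label is assumed to compare the close 15 minutes after the window's last candle with the close of that last candle (ties count as down).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(features, close, lookback=60, horizon=15):
    """Build (n, lookback, n_features) windows and binary direction labels."""
    features = np.asarray(features, dtype=float)
    close = np.asarray(close, dtype=float)
    n = len(features) - lookback - horizon + 1
    if n <= 0:
        raise ValueError("not enough rows for the requested lookback and horizon")
    # sliding_window_view returns shape (T - lookback + 1, n_features, lookback).
    windows = sliding_window_view(features, lookback, axis=0)
    X = windows[:n].transpose(0, 2, 1)  # -> (n, lookback, n_features)
    ends = np.arange(n) + lookback - 1  # index of the last candle in each window
    y = (close[ends + horizon] > close[ends]).astype(int)  # assumed label rule
    return X, y

def temporal_split(X, y, train_frac=0.70, val_frac=0.15):
    """Split chronologically: earliest rows train, latest rows test."""
    n = len(X)
    i_train = int(n * train_frac)
    i_val = i_train + int(n * val_frac)
    return ((X[:i_train], y[:i_train]),
            (X[i_train:i_val], y[i_train:i_val]),
            (X[i_val:], y[i_val:]))
```

Because consecutive windows overlap, samples on either side of a split boundary share candles; leaving a gap of `lookback + horizon` rows between splits is one way to rule that out, though whether this pipeline does so is not stated.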
Usage
import pickle
import numpy as np
# Load model
with open("model.pkl", "rb") as f:
    model = pickle.load(f)
# Load preprocessing artifacts
mean = np.load("feature_mean.npy")
std = np.load("feature_std.npy")
valid = np.load("valid_cols.npy")
# X must be prepared by the caller; shape: (samples, 60 minutes, 49 features)
X_flat = X.reshape(X.shape[0], -1) # flatten to 2940 features
X_flat = X_flat[:, valid] # keep valid columns
X_norm = (X_flat - mean) / std # standardize
# Predict
preds = model.predict(X_norm) # 0=down, 1=up
probs = model.predict_proba(X_norm)[:, 1] # probability of up
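Shape mismatches in the steps above tend to surface as opaque broadcasting errors, so it can help to wrap the preprocessing in a function that checks its inputs first. The sketch below is one possible wrapper; the name `preprocess` is hypothetical, and it assumes `mean` and `std` have one entry per kept column (which the broadcasting in the snippet above implies, but the artifact shapes are not documented here).

```python
import numpy as np

def preprocess(X, mean, std, valid, n_steps=60, n_features=49):
    """Flatten, mask and standardize windows, validating shapes on the way."""
    X = np.asarray(X, dtype=float)
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    valid = np.asarray(valid, dtype=bool)

    if X.ndim != 3 or X.shape[1:] != (n_steps, n_features):
        raise ValueError(f"expected X of shape (n, {n_steps}, {n_features}), got {X.shape}")
    if valid.shape != (n_steps * n_features,):
        raise ValueError(f"expected mask of length {n_steps * n_features}, got {valid.shape}")
    n_kept = int(valid.sum())
    if mean.shape != (n_kept,) or std.shape != (n_kept,):
        raise ValueError(f"expected mean/std of length {n_kept}")

    X_flat = X.reshape(X.shape[0], -1)[:, valid]  # flatten, then keep valid columns
    X_norm = (X_flat - mean) / std                # standardize
    if not np.isfinite(X_norm).all():
        raise ValueError("non-finite values after standardization")
    return X_norm
```

With the artifacts loaded as shown above, `model.predict_proba(preprocess(X, mean, std, valid))[:, 1]` would then give the probability of an up move.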
Files
| File | Description |
|---|---|
| model.pkl | Trained LogisticRegression classifier |
| feature_mean.npy | Per-feature means for standardization |
| feature_std.npy | Per-feature standard deviations |
| valid_cols.npy | Boolean mask of valid (finite) feature columns |
| metrics.json | Evaluation results |
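To make the role of the three `.npy` artifacts concrete, here is a sketch of how equivalent arrays could be fitted from flattened training windows. It is an assumption about the procedure, not a record of it: the function name `fit_preprocessing` is hypothetical, a column is treated as valid only if every training value in it is finite, and zero standard deviations are replaced by 1.0 to avoid division by zero.

```python
import numpy as np

def fit_preprocessing(X_train_flat):
    """Derive (mean, std, valid) from a 2-D array of flattened training windows."""
    X = np.asarray(X_train_flat, dtype=float)
    # Assumed rule: keep a column only if all of its training values are finite.
    valid = np.isfinite(X).all(axis=0)
    kept = X[:, valid]
    mean = kept.mean(axis=0)
    std = kept.std(axis=0)
    # Assumed guard: leave constant columns unscaled instead of dividing by zero.
    std = np.where(std == 0, 1.0, std)
    return mean, std, valid
```

Fitting these statistics on the training split only, then saving them with `np.save`, keeps validation and test data out of the normalization constants.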
Limitations
- Market Efficiency: 15-min prediction is near-random walk; edge is small
- No Costs: Evaluation ignores fees, slippage, spread
- Historical Data: Trained on 2017-2020 data; may not generalize to current regimes
- Simple Models: Deep learning (Conv-LSTM, TCN, Transformer) may improve results
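A back-of-the-envelope calculation shows why a small edge and ignored costs matter together. The sketch below assumes a deliberately simplified model in which every trade wins or loses the same absolute return `avg_move` and pays a fixed `round_trip_cost`; both function names and the example numbers (a 0.2% move and a 0.1% cost) are hypothetical, and only the 53.1% accuracy comes from this card.

```python
def expected_net_return(accuracy, avg_move, round_trip_cost):
    """Expected return per trade when wins and losses have equal magnitude."""
    # Win with probability p, lose with probability 1 - p: gross edge is (2p - 1) * move.
    return (2.0 * accuracy - 1.0) * avg_move - round_trip_cost

def break_even_accuracy(avg_move, round_trip_cost):
    """Accuracy at which the expected net return per trade is exactly zero."""
    return 0.5 + round_trip_cost / (2.0 * avg_move)

# Hypothetical inputs: 0.2% typical 15-minute move, 0.1% round-trip cost.
net = expected_net_return(0.531, avg_move=0.002, round_trip_cost=0.001)
needed = break_even_accuracy(avg_move=0.002, round_trip_cost=0.001)
```

Under these illustrative inputs the required accuracy is well above the reported 53.1%, so the expected net return per trade is negative; real moves are not symmetric, so this is only an order-of-magnitude check.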
Future Work
- Deep Learning: Conv-LSTM, Temporal CNN, or Transformer architectures
- More Data: Order book, funding rates, on-chain metrics, sentiment
- Multi-Scale: Combine 1-min, 5-min, 15-min, 1-hour features
- Regime Detection: Train separate models for bull/bear/sideways markets
- Cost-Aware Evaluation: Incorporate transaction costs in metric
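As a starting point for the multi-scale item above, coarser candles can be aggregated from the existing 1-minute data instead of being sourced separately. This is a minimal pandas sketch; the function name `resample_ohlcv` is hypothetical, and it assumes a frame indexed by a `DatetimeIndex` with lowercase `open`, `high`, `low`, `close` and `volume` columns.

```python
import numpy as np
import pandas as pd

# How each OHLCV column aggregates when several 1-minute candles are merged.
OHLCV_AGG = {"open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"}

def resample_ohlcv(df_1m, rule):
    """Aggregate 1-minute candles into bars of width `rule` (e.g. "5min", "1h")."""
    bars = df_1m.resample(rule, label="left", closed="left").agg(OHLCV_AGG)
    # Bins with no candles come out as NaN prices; drop them.
    return bars.dropna()
```

Bars are labeled by their start time, so a 5-minute bar only becomes known once its last minute has closed; joining these bars back onto 1-minute rows would need to respect that to avoid lookahead.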
License
MIT License
Generated by ML Intern
This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.
- Try ML Intern: https://smolagents-ml-intern.hf.space
- Source code: https://github.com/huggingface/ml-intern