Crypto 15-Minute Direction Classifier

A time-series classification model that predicts whether Bitcoin (BTC/USDT) price will move up or down over the next 15-minute interval using multivariate historical market data.

Model Overview

Attribute	Value
Task	Binary time-series classification
Target	BTC price direction in next 15 minutes (up=1, down=0)
Input	60 minutes of multivariate OHLCV + technical indicators
Assets	BTC/USDT + ETH/USDT (cross-asset features)
Best Model	Logistic Regression on flattened windows
Dataset	300K rows of 1-minute candles from WinkingFace CryptoLM datasets

Performance

Metric	Value
Test Accuracy	53.1%
Test F1	0.574
Test AUC	0.540

Note: Predicting 15-minute crypto price direction is extremely hard because markets are close to efficient at short timeframes. On the held-out test set the model scores slightly above random chance (53.1% accuracy against 50%, and 0.540 AUC against 0.5), which points to a small but non-trivial signal. The pipeline's main value is as a complete data engineering and feature extraction system for further research.
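For readers who want to recompute the three reported metrics from saved predictions, the following is a minimal NumPy sketch. It is not the evaluation code used for this model card: the function names `roc_auc` and `evaluate` are hypothetical, the 0.5 decision threshold is an assumption, and AUC is computed with the rank-based (Mann-Whitney) formulation. In practice `sklearn.metrics` (`accuracy_score`, `f1_score`, `roc_auc_score`) would normally be used instead.

```python
import numpy as np

def roc_auc(y_true, scores):
    """Rank-based AUC: probability that a random 'up' sample outscores a random 'down' one."""
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    # 1-based ranks of the scores; tied scores share their average rank.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    avg_rank = np.cumsum(counts) - (counts - 1) / 2.0
    ranks = avg_rank[inverse]
    n_pos = y_true.sum()
    n_neg = (~y_true).sum()
    return (ranks[y_true].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

def evaluate(y_true, probs, threshold=0.5):
    """Accuracy, F1 (for the 'up' class), and AUC from predicted probabilities of 'up'."""
    y_true = np.asarray(y_true).astype(int)
    preds = (np.asarray(probs, dtype=float) >= threshold).astype(int)
    tp = int(((preds == 1) & (y_true == 1)).sum())
    fp = int(((preds == 1) & (y_true == 0)).sum())
    fn = int(((preds == 0) & (y_true == 1)).sum())
    denom = 2 * tp + fp + fn
    return {
        "accuracy": float((preds == y_true).mean()),
        "f1": 2 * tp / denom if denom else 0.0,
        "auc": float(roc_auc(y_true, probs)),
    }
```

Calling `evaluate(y_test, probs)` with the test labels and the output of `model.predict_proba(...)[:, 1]` returns a dictionary with the three metrics.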

Data Sources

WinkingFace/CryptoLM-Bitcoin-BTC-USDT - BTC 1-min OHLCV + 15 technical indicators
WinkingFace/CryptoLM-Ethereum-ETH-USDT - ETH 1-min OHLCV + 15 technical indicators

Features (49 per timestep)

BTC & ETH (separately)

Price: open, high, low, close
Volume: volume
Moving Averages: MA_20, MA_50, MA_200
Momentum: RSI, %K, %D, ADX, ATR
Trend: MACD, Signal, Histogram, Trendline
Volatility: BL_Upper, BL_Lower, MN_Upper, MN_Lower

Cross-Asset Engineered

eth_btc_ratio - ETH/BTC price ratio
btc_ret_1m, eth_ret_1m - 1-minute returns
btc_vol_ma20, eth_vol_ma20 - 20-period volume MA
btc_range, eth_range - Normalized price range
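The seven engineered features above could be computed along the following lines. This is a sketch under stated assumptions rather than the repository's actual code: the helper names are hypothetical, returns are taken as simple one-step returns on the close, the volume MA is a trailing mean, and "normalized price range" is assumed to mean (high - low) / close. The inputs are assumed to be already aligned on timestamp.

```python
import numpy as np

def trailing_mean(x, window):
    """Trailing moving average; the first window-1 entries are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    csum = np.cumsum(np.insert(x, 0, 0.0))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

def simple_return(close):
    """One-step simple return close[t] / close[t-1] - 1; the first entry is NaN."""
    close = np.asarray(close, dtype=float)
    out = np.full(close.shape, np.nan)
    out[1:] = close[1:] / close[:-1] - 1.0
    return out

def cross_asset_features(btc, eth):
    """btc and eth: dicts of equal-length 1-D sequences keyed by
    'high', 'low', 'close', 'volume' (hypothetical input layout)."""
    btc_close = np.asarray(btc["close"], dtype=float)
    eth_close = np.asarray(eth["close"], dtype=float)
    feats = {"eth_btc_ratio": eth_close / btc_close}
    for name, asset in (("btc", btc), ("eth", eth)):
        close = np.asarray(asset["close"], dtype=float)
        high = np.asarray(asset["high"], dtype=float)
        low = np.asarray(asset["low"], dtype=float)
        feats[f"{name}_ret_1m"] = simple_return(close)
        feats[f"{name}_vol_ma20"] = trailing_mean(asset["volume"], 20)
        feats[f"{name}_range"] = (high - low) / close  # assumed normalization
    return feats
```

The leading NaN values produced by the return and moving-average columns are the kind of rows the cleaning step would drop.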

Pipeline

Load & Merge - Join the BTC and ETH 1-minute datasets on timestamp
Engineer Features - Add returns, ratios, ranges, volume MAs
Create Windows - 60-minute lookback → predict next 15-minute direction
Clean - Drop NaN/Inf, standardize per-feature
Split - 70/15/15 temporal train/val/test (no data leakage)
Train - Logistic Regression + Random Forest baselines
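The windowing, split, and standardization steps above can be illustrated with a short NumPy sketch. The function names are hypothetical, and two details are assumptions not stated in the source: the label is taken as 1 when the close 15 minutes after the end of the window is strictly higher than the close at the end of the window (ties count as down), and the standardization statistics are fitted on the training portion only, which is one common way to keep the temporal split free of leakage.

```python
import numpy as np

def make_windows(features, close, lookback=60, horizon=15):
    """features: (T, F) array; close: (T,) array of BTC closes.
    Returns X of shape (N, lookback, F) and binary labels y of shape (N,)."""
    features = np.asarray(features, dtype=float)
    close = np.asarray(close, dtype=float)
    # Index of the last row of each window; the label looks `horizon` rows ahead.
    ends = np.arange(lookback - 1, len(close) - horizon)
    X = np.stack([features[e - lookback + 1 : e + 1] for e in ends])
    y = (close[ends + horizon] > close[ends]).astype(np.int64)
    return X, y

def temporal_split(n, train=0.70, val=0.15):
    """Chronological train/val/test slices over n ordered samples (no shuffling)."""
    i = round(n * train)
    j = round(n * (train + val))
    return slice(0, i), slice(i, j), slice(j, n)

def fit_standardizer(X_train_flat):
    """Per-column mean and std from the training rows only."""
    mean = X_train_flat.mean(axis=0)
    std = X_train_flat.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return mean, std
```

A typical call sequence would be `X, y = make_windows(features, close)`, then `tr, va, te = temporal_split(len(X))`, then fitting `mean, std` on `X[tr].reshape(len(X[tr]), -1)` and applying them to all three splits. Because consecutive windows overlap, samples near a split boundary share rows with the neighboring split; leaving a gap of at least lookback + horizon samples between splits is one way to tighten this further.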

Usage

import pickle
import numpy as np

# Load model
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

# Load preprocessing artifacts
mean = np.load("feature_mean.npy")
std = np.load("feature_std.npy")
valid = np.load("valid_cols.npy")

# X is not defined in this snippet: supply your own NumPy array
# of shape (samples, 60 minutes, 49 features)
X_flat = X.reshape(X.shape[0], -1)      # flatten to 60 * 49 = 2940 features
X_flat = X_flat[:, valid]               # keep valid columns
X_norm = (X_flat - mean) / std          # standardize

# Predict
preds = model.predict(X_norm)            # 0=down, 1=up
probs = model.predict_proba(X_norm)[:, 1]  # probability of up

Files

File	Description
`model.pkl`	Trained LogisticRegression classifier
`feature_mean.npy`	Per-feature means for standardization
`feature_std.npy`	Per-feature standard deviations
`valid_cols.npy`	Boolean mask of valid (finite) feature columns
`metrics.json`	Evaluation results

Limitations

Market Efficiency: 15-minute price moves are close to a random walk, so the edge is small
No Costs: Evaluation ignores fees, slippage, spread
Historical Data: Trained on 2017-2020 data; may not generalize to current regimes
Simple Models: Deep learning (Conv-LSTM, TCN, Transformer) may improve results

Future Work

Deep Learning: Conv-LSTM, Temporal CNN, or Transformer architectures
More Data: Order book, funding rates, on-chain metrics, sentiment
Multi-Scale: Combine 1-min, 5-min, 15-min, 1-hour features
Regime Detection: Train separate models for bull/bear/sideways markets
Cost-Aware Evaluation: Incorporate transaction costs into the evaluation metrics
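As an illustration of what a cost-aware evaluation might look like, the sketch below converts predicted probabilities into per-trade net returns. Everything in it is an assumption made for illustration: the function name `net_returns`, the long/short rule around a 0.5 threshold, the flat fee of 0.1% per side, and the simplification that costs are subtracted additively from the gross return. Slippage and spread are not modeled.

```python
import numpy as np

def net_returns(probs, fwd_returns, fee_per_side=0.001, threshold=0.5):
    """Per-trade net return of a hypothetical long/short rule.

    probs:       predicted probability of 'up' for each sample
    fwd_returns: realized 15-minute simple return for each sample
    """
    probs = np.asarray(probs, dtype=float)
    fwd_returns = np.asarray(fwd_returns, dtype=float)
    # Go long when 'up' is predicted, short otherwise.
    position = np.where(probs > threshold, 1.0, -1.0)
    gross = position * fwd_returns
    # Each trade is opened and closed, so the fee is paid twice.
    return gross - 2.0 * fee_per_side
```

The mean of `net_returns(probs, fwd_returns)` over the test set gives an average per-trade return after fees, which can be compared with the same quantity at `fee_per_side=0` to see how much of the edge survives costs.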

License

MIT License

Generated by ML Intern

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.

Try ML Intern: https://smolagents-ml-intern.hf.space
Source code: https://github.com/huggingface/ml-intern
