PitchPredict xLSTM

A 20-million-parameter xLSTM trained on nearly a decade of MLB pitch data to predict pitch sequences: not just pitch type, but also speed, spin, trajectory, plate location, and result.

Repository: baseball-analytica/pitchpredict

Overview

Each pitch is encoded as 16 tokens covering type, speed, spin rate, spin axis, release point, initial velocity, acceleration, plate position, and result. The model processes sequences of these tokens alongside 27 context variables (pitcher/batter IDs, count, outs, bases, score, inning, etc.) to predict what comes next.

The model uses pitcher sessions (all pitches from when a pitcher enters to when they leave) as its sequence unit, giving it access to cross-at-bat patterns in a pitcher's outing.
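As an illustration of the session unit described above, here is a minimal sketch of grouping a chronological pitch log into pitcher sessions. The field names (`game_id`, `pitcher_id`, `pitch_type`) and the rule "one session per pitcher per game" are assumptions made for this example, not the project's actual preprocessing.

```python
from collections import defaultdict


def group_into_sessions(pitches):
    """Group chronologically ordered pitch records into pitcher sessions.

    Assumption: one session = all pitches thrown by one pitcher in one game,
    keyed by the hypothetical fields `game_id` and `pitcher_id`.
    """
    sessions = defaultdict(list)
    for pitch in pitches:
        sessions[(pitch["game_id"], pitch["pitcher_id"])].append(pitch)
    return dict(sessions)


# Toy log: pitcher A works two innings, with pitcher B's half-inning in between.
pitches = [
    {"game_id": 1, "pitcher_id": "A", "pitch_type": "FF"},
    {"game_id": 1, "pitcher_id": "A", "pitch_type": "SL"},
    {"game_id": 1, "pitcher_id": "B", "pitch_type": "CH"},
    {"game_id": 1, "pitcher_id": "A", "pitch_type": "FF"},
]
sessions = group_into_sessions(pitches)
session_a = [p["pitch_type"] for p in sessions[(1, "A")]]
```

Keying on the pitcher rather than on the at-bat keeps pitches from different innings in one sequence, which is what gives a model access to cross-at-bat patterns.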

Performance

Metric                     Value
Test Loss                  0.8631
Top-1 Accuracy             65.81%
Top-5 Accuracy             97.90%
ECE (Calibration Error)    0.013

Top-1 accuracy is 37.3 percentage points above the best baseline, which always predicts the most frequent token. The model is also well calibrated: when it reports 80% confidence, it is right about 80% of the time.
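For readers unfamiliar with the calibration metric, the following is a minimal sketch of the standard binned expected calibration error (ECE). The model card does not state how its ECE was computed, so the bin count of 10 and the equal-width binning are assumptions.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """Binned ECE: weighted mean of |accuracy - mean confidence| over bins."""
    n = len(confidences)
    bins = [[] for _ in range(num_bins)]
    for conf, hit in zip(confidences, correct):
        # Equal-width bins on [0, 1]; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(conf * num_bins), num_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Calibrated toy case: 80% confidence, 4 of 5 predictions correct.
ece_calibrated = expected_calibration_error([0.8] * 5, [1, 1, 1, 1, 0])

# Overconfident toy case: 90% confidence, only 2 of 4 predictions correct.
ece_overconfident = expected_calibration_error([0.9] * 4, [1, 0, 1, 0])
```

In the first toy case confidence and accuracy agree, so the error is zero up to floating-point rounding; in the second the gap between 0.9 confidence and 0.5 accuracy gives an error of about 0.4.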

Accuracy varies by what is being predicted: forward velocity reaches 95%, pitch type sits at 53%, and plate location bottoms out at 23%. This spread suggests the model has learned which aspects of pitching are mechanical, which are strategic, and which are irreducibly noisy.
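A per-attribute breakdown like the one above could be computed along the following lines. This is a sketch under the assumption that every pitch occupies a fixed-length run of tokens in a fixed order, so that a token's index modulo the tokens-per-pitch count identifies the attribute slot; the actual evaluation code may differ.

```python
TOKENS_PER_PITCH = 16  # from the model card


def accuracy_by_slot(predicted, actual, tokens_per_pitch=TOKENS_PER_PITCH):
    """Top-1 accuracy for each token slot within a pitch.

    Assumption: the sequence is a flat concatenation of fixed-length pitch
    encodings, so `index % tokens_per_pitch` identifies the slot.
    """
    hits = [0] * tokens_per_pitch
    totals = [0] * tokens_per_pitch
    for i, (p, a) in enumerate(zip(predicted, actual)):
        slot = i % tokens_per_pitch
        totals[slot] += 1
        hits[slot] += int(p == a)
    # Slots that never occur are reported as None rather than 0.
    return [h / t if t else None for h, t in zip(hits, totals)]


# Toy example with 2 slots per pitch and 4 pitches.
predicted = [1, 5, 1, 6, 1, 7, 2, 8]
actual = [1, 5, 1, 9, 1, 9, 1, 8]
per_slot = accuracy_by_slot(predicted, actual, tokens_per_pitch=2)
```

In the toy data, slot 0 is right on 3 of 4 pitches and slot 1 on 2 of 4.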

Architecture

Parameter           Value
d_model             384
num_blocks          12
num_heads           8
vocab_size          258
seq_len             512
Total Parameters    ~20M

The architecture is a custom xLSTM with a context adapter that fuses player embeddings, game state embeddings, and continuous features with the token sequence. See the repo for full implementation details.
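A few quantities follow directly from the hyperparameters above. The sketch below works them out, assuming that `d_model` is split evenly across heads, that the token embedding table is `vocab_size` x `d_model`, and that a window holds only 16-token pitch encodings with no special tokens. None of these assumptions is confirmed by the model card.

```python
config = {
    "d_model": 384,
    "num_blocks": 12,
    "num_heads": 8,
    "vocab_size": 258,
    "seq_len": 512,
}
TOKENS_PER_PITCH = 16          # from the Overview
TOTAL_PARAMS = 20_000_000      # approximate, from the table

# Per-head width, assuming an even split of d_model across heads.
head_dim = config["d_model"] // config["num_heads"]

# Complete pitches that fit in one context window, assuming no special tokens.
pitches_per_window = config["seq_len"] // TOKENS_PER_PITCH

# Size of a vocab_size x d_model token embedding table and its share of the total.
embedding_params = config["vocab_size"] * config["d_model"]
embedding_share = embedding_params / TOTAL_PARAMS
```

Under these assumptions each head is 48 wide, a full window covers 32 pitches, and the token embedding accounts for 99,072 parameters, roughly 0.5% of the total, so nearly all capacity sits in the xLSTM blocks and the context adapter.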

Training

The model was trained on 113.9M tokens across 6.8M pitches from ~500K pitcher sessions (April 2016 to October 2025, via Statcast). Training ran on six RTX 4090 GPUs with DDP and BF16 mixed precision. This checkpoint is from step 73,000, selected by minimum validation loss.
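The reported totals imply some useful averages. The short calculation below uses only the rounded figures from the paragraph above, so the results are approximate.

```python
total_tokens = 113.9e6   # tokens in the training set
total_pitches = 6.8e6    # pitches in the training set
total_sessions = 500e3   # pitcher sessions (reported as ~500K)

tokens_per_pitch = total_tokens / total_pitches        # about 16.75
pitches_per_session = total_pitches / total_sessions   # about 13.6
tokens_per_session = total_tokens / total_sessions     # about 227.8
```

The average session is therefore about 228 tokens, well under the 512-token sequence length. The tokens-per-pitch ratio comes out slightly above the 16 tokens stated in the Overview; the card does not say whether the difference comes from rounding of the reported totals or from additional tokens in each sequence.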

Usage

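Install the package:

pip install pitchpredict

Then call the API from a coroutine, since predict_pitcher is awaited:

import asyncio
from pitchpredict import PitchPredictAPI

async def main(request):
    # Build `request` as described in the pitchpredict documentation.
    api = PitchPredictAPI()
    return await api.predict_pitcher(request)

# result = asyncio.run(main(request))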

The checkpoint is downloaded automatically from this repo on first use. See the pitchpredict documentation for API details.

Files

  • model.safetensors: Model weights (safetensors format, fp32)
  • config.json: Model hyperparameters

Authors

Part of the baseball-analytica project.
