🌟 Supernova Peak Predictor

The first model to predict when a supernova will peak and how bright it will become from its earliest ZTF alert observations, with uncertainty estimates.

Given a single ZTF alert packet from a rising supernova, this model predicts:

  • days_to_peak: how many days until the supernova reaches maximum brightness
  • peakmag: the peak apparent magnitude it will achieve
  • 80% prediction intervals: uncertainty bounds via quantile regression

This enables astronomers to answer: "Should I point the telescope at this target tonight, or can it wait?"

🔭 Try it live: Interactive Demo

Why this matters

The bottleneck in transient astronomy isn't detection: ZTF finds thousands of candidates per night. The bottleneck is follow-up telescope time. Spectroscopic observations are expensive and limited. Every night, astronomers must decide which of dozens of candidates to prioritize.

Currently, that decision is reactive: "Is this a supernova? Yes → follow up." (BTSbot solves this with 98.5% accuracy.)

Our model makes it proactive: "This SN will peak at mag 17.8 in 5±3 days → schedule it now" vs "This one won't peak for 3 months → deprioritize."
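As a rough illustration of how such predictions could feed a scheduling decision, the sketch below maps a predicted days-to-peak interval and peak magnitude onto a priority label. The function name `triage`, the 7-day and 30-day cutoffs, and the magnitude limit of 19.0 are hypothetical choices for illustration and are not part of this model.

```python
def triage(days_lo, days_hi, peak_mag, mag_limit=19.0):
    """Map a predicted peak window and brightness to a follow-up priority.

    days_lo, days_hi: bounds of the predicted days-to-peak interval.
    peak_mag: predicted peak apparent magnitude (smaller means brighter).
    mag_limit: hypothetical faint limit of the follow-up instrument.
    """
    if days_lo > days_hi:
        # Independently trained quantile models can cross; put the bounds in order.
        days_lo, days_hi = days_hi, days_lo
    if peak_mag > mag_limit:
        return "skip"            # predicted to stay too faint for this instrument
    if days_lo <= 7:
        return "schedule_now"    # the peak could arrive within a week
    if days_lo <= 30:
        return "schedule_soon"   # the peak could arrive within a month
    return "deprioritize"        # even the early bound is over a month away


# The two cases quoted in the text: 5 +/- 3 days at mag 17.8, and roughly 3 months out.
print(triage(2, 8, 17.8))
print(triage(80, 100, 17.8))
```

A real scheduler would also weigh instrument availability and the width of the interval; the point here is only that both outputs of the model enter the decision.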

With the Vera C. Rubin Observatory (LSST) coming online, alert rates will jump from ~100K/night to ~10M/night. Automated triage like this will be essential.

Performance

Evaluated with 5-fold grouped cross-validation (grouped by supernova object ID to prevent data leakage: no alerts from the same SN appear in both train and validation).
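The grouping constraint can be illustrated without any dependencies. The sketch below assigns every alert of a given object to the same fold, which is the property scikit-learn's `GroupKFold` guarantees. It is a simplified stand-in (round-robin over sorted object IDs, with made-up IDs and no balancing of fold sizes), not the training code used for this model.

```python
def grouped_folds(object_ids, n_folds=5):
    """Return a fold index per alert so that each object stays in one fold."""
    unique_ids = sorted(set(object_ids))
    # Round-robin assignment of objects (not alerts) to folds.
    fold_of = {oid: i % n_folds for i, oid in enumerate(unique_ids)}
    return [fold_of[oid] for oid in object_ids]


# Hypothetical object IDs: several alerts per supernova.
alert_object_ids = ["SN_a", "SN_a", "SN_b", "SN_c", "SN_c", "SN_c", "SN_d"]
folds = grouped_folds(alert_object_ids, n_folds=2)

for k in range(2):
    train = {oid for oid, f in zip(alert_object_ids, folds) if f != k}
    valid = {oid for oid, f in zip(alert_object_ids, folds) if f == k}
    assert train.isdisjoint(valid)  # no supernova appears on both sides
```

Splitting at the alert level instead would let the model see later alerts of a supernova it is being validated on, which would make the reported errors look better than they are.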

Overall (27,202 alerts from 3,806 supernovae)

| Target | MAE | Median AE | P90 AE |
|---|---|---|---|
| days_to_peak | 118.8 days | 29.8 days | 327.2 days |
| peakmag | 0.257 mag | 0.178 mag | 0.536 mag |
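For reference, the three summary columns can be computed from absolute errors as sketched below. The nearest-rank definition of the 90th percentile is an assumption, since the convention used for the reported numbers is not stated, and the sample values are made up.

```python
import math
import statistics


def error_summary(y_true, y_pred):
    """Return MAE, median absolute error, and 90th-percentile absolute error."""
    abs_err = sorted(abs(t - p) for t, p in zip(y_true, y_pred))
    # Nearest-rank percentile: smallest error with at least 90% of errors at or below it.
    rank = math.ceil(0.9 * len(abs_err))
    return {
        "mae": statistics.fmean(abs_err),
        "median_ae": statistics.median(abs_err),
        "p90": abs_err[rank - 1],
    }


# Made-up days-to-peak values for illustration only.
summary = error_summary([10, 20, 30, 40, 50], [11, 18, 30, 45, 40])
print(summary)
```

A mean far above the median, as in the days_to_peak row, indicates that a minority of alerts with very large errors dominates the MAE.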

By number of prior detections

| Detection stage | n | MAE (days) | Median AE (days) | MAE (mag) | Median AE (mag) |
|---|---|---|---|---|---|
| 1-3 (first catches) | 4,817 | 70.8 | 16.5 | 0.391 | 0.304 |
| 4-10 (early rise) | 9,683 | 114.4 | 26.7 | 0.273 | 0.197 |
| 11-50 (sampled rise) | 11,029 | 132.4 | 36.3 | 0.199 | 0.147 |
| 50+ (monitored) | 1,673 | 192.0 | 77.6 | 0.164 | 0.117 |

Key finding: The model is most useful on the hardest, most valuable cases, the first 1-3 detections, where median timing error is lowest at 16.5 days. Median magnitude error at this stage is 0.304 mag, the largest of any stage.

By true time-to-peak

| Horizon | n | MAE (days) | Median AE (days) | MAE (mag) | Median AE (mag) |
|---|---|---|---|---|---|
| Imminent (<7d) | 12,694 | 66.2 | 24.9 | 0.330 | 0.229 |
| Soon (7-30d) | 10,431 | 62.9 | 20.7 | 0.183 | 0.150 |
| Weeks (30-100d) | 1,086 | 75.5 | 30.2 | 0.211 | 0.148 |
| Distant (100d+) | 2,991 | 552.5 | 433.0 | 0.229 | 0.164 |

The "soon" horizon (7-30 days) is the sweet spot: exactly the window where scheduling decisions matter most, and where the model achieves 0.150 mag median error.
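To put the magnitude errors in physical terms, a magnitude difference Δm corresponds to a flux ratio of 10^(0.4 Δm) by the standard definition of the magnitude scale. The helper below (its name is illustrative) applies that conversion to error figures quoted in the tables above.

```python
def flux_ratio(delta_mag):
    """Flux ratio corresponding to a magnitude difference (Pogson relation)."""
    return 10 ** (0.4 * delta_mag)


# Median and mean magnitude errors taken from the tables above.
for err in (0.150, 0.257, 0.304):
    print(f"{err:.3f} mag -> factor {flux_ratio(err):.3f} in flux")
```

By this relation a 0.150 mag error is roughly a 15% error in peak flux, while a difference of 2.5 mag is a factor of 10.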

Uncertainty Quantification

Quantile regression models (10th, 50th, 90th percentile) provide 80% prediction intervals:

| Target | Coverage | Median interval width |
|---|---|---|
| days_to_peak | 71.0% | 40.9 days |
| peakmag | 69.4% | 0.49 mag |
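The two columns above can be computed as sketched below. This is a minimal example on made-up numbers; `interval_stats` is an illustrative name, and counting values that fall exactly on a bound as covered is an assumption.

```python
import statistics


def interval_stats(y_true, lower, upper):
    """Empirical coverage and median width of prediction intervals."""
    inside = [lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper)]
    widths = [hi - lo for lo, hi in zip(lower, upper)]
    return sum(inside) / len(inside), statistics.median(widths)


# Made-up values: five true days-to-peak with their q10/q90 predictions.
y = [5, 12, 30, 8, 60]
q10 = [2, 10, 35, 1, 20]
q90 = [9, 25, 70, 6, 90]
coverage, width = interval_stats(y, q10, q90)
print(f"coverage={coverage:.0%}, median width={width} days")
```

For a well-calibrated 80% interval the coverage should be close to 0.80; values near 0.70, as reported here, mean the intervals are too narrow on average.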

Coverage by detection stage:

| Stage | Days coverage | Days width | Mag coverage | Mag width |
|---|---|---|---|---|
| 1-3 detections | 71.2% | 42.8 days | 69.9% | 0.48 mag |
| 4-10 detections | 71.1% | 40.3 days | 69.8% | 0.50 mag |
| 11-50 detections | 70.5% | 41.4 days | 69.0% | 0.48 mag |
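Quantile regression models of this kind are typically fit by minimizing the pinball loss, which penalizes under- and over-prediction asymmetrically. The sketch below implements that loss and lists one plausible LightGBM parameter set per quantile. `objective="quantile"` and `alpha` are LightGBM parameters, but whether these models were trained with exactly this configuration is an assumption, and the learning rate is illustrative.

```python
def pinball_loss(y_true, y_pred, alpha):
    """Mean pinball (quantile) loss for quantile level alpha in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by alpha, over-prediction by (1 - alpha).
        total += alpha * diff if diff >= 0 else (alpha - 1) * diff
    return total / len(y_true)


# One parameter set per quantile model (values other than alpha are assumptions).
quantile_params = {
    name: {"objective": "quantile", "alpha": alpha, "learning_rate": 0.05}
    for name, alpha in (("q10", 0.1), ("q50", 0.5), ("q90", 0.9))
}

# For the 90th percentile, predicting too low costs more than predicting too high.
print(pinball_loss([10], [8], 0.9), pinball_loss([10], [12], 0.9))
```

Because each quantile is fit independently, the 10th-percentile prediction can occasionally exceed the 90th; downstream code may want to sort the two bounds.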

Architecture

LightGBM gradient boosted trees on 49 engineered features extracted from ZTF alert metadata.

No images are used. We tested a ConvNeXt-pico CNN on the 63×63 difference image triplets and found that metadata alone outperforms the full multimodal model (MAE mag 0.257 vs 0.271). The images add noise at this resolution. This is itself a useful finding: it means the predictor can run at alert-stream speed (microseconds per prediction, no GPU needed).

Top features (by LightGBM importance)

For days_to_peak: ncovhist (coverage history), distpsnr1 (distance to nearest PS1 source), distpsnr2, neargaia, maxmag_so_far

For peakmag: maggaia (Gaia magnitude), peakmag_so_far (brightest seen), sgscore1 (star/galaxy score), maxmag_so_far, ndethist

The host galaxy properties (PS1 colors, star/galaxy scores, distances) dominate the timing prediction. The model appears to be learning that where a supernova lives (host type, distance, environment) constrains how it evolves.
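Two of the top features listed above, peakmag_so_far and maxmag_so_far, summarize the light curve seen up to the current alert. The sketch below shows one way such running features could be derived. Treating maxmag_so_far as the faintest magnitude seen is an assumption, as is the helper name; features.py in this repo is the authoritative implementation.

```python
def running_mag_features(mags):
    """For each detection, summarize the magnitudes seen up to and including it."""
    rows = []
    brightest = float("inf")   # smaller magnitude means brighter
    faintest = float("-inf")
    for n, mag in enumerate(mags, start=1):
        brightest = min(brightest, mag)
        faintest = max(faintest, mag)
        rows.append({
            "n_detections": n,
            "peakmag_so_far": brightest,
            "maxmag_so_far": faintest,  # assumed meaning: faintest magnitude seen
        })
    return rows


# Made-up rising light curve (magnitudes decrease as the source brightens).
for row in running_mag_features([19.6, 19.2, 19.3, 18.8]):
    print(row)
```

Each row uses only detections available at that alert, so features built this way do not leak information from after the prediction time.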

Model files

| File | Description |
|---|---|
| model_days.pkl | Point estimate model for days_to_peak |
| model_mag.pkl | Point estimate model for peakmag |
| model_days_q10.pkl | 10th percentile quantile model (days) |
| model_days_q50.pkl | 50th percentile quantile model (days) |
| model_days_q90.pkl | 90th percentile quantile model (days) |
| model_mag_q10.pkl | 10th percentile quantile model (mag) |
| model_mag_q50.pkl | 50th percentile quantile model (mag) |
| model_mag_q90.pkl | 90th percentile quantile model (mag) |
| features.py | Feature engineering code |
| model_info.json | Feature columns, metrics, importance scores |
| quantile_results.json | Calibration results for quantile models |

What we tried that didn't work

| Approach | Result |
|---|---|
| ConvNeXt-pico CNN on 63×63 ZTF image triplets + metadata | 6-20% worse than metadata-only across all bins |
| Simple MLP (114K params) on 23 raw features | Competitive but 2-5% worse than tree models with engineered features |
| Including age / days_since_peak as input features | Creates direct data leakage (days_to_peak = age - days_since_peak) |
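A simple guard against the leakage described in the last row is to filter target-derived columns out of the feature list before training. The sketch below assumes the leaky columns are literally named `age` and `days_since_peak`; the helper name and the candidate column list are illustrative.

```python
LEAKY_COLUMNS = {"age", "days_since_peak"}  # columns named in the table above


def safe_feature_columns(columns, leaky=LEAKY_COLUMNS):
    """Drop columns that encode the target, preserving the original order."""
    return [c for c in columns if c not in leaky]


# Hypothetical candidate columns; only the non-leaky ones survive.
candidate_cols = ["magpsf", "sigmapsf", "age", "ndethist", "days_since_peak"]
print(safe_feature_columns(candidate_cols))
```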

Training data

  • Source: MultimodalUniverse/btsbot β€” ZTF Bright Transient Survey alerts
  • Filter: Rise-phase supernovae only (is_rise=True, is_SN=True)
  • Size: 27,202 alerts from 3,806 unique supernovae
  • Splits: 5-fold GroupKFold by object ID (no alert-level leakage)

Usage

Point estimates

import json
import pickle

import numpy as np
from huggingface_hub import hf_hub_download

from features import engineer_features  # download features.py from this repo first

REPO_ID = "hawthorneluke/supernova-peak-predictor"


def load_pickle(filename):
    # Download (or reuse the cached copy of) a model file and unpickle it.
    # Only unpickle files from sources you trust.
    with open(hf_hub_download(REPO_ID, filename), "rb") as f:
        return pickle.load(f)


# Download model files
model_days = load_pickle("model_days.pkl")
model_mag = load_pickle("model_mag.pkl")
with open(hf_hub_download(REPO_ID, "model_info.json")) as f:
    info = json.load(f)

# Your ZTF alert metadata (example)
alert = {
    "magpsf": 19.2,
    "sigmapsf": 0.15,
    "ndethist": 3,
    # ... remaining ZTF alert fields
}
feats = engineer_features(alert)

# Predict: build a single row with columns in the order used for training
X = np.array([[feats[c] for c in info["feature_cols"]]], dtype=np.float32)
days_pred = model_days.predict(X)[0]
mag_pred = model_mag.predict(X)[0]
print(f"Predicted: peak in {days_pred:.1f} days at magnitude {mag_pred:.2f}")

With uncertainty intervals

# Load the 10th- and 90th-percentile models, which bound the 80% interval
repo_id = "hawthorneluke/supernova-peak-predictor"
q = {}
for name in ("days_q10", "days_q90", "mag_q10", "mag_q90"):
    with open(hf_hub_download(repo_id, f"model_{name}.pkl"), "rb") as f:
        q[name] = pickle.load(f)

# 80% prediction intervals (for magnitudes, the lower bound is the brighter one)
days_lo, days_hi = q["days_q10"].predict(X)[0], q["days_q90"].predict(X)[0]
mag_lo, mag_hi = q["mag_q10"].predict(X)[0], q["mag_q90"].predict(X)[0]
print(f"Days to peak: {days_pred:.1f} [{days_lo:.1f}, {days_hi:.1f}]")
print(f"Peak magnitude: {mag_pred:.2f} [{mag_lo:.2f}, {mag_hi:.2f}]")

Interactive demo

Try it live: hawthorneluke/supernova-peak-predictor-demo

Limitations

  • Long-horizon predictions are poor. For SNe >100 days from peak, the MAE is 552.5 days. The model essentially can't predict these: they're rare, slow-evolving transients with ambiguous early signatures.
  • Prediction intervals under-cover. The nominal 80% prediction intervals achieve only ~70% coverage in cross-validation, about 10 percentage points short. A conformal calibration step would likely improve this.
  • ZTF-specific. Features are tied to ZTF alert schema. Adaptation to LSST/Rubin alerts would require feature remapping.
  • No spectroscopic type prediction. We predict timing and brightness but not SN type (Ia vs II vs Ibc). This would be a natural extension.
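One common way to implement the conformal calibration step mentioned above is conformalized quantile regression (CQR): measure how far held-out true values fall outside the predicted intervals, then widen every interval by a quantile of those misses. The sketch below is a minimal version under stated assumptions. It is not part of this repository, the function names are illustrative, and the calibration set should be held out by supernova object ID, in line with the grouped cross-validation.

```python
import math


def conformal_margin(y_cal, lo_cal, hi_cal, alpha=0.2):
    """Margin to add to both interval ends, estimated on a calibration set."""
    # Conformity score: distance by which the truth falls outside the interval
    # (negative when the truth lies inside it).
    scores = sorted(max(lo - y, y - hi) for y, lo, hi in zip(y_cal, lo_cal, hi_cal))
    n = len(scores)
    # Finite-sample-corrected rank targeting (1 - alpha) coverage.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")  # too few calibration points for this alpha
    return scores[rank - 1]


def calibrate(lo, hi, margin):
    """Widen a predicted interval (or shrink it if the margin is negative)."""
    return lo - margin, hi + margin


# Made-up calibration data: ten intervals of [0, 10], four of which miss the truth.
y_cal = [5, 5, 5, 5, 5, 5, 11, 12, 13, 14]
margin = conformal_margin(y_cal, [0] * 10, [10] * 10, alpha=0.2)
print(margin, calibrate(0, 10, margin))
```

The margin is a single constant per target, so this corrects average coverage but does not adapt the correction to detection stage or horizon.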

Citation

If you use this model, please cite the underlying data:

@article{rehemtulla2024btsbot,
  title={BTSbot: A Multi-modal Deep Learning Model for Automated Bright Transient Identification},
  author={Rehemtulla, Nabeel and others},
  journal={arXiv preprint arXiv:2401.15167},
  year={2024}
}