BeMAE-Hα — Masked Autoencoder Encoder for Be-Star Hα Spectra

anonym-submit-26/bemae-halpha-v1 is the pretrained encoder of the BeMAE-Hα model introduced in "BESS-Bench: Benchmarking Spectral Representations for Be-Star Variability" (NeurIPS 2026 D&B Track, under review).

It is a small (≈803k params) Transformer encoder that maps a 128-bin spectrum cropped around the Hα line (6562.8 ± 50 Å) to a 128-dimensional embedding z_halpha. The encoder was trained with a Masked Autoencoder (MAE) objective on the Hα slice of BESS-Bench (≈26.9k spectra over 1,468 Be stars, 1990–2025).

Quick start

```python
from huggingface_hub import snapshot_download
import sys, json, torch

ckpt = snapshot_download("anonym-submit-26/bemae-halpha-v1")
sys.path.insert(0, ckpt)
from model import SpectralEncoderHalpha, ModelConfig  # noqa: E402

with open(f"{ckpt}/config.json") as f:
    cfg_dict = json.load(f)["model_config"]
cfg = ModelConfig(**cfg_dict)

encoder = SpectralEncoderHalpha(cfg)
encoder.load_state_dict(torch.load(f"{ckpt}/pytorch_model.bin", map_location="cpu"))
encoder.eval()

# flux, wavelengths, validity: torch.Tensor of shape [B, 128]
# Spectra must be cropped to Hα ± 50 Å (6512.8–6612.8 Å) and normalized
# to the pseudo-continuum at the window edges (see `example_usage.py`).
with torch.no_grad():
    z_halpha, *_ = encoder(flux, wavelengths, validity, mask=None)
# z_halpha: Tensor of shape [B, 128]
```
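The comments above describe the expected input preprocessing. A minimal sketch of that crop-and-normalize step is given below; the function name, the linear resampling, and the edge-median continuum estimate are illustrative assumptions, and the `example_usage.py` shipped with the checkpoint is the authoritative recipe.

```python
import numpy as np
import torch

HALPHA = 6562.8      # Å, Hα rest wavelength
HALF_WINDOW = 50.0   # Å, half-width of the model's input window
N_BINS = 128

def prepare_halpha(wave, flux, edge_frac=0.1):
    """Sketch: crop to Hα ± 50 Å, resample onto 128 bins, and divide by a
    pseudo-continuum estimated as the median flux in the window edges."""
    grid = np.linspace(HALPHA - HALF_WINDOW, HALPHA + HALF_WINDOW, N_BINS)
    in_window = (wave >= grid[0]) & (wave <= grid[-1])
    resampled = np.interp(grid, wave[in_window], flux[in_window])
    # Pseudo-continuum from the outermost ~10% of bins on each side.
    n_edge = max(1, int(edge_frac * N_BINS))
    continuum = np.median(np.concatenate([resampled[:n_edge], resampled[-n_edge:]]))
    normed = resampled / continuum
    validity = np.ones(N_BINS)  # 1 = valid bin; zero out known gaps
    to_t = lambda a: torch.as_tensor(a, dtype=torch.float32).unsqueeze(0)
    return to_t(normed), to_t(grid), to_t(validity)  # each [1, 128]
```

The returned tensors match the `flux, wavelengths, validity` shapes expected by the encoder call above (batch them along dim 0 for multiple spectra).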

Intended use

z_halpha is a general-purpose embedding of the Hα profile. It has been evaluated on the three downstream tasks of BESS-Bench v1.0: SpecProbe (Hα feature regression), LineTransfer (Hβ → Hα generalisation), and EWForecast (short-horizon EW(Hα) forecasting); see the paper for details.

Typical uses:

  • Clustering and retrieval of similar Hα profiles across the BeSS database.
  • Building lightweight regressors on top of z_halpha for physical parameters (EW, V/R, velocity).
  • Time-series features for short-horizon EW(Hα) forecasting (cf. EWForecast).
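The second use case above, a lightweight regressor on top of z_halpha, can be sketched as a ridge probe. The data here are synthetic placeholders for illustration; in practice Z would be a stack of encoder embeddings and y a measured quantity such as EW(Hα).

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Placeholder stand-ins: Z [N, 128] would come from the encoder,
# y [N] from catalogued measurements (EW, V/R, velocity, ...).
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 128))
w_true = rng.normal(size=128)
y = Z @ w_true + 0.1 * rng.normal(size=500)

Z_tr, Z_te, y_tr, y_te = train_test_split(Z, y, random_state=0)
probe = Ridge(alpha=1.0).fit(Z_tr, y_tr)
print(f"held-out R^2: {probe.score(Z_te, y_te):.3f}")
```

A frozen-encoder linear probe like this is also how SpecProbe-style evaluations are typically run, keeping the comparison focused on the embedding rather than the head.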

Training details

| Item | Value |
|---|---|
| Architecture | Transformer MAE encoder |
| Input | 128 bins, Hα ± 50 Å (6512.8–6612.8 Å) |
| Patches | 8 px, step 4 → 31 patches |
| Embedding dim | 128 |
| Transformer layers / heads | 4 / 4 |
| Parameters | ≈803k (encoder only; full BeMAE autoencoder ≈912k) |
| Pretraining objective | Masked Autoencoder, mask ratio 0.60, 3 contiguous blocks |
| Optimizer | AdamW, lr = 1e-4, weight decay 0.05, 5-epoch warmup |
| Epochs | 80 (with early stopping) |
| Batch size | 256 |
| Dataset | anonym-submit-26/bess-bench-26, Hα slice (≈26.9k spectra) |
| Seed (this checkpoint) | 42 |
| Best val MAE loss | 0.335420 |
| Hardware | 1× NVIDIA A100 80 GB, ~30 min per seed |
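The contiguous-block masking in the table (mask ratio 0.60, 3 blocks over 31 patches) can be sketched as follows. This is a plausible reading of the scheme, not the released training code; since sampled blocks may overlap, the realized masked fraction can fall slightly below 0.60.

```python
import numpy as np

def block_mask(n_patches=31, mask_ratio=0.60, n_blocks=3, rng=None):
    """Sketch of contiguous-block masking: split the masked budget
    (round(0.60 * 31) = 19 patches) across 3 contiguous runs."""
    if rng is None:
        rng = np.random.default_rng()
    n_masked = int(round(mask_ratio * n_patches))
    # Roughly equal run lengths that sum to the budget.
    lengths = [n_masked // n_blocks] * n_blocks
    for i in range(n_masked - sum(lengths)):
        lengths[i] += 1
    mask = np.zeros(n_patches, dtype=bool)
    for length in lengths:
        start = rng.integers(0, n_patches - length + 1)
        mask[start:start + length] = True
    return mask  # True = patch hidden from the encoder
```

Compared with i.i.d. per-patch masking, contiguous blocks force the decoder to reconstruct whole stretches of the line profile rather than interpolating between neighbouring visible bins.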

Reproducibility

The v1.0 benchmark reports aggregated results over 3 seeds (42, 123, 456). This specific checkpoint corresponds to seed 42, which is the "official" reference run used for all qualitative plots in the paper. Multi-seed aggregates are available in the companion repository.

To retrain from scratch (~1.5 h on an A100):

```bash
git clone https://github.com/anonym-submit-26/bess-bench-26.git
cd bess-bench-26
sbatch scripts/cluster/00_pretrain_encoder.slurm   # trains seeds 42, 123, 456 sequentially
```

Citation

```bibtex
@misc{bess_bench_2026bess,
  title  = {BESS-Bench: Benchmarking Spectral Representations for Be-Star Variability},
  author = {Anonymous and others},
  year   = {2026},
  note   = {NeurIPS 2026 Datasets and Benchmarks Track}
}
```

Limitations

  • Trained exclusively on BeSS amateur-grade spectra; transfer to professional instruments (ESPRESSO, CARMENES) has not been evaluated.
  • The input window is fixed at Hα ± 50 Å; wider profiles may be truncated.
  • z_halpha is not a physical parameter vector; linear probes are required for interpretable regression (see SpecProbe and LineTransfer in the paper).