	---
	library_name: pytorch
	license: cc-by-4.0
	tags:
	- astronomy
	- spectroscopy
	- be-stars
	- self-supervised
	- masked-autoencoder
	- time-series
	datasets:
	- anonym-submit-26/bess-bench-26
	---

	# BeMAE-Hα — Masked Autoencoder Encoder for Be-Star Hα Spectra

	`anonym-submit-26/bemae-halpha-v1` is the pretrained encoder of the BeMAE-Hα model
	introduced in *"BESS-Bench: Benchmarking Spectral Representations for
	Be-Star Variability"* (NeurIPS 2026 D&B Track, under review).

	It is a small (≈803k parameters) Transformer encoder that maps a 128-bin spectrum
	cropped around the Hα line (6562.8 ± 50 Å) to a 128-dimensional embedding
	`z_halpha`. The encoder was trained with a Masked Autoencoder (MAE) objective
	on the Hα slice of BESS-Bench (≈26.9k spectra of 1,468 Be stars,
	1990–2025).

	## Quick start

	```python
	from huggingface_hub import snapshot_download
	import sys, json, torch

	ckpt = snapshot_download("anonym-submit-26/bemae-halpha-v1")
	sys.path.insert(0, ckpt)
	from model import SpectralEncoderHalpha, ModelConfig # noqa: E402

	with open(f"{ckpt}/config.json") as f:
	    cfg_dict = json.load(f)["model_config"]
	cfg = ModelConfig(**cfg_dict)

	encoder = SpectralEncoderHalpha(cfg)
	encoder.load_state_dict(torch.load(f"{ckpt}/pytorch_model.bin", map_location="cpu"))
	encoder.eval()

	# flux, wavelengths, validity: torch.Tensor of shape [B, 128]
	# Spectra must be cropped to Hα ± 50 Å (6512.8–6612.8 Å) and normalized
	# to the pseudo-continuum at the window edges (see `example_usage.py`).
	with torch.no_grad():
	    z_halpha, *_ = encoder(flux, wavelengths, validity, mask=None)
	# z_halpha: Tensor of shape [B, 128]
	```
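The quick start assumes that `flux`, `wavelengths` and `validity` already exist. The reference preprocessing is the `example_usage.py` file shipped with the checkpoint. The block below is only a rough sketch of the cropping and normalisation described in the comments above. Several choices in it are assumptions, not the authors' pipeline: linear interpolation onto 128 evenly spaced wavelengths, a continuum taken as the median of the outer 8 bins on each side, a 0/1 float validity mask, and the helper name `prepare_halpha_window`.

```python
import numpy as np

HALPHA = 6562.8     # Å, line centre given in the model card
HALF_WIDTH = 50.0   # Å, window half-width given in the model card
N_BINS = 128        # input length given in the model card
EDGE_BINS = 8       # ASSUMPTION: bins per window edge used for the continuum


def prepare_halpha_window(wave, flux):
    """Resample one spectrum onto the Hα window and normalise it (sketch)."""
    wave = np.asarray(wave, dtype=float)
    flux = np.asarray(flux, dtype=float)
    order = np.argsort(wave)  # np.interp needs increasing wavelengths
    wave, flux = wave[order], flux[order]

    # ASSUMPTION: 128 evenly spaced sample points spanning 6512.8 to 6612.8 Å
    grid = np.linspace(HALPHA - HALF_WIDTH, HALPHA + HALF_WIDTH, N_BINS)
    resampled = np.interp(grid, wave, flux)

    # Bins outside the observed wavelength coverage are flagged as invalid
    validity = ((grid >= wave[0]) & (grid <= wave[-1])).astype(np.float32)

    # Pseudo-continuum: median of the valid bins at both window edges
    edge_flux = np.concatenate([resampled[:EDGE_BINS], resampled[-EDGE_BINS:]])
    edge_ok = np.concatenate([validity[:EDGE_BINS], validity[-EDGE_BINS:]]) > 0
    continuum = np.median(edge_flux[edge_ok]) if edge_ok.any() else 1.0

    normalised = (resampled / continuum).astype(np.float32)
    return normalised, grid.astype(np.float32), validity
```

Each returned array has shape `(128,)`. Stacking several spectra and converting them with `torch.from_numpy` gives the `[B, 128]` tensors the encoder call expects, provided the checkpoint's own preprocessing uses the same conventions.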

	## Intended use

	`z_halpha` is a general-purpose embedding of the Hα profile. It has been
	evaluated on the three downstream tasks of BESS-Bench v1.0 — SpecProbe
	(Hα feature regression), LineTransfer (Hβ → Hα generalisation) and
	EWForecast (short-horizon EW(Hα) forecasting); see the paper for details.

	Typical uses:
	- Clustering and retrieval of similar Hα profiles across the BeSS database.
	- Building lightweight regressors on top of `z_halpha` for physical
	  parameters (EW, V/R, velocity).
	- Time-series features for short-horizon EW(Hα) forecasting (cf. EWForecast).
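As an illustration of the retrieval use case, the sketch below ranks stored embeddings by cosine similarity to a query embedding. It is a minimal pure-Python example on toy 2-D vectors. The function names `cosine_similarity` and `top_k_similar` are hypothetical, and cosine similarity is an assumed choice of metric, not one prescribed by the paper. Real embeddings would be rows of `z_halpha` (for example via `z_halpha.tolist()`).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for degenerate all-zero vectors
    return dot / (norm_a * norm_b)


def top_k_similar(query, gallery, k=3):
    """Return (index, score) pairs of the k gallery rows closest to query."""
    scored = [(cosine_similarity(query, emb), idx)
              for idx, emb in enumerate(gallery)]
    scored.sort(reverse=True)  # highest similarity first
    return [(idx, score) for score, idx in scored[:k]]


# Toy gallery standing in for stored embeddings
gallery = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
matches = top_k_similar([1.0, 0.1], gallery, k=2)
```

For a large archive, a vectorised implementation or an approximate nearest-neighbour index would be a more practical choice than this loop.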

	## Training details

	| Item | Value |
	|---|---|
	| Architecture | Transformer MAE encoder |
	| Input | 128 bins, Hα ± 50 Å (6512.8–6612.8 Å) |
	| Patches | 8 px, step 4 → 31 patches |
	| Embedding dim | 128 |
	| Transformer layers / heads | 4 / 4 |
	| Parameters | ≈803k (encoder only; full BeMAE auto-encoder ≈912k) |
	| Pretraining objective | Masked Autoencoder, mask ratio 0.60, 3 contiguous blocks |
	| Optimizer | AdamW, lr=0.0001, wd=0.05, warmup 5 epochs |
	| Epochs | 80 (with early stopping) |
	| Batch size | 256 |
	| Dataset | `anonym-submit-26/bess-bench-26`, Hα slice (≈26.9k spectra) |
	| Seed (this checkpoint) | 42 |
	| Best val MAE loss | 0.335420 |
	| Hardware | 1× NVIDIA A100 80 GB, ~30 min per seed |
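The patch count in the table follows from sliding an 8-bin window over 128 bins with a step of 4: (128 − 8) / 4 + 1 = 31. The sketch below reproduces that arithmetic and shows one possible way to draw a mask of three contiguous blocks at ratio 0.60. The masking scheme is an assumption for illustration only: rounding 0.60 × 31 to 19 patches, near-equal block lengths, and at least one visible patch between blocks are not taken from the released training code. The names `patch_starts` and `block_mask` are hypothetical.

```python
import random

N_BINS, PATCH, STEP = 128, 8, 4     # values from the table
MASK_RATIO, N_BLOCKS = 0.60, 3      # values from the table


def patch_starts(n_bins=N_BINS, patch=PATCH, step=STEP):
    """Start index of every full patch in a sliding window."""
    return list(range(0, n_bins - patch + 1, step))


def block_mask(n_patches, ratio=MASK_RATIO, n_blocks=N_BLOCKS, rng=None):
    """Boolean patch mask made of n_blocks separated contiguous runs."""
    rng = rng or random.Random()
    n_masked = round(ratio * n_patches)             # ASSUMPTION: rounding
    base, extra = divmod(n_masked, n_blocks)
    lengths = [base + (i < extra) for i in range(n_blocks)]
    # Visible patches left after reserving one between neighbouring blocks
    free = n_patches - n_masked - (n_blocks - 1)
    # Split `free` at random into the gaps before, between and after blocks
    cuts = sorted(rng.randint(0, free) for _ in range(n_blocks))
    gaps = [b - a for a, b in zip([0] + cuts, cuts + [free])]

    mask, pos = [False] * n_patches, 0
    for i, length in enumerate(lengths):
        pos += gaps[i] + (1 if i > 0 else 0)        # skip visible patches
        for j in range(pos, pos + length):
            mask[j] = True                          # True means masked
        pos += length
    return mask


starts = patch_starts()
mask = block_mask(len(starts), rng=random.Random(0))
```

Here `len(starts)` is 31 and the last patch covers bins 120 to 127, so the patches tile the full window with 50% overlap.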

	## Reproducibility

	The v1.0 benchmark reports aggregated results over 3 seeds (42, 123, 456).
	This specific checkpoint corresponds to seed 42, which is the
	"official" reference run used for all qualitative plots in the paper.
	Multi-seed aggregates are available in the companion repository.

	To retrain all three seeds from scratch (about 1 h 30 min on an A100, i.e. ~30 min per seed):

	```bash
	git clone https://github.com/anonym-submit-26/bess-bench-26.git
	cd bess-bench-26
	sbatch scripts/cluster/00_pretrain_encoder.slurm # trains seeds 42, 123, 456 sequentially
	```

	## Citation

	```bibtex
	@misc{bess_bench_2026bess,
	  title  = {BESS-Bench: Benchmarking Spectral Representations for Be-Star Variability},
	  author = {Anonymous and others},
	  year   = {2026},
	  note   = {NeurIPS 2026 Datasets and Benchmarks Track}
	}
	```

	## Limitations

	- Trained exclusively on BeSS amateur-grade spectra; transfer to professional
	instruments (ESPRESSO, CARMENES) has not been evaluated.
	- The input window is fixed at Hα ± 50 Å; wider profiles may be truncated.
	- `z_halpha` is not a physical parameter vector; linear probes are required
	for interpretable regression (see SpecProbe and LineTransfer in the paper).