base_model:
  - Derify/ChemBERTa_augmented_pubchem_13m
datasets:
  - roman-bushuiev/MassSpecGym
license: mit
pipeline_tag: feature-extraction
tags:
  - chemistry
  - biology

SpecBridge Adapter Weights (MSGYM / MSnLib / Spectraverse)

This repository contains the SpecBridge adapter checkpoints introduced in the paper SpecBridge: Bridging Mass Spectrometry and Molecular Representations via Cross-Modal Alignment.

Each checkpoint is trained for spectra → molecule retrieval by aligning a DreaMS-conditioned spectrum representation to a frozen molecular embedding space (ChemBERTa), as described in the paper.

Code: https://github.com/HassounLab/SpecBridge
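To make the retrieval setting concrete, the sketch below ranks candidate molecules by how close their embeddings are to a spectrum embedding in a shared space. It is a toy illustration only: the use of cosine similarity, the function names, and the three-dimensional vectors are assumptions made for this example, not the scoring procedure of the SpecBridge codebase.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank_candidates(
    spectrum_embedding: Sequence[float],
    candidate_embeddings: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Sort candidate molecules from most to least similar to the spectrum."""
    scored = [
        (name, cosine_similarity(spectrum_embedding, emb))
        for name, emb in candidate_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (hypothetical): one spectrum embedding and three candidate molecules.
query = [1.0, 0.0, 0.0]
candidates = {
    "mol_a": [0.9, 0.1, 0.0],
    "mol_b": [0.0, 1.0, 0.0],
    "mol_c": [-1.0, 0.0, 0.0],
}
ranking = rank_candidates(query, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a real pipeline the vectors would come from the spectrum encoder plus adapter on one side and the frozen molecular encoder on the other; the ranking step itself stays the same.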


Files in this repo

| Dataset | Checkpoint filename |
| --- | --- |
| MassSpecGym (MSGYM) | SpecBridge_MSGYM_checkpoint.pt |
| MSnLib | SpecBridge_MSnLib_checkpoint.pt |
| Spectraverse | SpecBridge_Spectraverse_checkpoint.pt |
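When switching between datasets, it can help to keep the filenames listed above in one lookup rather than retyping them. The sketch below is one way to do that; `CHECKPOINT_FILES`, `checkpoint_filename`, and `download_checkpoint` are hypothetical helper names introduced here, while the repository id and filenames are the ones given in this README.

```python
REPO_ID = "YinkaiW/SpecBridge"  # repository id used in this README

# Dataset key -> checkpoint filename, copied from the file list above.
CHECKPOINT_FILES = {
    "MSGYM": "SpecBridge_MSGYM_checkpoint.pt",
    "MSnLib": "SpecBridge_MSnLib_checkpoint.pt",
    "Spectraverse": "SpecBridge_Spectraverse_checkpoint.pt",
}


def checkpoint_filename(dataset: str) -> str:
    """Return the checkpoint filename for a dataset key, or raise ValueError."""
    try:
        return CHECKPOINT_FILES[dataset]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINT_FILES))
        raise ValueError(
            f"unknown dataset {dataset!r}; expected one of: {known}"
        ) from None


def download_checkpoint(dataset: str) -> str:
    """Download (or reuse the cached copy of) a checkpoint; return its local path."""
    # Imported lazily so the lookup above works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=checkpoint_filename(dataset))
```

For example, `download_checkpoint("MSnLib")` would fetch the MSnLib checkpoint and return the path of the downloaded file.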

Download & load weights

Python (Hugging Face Hub)

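from huggingface_hub import hf_hub_download
import torch

# Hugging Face repository that hosts the adapter checkpoints
repo_id = "YinkaiW/SpecBridge"

# Download the MSGYM checkpoint (or reuse the locally cached copy)
ckpt_path = hf_hub_download(
    repo_id=repo_id,
    filename="SpecBridge_MSGYM_checkpoint.pt",
)

# Load on CPU so that no GPU is required to inspect the file
state = torch.load(ckpt_path, map_location="cpu")
print(type(state))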

These checkpoints are intended to be used with the SpecBridge codebase (training/evaluation scripts) and a DreaMS SSL checkpoint.
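Before wiring a checkpoint into the codebase, it can be useful to look at what the loaded object contains. This README does not document the internal layout of the checkpoints, so the sketch below makes no assumption about key names: it walks any nested dictionary and reports the type of each leaf, plus its shape when the leaf exposes a `shape` attribute, as tensors do. `summarize_checkpoint` is a hypothetical helper written for this example.

```python
from collections.abc import Mapping


def summarize_checkpoint(obj, prefix="", max_depth=3):
    """Return (key_path, description) pairs for a nested checkpoint object.

    Mappings are descended into up to max_depth levels; every other value is
    reported as a leaf with its type name and, if available, its shape.
    """
    rows = []
    if isinstance(obj, Mapping) and max_depth > 0:
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            rows.extend(summarize_checkpoint(value, path, max_depth - 1))
    else:
        shape = getattr(obj, "shape", None)
        if shape is not None:
            desc = f"{type(obj).__name__}{tuple(shape)}"
        else:
            desc = type(obj).__name__
        rows.append((prefix or "<root>", desc))
    return rows
```

With `state` loaded as in the snippet above, `for path, desc in summarize_checkpoint(state): print(path, desc)` lists each entry, which shows whether the file is a plain state dict or a wrapper around one.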


Citation

If you use these weights, please cite:

@misc{wang2026specbridge,
  title         = {SpecBridge: Bridging Mass Spectrometry and Molecular Representations via Cross-Modal Alignment},
  author        = {Yinkai Wang and Yan Zhou Chen and Xiaohui Chen and Li-Ping Liu and Soha Hassoun},
  year          = {2026},
  eprint        = {2601.17204},
  archivePrefix = {arXiv},
  primaryClass  = {cs.LG},
  doi           = {10.48550/arXiv.2601.17204},
  url           = {https://arxiv.org/abs/2601.17204}
}