---
base_model:
- Derify/ChemBERTa_augmented_pubchem_13m
datasets:
- roman-bushuiev/MassSpecGym
license: mit
pipeline_tag: feature-extraction
tags:
- chemistry
- biology
---
# SpecBridge Adapter Weights (MSGYM / MSnLib / Spectraverse)
This repository contains the SpecBridge adapter checkpoints introduced in the paper [SpecBridge: Bridging Mass Spectrometry and Molecular Representations via Cross-Modal Alignment](https://huggingface.co/papers/2601.17204).
Each checkpoint is trained for **spectra → molecule retrieval** by aligning a DreaMS-conditioned spectrum representation to a frozen molecular embedding space (ChemBERTa), as described in the paper.
**Code:** [https://github.com/HassounLab/SpecBridge](https://github.com/HassounLab/SpecBridge)
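
To illustrate the retrieval setting described above, here is a minimal sketch of ranking candidate molecules once a spectrum and its candidates live in the same embedding space. It assumes cosine similarity as the scoring function and uses small hand-written vectors as stand-ins for real embeddings; the function name `rank_candidates` is hypothetical and is not part of the SpecBridge codebase, whose actual scoring may differ.

```python
import numpy as np


def rank_candidates(spectrum_emb, candidate_embs):
    """Rank candidates by cosine similarity to one spectrum embedding.

    spectrum_emb:   array of shape (d,)
    candidate_embs: array of shape (n, d)
    Returns (indices sorted best-first, matching similarity scores).
    """
    # L2-normalise both sides so the dot product equals cosine similarity.
    query = spectrum_emb / np.linalg.norm(spectrum_emb)
    cands = candidate_embs / np.linalg.norm(candidate_embs, axis=1, keepdims=True)
    scores = cands @ query
    order = np.argsort(-scores)  # descending similarity
    return order, scores[order]


# Toy stand-ins (assumption): three candidate "molecule" embeddings
# and one "spectrum" embedding in a 3-dimensional space.
candidates = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
])
query = np.array([0.9, 0.1, 0.0])

order, scores = rank_candidates(query, candidates)
print(order.tolist())
```

In practice the candidate matrix would hold molecular embeddings from the frozen encoder and the query would be the adapter's output for a spectrum; both are replaced by toy values here.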
---
## Files in this repo
| Dataset | Checkpoint filename |
|---|---|
| MassSpecGym (MSGYM) | `SpecBridge_MSGYM_checkpoint.pt` |
| MSnLib | `SpecBridge_MSnLib_checkpoint.pt` |
| Spectraverse | `SpecBridge_Spectraverse_checkpoint.pt` |
---
## Download & load weights
### Python (Hugging Face Hub)
```python
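import torch
from huggingface_hub import hf_hub_download

repo_id = "YinkaiW/SpecBridge"

# Download the checkpoint (cached locally after the first call).
# Swap the filename for any entry in the table above.
ckpt_path = hf_hub_download(
    repo_id=repo_id,
    filename="SpecBridge_MSGYM_checkpoint.pt",
)

# Load onto CPU so no GPU is required.
# Note: PyTorch 2.6+ defaults to weights_only=True in torch.load.
state = torch.load(ckpt_path, map_location="cpu")
print(type(state))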
```
> These checkpoints are intended to be used with the SpecBridge codebase (training/evaluation scripts) and a DreaMS SSL checkpoint.
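
Since the internal layout of the checkpoints is not documented here, it can help to inspect the loaded object before wiring it into the SpecBridge code. The sketch below maps the dataset names from the table to their filenames and summarises the top-level entries of whatever `torch.load` returns. The helper names (`checkpoint_filename`, `summarize`, `load_checkpoint`) are hypothetical, and nothing is assumed about which keys the checkpoints contain.

```python
# Filenames as listed in the "Files in this repo" table.
CHECKPOINT_FILES = {
    "MSGYM": "SpecBridge_MSGYM_checkpoint.pt",
    "MSnLib": "SpecBridge_MSnLib_checkpoint.pt",
    "Spectraverse": "SpecBridge_Spectraverse_checkpoint.pt",
}


def checkpoint_filename(dataset):
    """Return the checkpoint filename for a dataset name from the table."""
    try:
        return CHECKPOINT_FILES[dataset]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINT_FILES))
        raise ValueError(f"Unknown dataset {dataset!r}; expected one of: {known}") from None


def summarize(obj, max_items=10):
    """Describe the top-level structure of a loaded checkpoint object.

    Makes no assumption about the layout: dict entries that expose a
    `.shape` (e.g. tensors) are reported by shape, everything else by type.
    """
    if not isinstance(obj, dict):
        return [type(obj).__name__]
    lines = []
    for key, value in list(obj.items())[:max_items]:
        shape = getattr(value, "shape", None)
        detail = f"shape={tuple(shape)}" if shape is not None else type(value).__name__
        lines.append(f"{key}: {detail}")
    return lines


def load_checkpoint(dataset, repo_id="YinkaiW/SpecBridge"):
    """Download (or reuse the cached copy of) a checkpoint and load it on CPU."""
    # Imported lazily so the helpers above work without torch or the Hub client.
    from huggingface_hub import hf_hub_download
    import torch

    path = hf_hub_download(repo_id=repo_id, filename=checkpoint_filename(dataset))
    return torch.load(path, map_location="cpu")
```

With network access, `print("\n".join(summarize(load_checkpoint("MSGYM"))))` lists the first few top-level entries, which shows whether the file holds a bare state dict or a dict wrapping one alongside other metadata.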
---
## Citation
If you use these weights, please cite:
```bibtex
@misc{wang2026specbridge,
title = {SpecBridge: Bridging Mass Spectrometry and Molecular Representations via Cross-Modal Alignment},
author = {Yinkai Wang and Yan Zhou Chen and Xiaohui Chen and Li-Ping Liu and Soha Hassoun},
year = {2026},
eprint = {2601.17204},
archivePrefix = {arXiv},
primaryClass = {cs.LG},
doi = {10.48550/arXiv.2601.17204},
url = {https://arxiv.org/abs/2601.17204}
}
```