BioMedGPT-Mol
BioMedGPT-Mol is a multimodal molecular language model jointly released by PharMolix Inc. and the Institute of AI Industry Research (AIR), Tsinghua University. It is built for both molecular understanding and generation, supporting a wide range of tasks including chemical name conversion, molecular captioning, property prediction, reaction modeling, molecule editing, and property optimization. Trained with a well-structured multi-task curriculum, BioMedGPT-Mol shows remarkable performance across diverse molecule-centric discovery benchmarks. More technical details can be found in the technical report.
Get started
Download the model and config files.
Evaluation on Benchmarks
- The test set is available in testset.
If you use the datasets for evaluation, please consider citing the related papers:

```bibtex
@article{yu2024llasmol,
  title={LlaSMol: Advancing large language models for chemistry with a large-scale, comprehensive, high-quality instruction tuning dataset},
  author={Yu, Botao and Baker, Frazier N and Chen, Ziqi and Ning, Xia and Sun, Huan},
  journal={arXiv preprint arXiv:2402.09391},
  year={2024}
}

@article{li2024tomg,
  title={TOMG-Bench: Evaluating LLMs on text-based open molecule generation},
  author={Li, Jiatong and Li, Junxian and Liu, Yunqing and Zhou, Dongzhan and Li, Qing},
  journal={arXiv preprint arXiv:2412.14642},
  year={2024}
}

@article{dey2025mathtt,
  title={$\mathtt{GeLLM^3O}$: Generalizing Large Language Models for Multi-property Molecule Optimization},
  author={Dey, Vishal and Hu, Xiao and Ning, Xia},
  journal={arXiv preprint arXiv:2502.13398},
  year={2025}
}

@article{biomedgpt-mol,
  title={BioMedGPT-Mol: Multi-task Learning for Molecular Understanding and Generation},
  author={Zuo, Chenyang and Fan, Siqi and Nie, Zaiqing},
  journal={arXiv preprint arXiv:2512.04629},
  year={2025}
}
```

- Update the configuration and run inference using the provided scripts. The outputs will be saved in the `logs` directory:

```
logs
└── biomedgpt_mol
    ├── mumoinstruct
    │   ├── logs
    │   └── results
    ├── openmolinst
    │   ├── logs
    │   └── results
    └── smolinstruct
        ├── logs
        └── results
```

```bash
# SMolInstruct
bash evaluation/scripts/inference_smolinstruct.sh
# OpenMolInstruct
bash evaluation/scripts/inference_openmolinst.sh
# MuMoInstruct
bash evaluation/scripts/inference_mumoinstruct.sh
```

- Update the configuration accordingly and execute the evaluation scripts. The computed metrics will be stored as `metrics.json` in the corresponding `results` directory, e.g., `logs/biomedgpt_mol/mumoinstruct/results/metrics.json`.

```bash
# SMolInstruct
bash evaluation/scripts/evaluate_smolinstruct.sh
# OpenMolInstruct
bash evaluation/scripts/evaluate_openmolinst.sh
# MuMoInstruct
bash evaluation/scripts/evaluate_mumoinstruct.sh
```
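After the evaluation scripts finish, each benchmark's scores live in its own `metrics.json`. A minimal sketch for collecting them into one summary, assuming the `logs/<model>/<benchmark>/results/metrics.json` layout shown above (the `collect_metrics` helper and the `logs/biomedgpt_mol` default are illustrative, not part of the released scripts):

```python
import json
from pathlib import Path

def collect_metrics(log_root="logs/biomedgpt_mol"):
    """Gather every metrics.json produced by the evaluation scripts.

    Returns a dict mapping benchmark name (e.g. "mumoinstruct") to its
    parsed metrics. Adjust log_root if your configuration writes logs
    elsewhere.
    """
    results = {}
    # Each benchmark directory contains a results/metrics.json file.
    for path in sorted(Path(log_root).glob("*/results/metrics.json")):
        benchmark = path.parent.parent.name
        with open(path) as f:
            results[benchmark] = json.load(f)
    return results

if __name__ == "__main__":
    for benchmark, metrics in collect_metrics().items():
        print(benchmark, metrics)
```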
🔥Explore our OpenBioMed platform for more discovery tasks.
Cite Us
If you find our open-sourced models helpful to your research, please consider citing:

```bibtex
@article{biomedgpt-mol,
  title={BioMedGPT-Mol: Multi-task Learning for Molecular Understanding and Generation},
  author={Zuo, Chenyang and Fan, Siqi and Nie, Zaiqing},
  journal={arXiv preprint arXiv:2512.04629},
  year={2025}
}
```