From Tokens to Blocks: A Block-Diffusion Perspective on Molecular Generation

Model Description

SoftMol is a unified framework for target-aware molecular generation that jointly designs representation, model architecture, and search strategy. It introduces SoftBD (Soft-fragment Block-Diffusion), the first block-diffusion language model for molecules, which combines intra-block bidirectional denoising with inter-block autoregressive conditioning.

  • Developer: Shenzhen University Artificial Intelligence Drug Design Research Group (SZU-ADDG)
  • Model Type: Block-Diffusion Language Model
  • Language(s): English (and SMILES/Chemical representations)
  • License: MIT
  • Related Paper: From Tokens to Blocks: A Block-Diffusion Perspective on Molecular Generation

Model Sources

Available Model Weights

We provide checkpoints for multiple model scales. The 89M model (89M-epoch6-best.ckpt) is the primary checkpoint used for the results reported in the paper.

  • 55M-epoch1-last.ckpt (config: small-50M.yaml)
  • 74M-epoch1-last.ckpt (config: small-70M.yaml)
  • 89M-epoch6-best.ckpt (config: small-89M.yaml) [Recommended]
  • 116M-epoch1-last.ckpt (config: small-110M.yaml)
  • 624M-epoch1-last.ckpt (config: large.yaml)

Training Details

Training Data

SoftMol is trained on ZINC-Curated, a dataset of molecules selected for high drug-likeness and synthetic accessibility. The dataset is available at SZU-ADDG/ZINC-Curated.

Hardware & Hyperparameters (89M Model)

  • Hardware: 8 × NVIDIA RTX 4090 GPUs
  • Precision: bf16-mixed
  • Global Batch Size: 1600
  • Attention Backend: SDPA
  • Steps: 1,334,000

Intended Use & Capabilities

1. De Novo Generation

For unconstrained de novo generation, SoftMol (SoftBD) produces chemically valid and diverse molecules efficiently. In our experiments (with $K_{\text{sample}}=2$, nucleus sampling $p=0.95$, temperature $\tau=1.0$), SoftBD achieved 100% chemical validity.
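The $p$ and $\tau$ settings above correspond to standard nucleus (top-p) sampling with temperature. Below is a minimal stand-alone sketch of that generic procedure for one token position; the $K_{\text{sample}}$ multi-candidate logic is SoftMol-specific and omitted here, and `nucleus_sample` is an illustrative name, not the repository's API.

```python
import math
import random

def nucleus_sample(logits, p=0.95, tau=1.0, rng=random):
    """Generic top-p (nucleus) sampling with temperature over raw logits."""
    # Temperature-scaled softmax (max-subtraction for numerical stability).
    scaled = [l / tau for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Keep the smallest set of highest-probability tokens whose mass >= p.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    keep, cum = [], 0.0
    for i in order:
        keep.append(i)
        cum += probs[i]
        if cum >= p:
            break
    # Sample from the truncated, renormalized distribution.
    mass = sum(probs[i] for i in keep)
    r = rng.random() * mass
    for i in keep:
        r -= probs[i]
        if r <= 0:
            return i
    return keep[-1]
```

With a sharply peaked distribution and a small `p`, the nucleus collapses to the top token, so sampling becomes deterministic.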

2. Structure-Based Drug Design (SBDD)

SoftMol can generate ligands for specific protein targets (parp1, jak2, fa7, 5ht1b, braf) using a gated MCTS (Monte Carlo Tree Search) mechanism. This mechanism explicitly decouples binding affinity optimization from drug-likeness constraints.
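SoftMol's exact gated-MCTS formulation is not reproduced here; the sketch below only illustrates the decoupling idea under assumed names (`gated_score`, a hypothetical QED threshold `qed_min`, and textbook UCB1 selection). Affinity is credited only when the drug-likeness gate passes, so the search optimizes affinity within the feasible region rather than a blended weighted objective.

```python
import math

def gated_score(affinity, qed, qed_min=0.5):
    """Hard gate: drug-likeness acts as a feasibility filter, not a weighted term.

    Molecules failing the QED threshold contribute zero reward, so tree search
    pressure goes entirely toward affinity among drug-like candidates.
    """
    return affinity if qed >= qed_min else 0.0

def ucb1_select(children, c=1.4):
    """Standard UCB1 child selection over nodes with 'visits' and 'value' fields."""
    total = sum(ch["visits"] for ch in children) or 1
    def ucb(ch):
        if ch["visits"] == 0:
            return float("inf")  # always explore unvisited children first
        return ch["value"] / ch["visits"] + c * math.sqrt(math.log(total) / ch["visits"])
    return max(children, key=ucb)
```

This is a sketch of the general pattern, not the repository's implementation; SoftMol's gate and reward shaping may differ in detail.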

Performance & Results

Empirically, SoftMol resolves the trade-off between generation quality and efficiency. Compared with state-of-the-art methods, it achieves:

  • 100% chemical validity
  • 6.6× faster inference
  • 9.7% improvement in binding affinity
  • 2–3× higher molecular diversity

How to Get Started with the Model

To use this model, please clone our GitHub repository and set up the environment.

Download the weights programmatically:

from huggingface_hub import hf_hub_download

# Download the recommended 89M weight
model_path = hf_hub_download(repo_id="SZU-ADDG/SoftMol", filename="weights/89M-epoch6-best.ckpt")

# Download the corresponding config
config_path = hf_hub_download(repo_id="SZU-ADDG/SoftMol", filename="configs/model/small-89M.yaml")
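The `.ckpt` extension suggests a PyTorch Lightning checkpoint, which typically nests the weights under a `state_dict` key alongside optimizer state. A minimal, hedged loading sketch under that assumption (`extract_state_dict` is an illustrative helper, not part of the SoftMol codebase):

```python
def extract_state_dict(ckpt):
    """Return the raw weight dict from a (possibly Lightning-style) checkpoint.

    Lightning checkpoints wrap the weights under 'state_dict'; a bare state
    dict is returned unchanged.
    """
    if isinstance(ckpt, dict) and "state_dict" in ckpt:
        return ckpt["state_dict"]
    return ckpt

# Typical usage (requires torch; model_path comes from the download step above):
# import torch
# ckpt = torch.load(model_path, map_location="cpu")
# state_dict = extract_state_dict(ckpt)
```

Refer to the GitHub repository for the model class that consumes this state dict together with the matching config YAML.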

Citation

If you use SoftMol or the ZINC-Curated dataset in your research, please cite our paper:

@article{yang2026tokens,
  title={From Tokens to Blocks: A Block-Diffusion Perspective on Molecular Generation},
  author={Yang, Qianwei and Xu, Dong and Yang, Zhangfan and Yuan, Sisi and Zhu, Zexuan and Li, Jianqiang and Ji, Junkai},
  journal={arXiv preprint arXiv:2601.21964},
  year={2026}
}