---
datasets:
- liuganghuggingface/demodiff_downstream
license: mit
tags:
- chemistry
- biology
pipeline_tag: graph-ml
---
# DemoDiff: Graph Diffusion Transformers are In-Context Molecular Designers
This repository hosts DemoDiff, a diffusion-based foundation model for **in-context inverse molecular design**, presented in the paper [Graph Diffusion Transformers are In-Context Molecular Designers](https://huggingface.co/papers/2510.08744).

DemoDiff uses graph diffusion transformers to generate molecules from contextual examples, enabling few-shot molecular design across diverse chemical tasks without task-specific fine-tuning. It introduces demonstration-conditioned diffusion models: a task context is defined by a small set of molecule-score examples rather than a text description, and this context guides a denoising Transformer during generation. For scalable pretraining, molecules are represented at the motif level by a novel tokenizer based on Node Pair Encoding.
Code: https://github.com/liugangcode/DemoDiff
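The central idea is that a design task is specified by demonstrations alone, with no text prompt. The minimal sketch below illustrates the data a demonstration context carries; the `DesignTask` class and its field names are hypothetical conveniences for illustration, not the repository's API.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class DesignTask:
    """A task context for in-context molecular design: no text prompt,
    just molecule-score demonstrations. Names here are hypothetical."""
    demonstrations: List[Tuple[str, float]]  # (SMILES, property score) pairs
    num_samples: int = 8                     # molecules to generate

# The task is defined entirely by examples, not by a description:
task = DesignTask(demonstrations=[
    ("CCO", 0.12),
    ("c1ccccc1O", 0.87),
    ("CC(=O)Oc1ccccc1C(=O)O", 0.95),
])
# A demonstration-conditioned model denoises new molecular graphs whose
# predicted scores follow the pattern implied by these examples.
```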
## 🌟 Key Features
- **In-Context Learning**: Generate molecules using only contextual examples (no fine-tuning required)
- **Graph-Based Tokenization**: Novel molecular graph tokenization with a BPE-style vocabulary (see the sketch after this list)
- **Comprehensive Benchmarks**: 30+ downstream tasks covering drug discovery, docking, and polymer design
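Node Pair Encoding builds its motif vocabulary in the same greedy spirit as byte-pair encoding: repeatedly merge the most frequent pair into a new vocabulary entry. The toy sketch below shows that merge loop on a flat token sequence; the real algorithm merges node pairs on molecular graphs, and the helper names here are illustrative only.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent token pairs and return the most frequent one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(tokens, pair, new_token):
    """Replace every occurrence of `pair` with the merged `new_token`."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(new_token)
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# Toy example: atoms of a linear fragment as tokens.
tokens = ["C", "C", "O", "C", "C", "O", "C", "C"]
vocab = set(tokens)
for _ in range(2):  # two merge rounds
    pair = most_frequent_pair(tokens)
    new_token = "".join(pair)  # e.g. ("C", "C") -> "CC"
    vocab.add(new_token)
    tokens = merge_pair(tokens, pair, new_token)
    print(pair, "->", new_token, "| sequence:", tokens)
```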
### Model Configuration
| Parameter | Value | Description |
|------------|--------|-------------|
| **context_length** | 150 | Maximum sequence length for the input context. |
| **depth** | 24 | Number of transformer layers. |
| **diffusion_steps** | 500 | Number of diffusion steps during training. |
| **hidden_size** | 1280 | Hidden dimension size in the transformer. |
| **mlp_ratio** | 4 | Expansion ratio in the MLP block. |
| **num_heads** | 16 | Number of attention heads. |
| **task_name** | `pretrain` | Task type for model training. |
| **tokenizer_name** | `pretrain` | Tokenizer used for model input. |
| **vocab_ring_len** | 300 | Length of the circular vocabulary window. |
| **vocab_size** | 3000 | Total vocabulary size. |
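For convenience, the same hyperparameters as a plain Python dictionary (a sketch mirroring the table above; the actual checkpoint may store them under different keys or in a different format):

```python
# DemoDiff-0.7B configuration, transcribed from the table above.
DEMODIFF_0_7B_CONFIG = {
    "context_length": 150,      # max input context length
    "depth": 24,                # transformer layers
    "diffusion_steps": 500,     # diffusion steps during training
    "hidden_size": 1280,        # transformer hidden dimension
    "mlp_ratio": 4,             # MLP expansion ratio
    "num_heads": 16,            # attention heads
    "task_name": "pretrain",
    "tokenizer_name": "pretrain",
    "vocab_ring_len": 300,      # circular vocabulary window length
    "vocab_size": 3000,         # total vocabulary size
}
```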