hhhhhhh789 committed on
Commit
1a532e4
·
verified ·
1 Parent(s): 7743c49

Update README.md

Files changed (1)
  1. README.md +92 -31
README.md CHANGED
@@ -1,3 +1,6 @@
 
 
 
1
  <div align='center'>
2
  <h1> MolCRAFT Series for Drug Design </h1>
3
 
@@ -68,49 +71,111 @@ The MolCRAFT series addresses critical challenges in generative models for SBDD,
68
  * Achieves 95.9% PoseBusters passing rate on CrossDock with significantly improved molecular geometries.
69
 
70
  ---
 
 
71
 
72
- ## ⚙️ Installation & Usage
73
 
74
- Please refer to the `README.md` file within each project's subdirectory for specific instructions on installation, dependencies (docker recommended), and how to run the code.
75
 
76
- ---
 
77
 
78
- ## 📊 Datasets and Benchmarks
79
 
80
- Our models are evaluated on standard benchmarks in the field, such as:
 
81
 
82
- * **CrossDocked2020**: Used for evaluating binding affinity, molecular validity, and optimization success rates.
83
- * **PoseBusters V2**: Used for assessing the quality of generated molecular poses.
 
84
 
85
- Details about the specific datasets used for training and evaluation can be found in the respective publications and project READMEs.
86
 
87
- ---
 
 
 
 
 
88
 
89
- ## 🤝 Contributing
90
 
91
- We welcome contributions to the MolCRAFT series! If you are interested in contributing, please feel free to fork the repository, make your changes, and submit a pull request. You can also open an issue if you find any bugs or have suggestions for improvements.
 
 
 
 
 
 
 
 
 
92
 
93
  ---
 
 
 
 
 
94
 
95
- ## 📝 Citation
 
 
 
96
 
97
- If you use any of the methods or code from this repository in your research, please cite the respective papers:
 
 
 
 
98
 
99
- ```bibtex
100
- @article{qiu2025piloting,
101
- title={Piloting Structure-Based Drug Design via Modality-Specific Optimal Schedule},
102
- author={Qiu, Keyue and Song, Yuxuan and Fan, Zhehuan and Liu, Peidong and Zhang, Zhe and Zheng, Mingyue and Zhou, Hao and Ma, Wei-Ying},
103
- journal={ICML 2025},
104
- year={2025}
105
- }
106
 
107
- @article{qiu2025empower,
108
- title={Empower Structure-Based Molecule Optimization with Gradient Guidance},
109
- author={Qiu, Keyue and Song, Yuxuan and Yu, Jie and Ma, Hongbo and Cao, Ziyao and Zhang, Zhilong and Wu, Yushuai and Zheng, Mingyue and Zhou, Hao and Ma, Wei-Ying},
110
- journal={ICML 2025},
111
- year={2025}
112
- }
 
 
 
113
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
114
  @article{qu2024molcraft,
115
  title={MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space},
116
  author={Qu, Yanru and Qiu, Keyue and Song, Yuxuan and Gong, Jingjing and Han, Jiawei and Zheng, Mingyue and Zhou, Hao and Ma, Wei-Ying},
@@ -124,8 +189,4 @@ If you use any of the methods or code from this repository in your research, ple
124
  journal={ICML 2024},
125
  year={2024}
126
  }
127
- ```
128
-
129
- ## 📄 License
130
-
131
- The project is licensed under the terms of the CC-BY-NC-SA license. See [LICENSE](https://github.com/algomole/MolCRAFT/blob/main/LICENSE) for more details.
 
1
+ ---
2
+ license: cc-by-nc-sa-2.0
3
+ ---
4
  <div align='center'>
5
  <h1> MolCRAFT Series for Drug Design </h1>
6
 
 
71
  * Achieves 95.9% PoseBusters passing rate on CrossDock with significantly improved molecular geometries.
72
 
73
  ---
74
+ # MolCRAFT
75
+ Official implementation of ICML 2024 ["MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space"](https://arxiv.org/abs/2404.12141).
76
 
77
+ 🎉 Our demo is now available [here](http://61.241.63.126:8000). Feel free to try it!
78
 
79
+ ![](../asset/molcraft_framework.png)
80
 
81
+ ## Environment
82
+ We highly recommend installing via Docker if a Linux server with an NVIDIA GPU is available.
83
 
84
+ Otherwise, see the [README for env](docker/README.md) for further details on the Docker or conda setup.
85
 
86
+ ### Prerequisite
87
+ Docker with `nvidia-container-runtime` enabled is required on your Linux system.
88
 
89
+ > [!TIP]
90
+ > - This repo provides an easy-to-use script that installs Docker and nvidia-container-runtime: run `sudo ./setup_docker_for_host.sh` inside `./docker` to set up your host machine.
91
+ > - For details, please refer to the [install guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).
92
 
 
93
 
94
+ ### Install via Docker
95
+ We highly recommend setting up the environment via Docker, since it only takes a single `make` command.
96
+ ```bash
97
+ cd ./docker
98
+ make
99
+ ```
100
 
 
101
 
102
+ -----
103
+ ## Data
104
+ We use the same data as [TargetDiff](https://github.com/guanjq/targetdiff/tree/main?tab=readme-ov-file#data). The data used for training and evaluating the model is available in the [data](https://drive.google.com/drive/folders/1j21cc7-97TedKh_El5E34yI8o5ckI7eK?usp=share_link) Google Drive folder and should be placed in the `data` folder by default.
105
+
106
+ To train the model from scratch, download the LMDB file and the split file into the `data` folder:
107
+ * `crossdocked_v1.1_rmsd1.0_pocket10_processed_final.lmdb`
108
+ * `crossdocked_pocket10_pose_split.pt`
109
+
110
+ To evaluate the model on the test set, download _and_ unzip `test_set.zip` into the `data` folder. It includes the original PDB files used for Vina docking.
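
A quick way to confirm the data layout described above is a small check script. This is only a sketch: the two file names come from this README, while the `test_set` directory name (what unzipping `test_set.zip` is assumed to produce) and the helper `missing_data` are illustrative assumptions, not part of the repository.

```python
from pathlib import Path

# File names are taken from the README. The "test_set" directory name is an
# assumption about what unzipping test_set.zip produces.
TRAIN_FILES = [
    "crossdocked_v1.1_rmsd1.0_pocket10_processed_final.lmdb",
    "crossdocked_pocket10_pose_split.pt",
]
TEST_DIR = "test_set"


def missing_data(data_root="data", need_test_set=True):
    """Return the expected entries that are absent from data_root."""
    root = Path(data_root)
    expected = list(TRAIN_FILES)
    if need_test_set:
        expected.append(TEST_DIR)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_data()
    if missing:
        print("Missing from ./data:", ", ".join(missing))
    else:
        print("All expected data entries are present.")
```

Running it from the repository root before training or evaluation lists whatever still needs to be downloaded.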
111
+
112
 
113
  ---
114
+ ## Training
115
+ Run `make -f scripts.mk` (no data preparation needed), or alternatively, with the `data` folder correctly configured, run:
116
+ ```bash
117
+ python train_bfn.py --exp_name ${EXP_NAME} --revision ${REVISION}
118
+ ```
119
 
120
+ The default values are equivalent to:
121
+ ```bash
122
+ python train_bfn.py --sigma1_coord 0.03 --beta1 1.5 --lr 5e-4 --time_emb_dim 1 --epochs 15 --max_grad_norm Q --destination_prediction True --use_discrete_t True --num_samples 10 --sampling_strategy end_back_pmf
123
+ ```
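
When launching several runs with small variations, it can help to assemble the command programmatically. The sketch below copies the flag values from the command above; the helper `build_train_command` is hypothetical, and it assumes that `--exp_name` and `--revision` can be combined with the hyperparameter flags in a single call.

```python
import shlex

# Default values copied from the README's training command.
DEFAULTS = {
    "sigma1_coord": "0.03",
    "beta1": "1.5",
    "lr": "5e-4",
    "time_emb_dim": "1",
    "epochs": "15",
    "max_grad_norm": "Q",
    "destination_prediction": "True",
    "use_discrete_t": "True",
    "num_samples": "10",
    "sampling_strategy": "end_back_pmf",
}


def build_train_command(exp_name, revision, overrides=None):
    """Assemble the argv list for train_bfn.py (hypothetical helper)."""
    options = {**DEFAULTS, **(overrides or {})}
    argv = ["python", "train_bfn.py", "--exp_name", exp_name, "--revision", revision]
    for flag, value in options.items():
        argv += [f"--{flag}", str(value)]
    return argv


if __name__ == "__main__":
    # Print a shell-safe command line with one overridden default.
    print(shlex.join(build_train_command("my_exp", "v1", {"epochs": 1})))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues.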
124
 
125
+ ### Testing
126
+ For quick evaluation of the official checkpoint, refer to `make evaluate` in `scripts.mk`:
127
+ ```bash
128
+ python train_bfn.py --test_only --no_wandb --ckpt_path ./checkpoints/${CKPT_NAME}
129
+ ```
130
 
131
+ ### Debugging
132
+ To quickly debug the training process, run `make debug -f scripts.mk`:
133
+ ```bash
134
+ python train_bfn.py --no_wandb --debug --epochs 1
135
+ ```
 
 
136
 
137
+ ## Sampling
138
+ We provide the pretrained MolCRAFT checkpoint [here](https://drive.google.com/file/d/1TcUQM7Lw1klH2wOVBu20cTsvBTcC1WKu/view?usp=share_link).
139
+
140
+
141
+ ### Sampling for pockets in the test set
142
+ Run `make evaluate -f scripts.mk`, or alternatively,
143
+ ```bash
144
+ python train_bfn.py --config_file configs/default.yaml --exp_name ${EXP_NAME} --revision ${REVISION} --test_only --num_samples ${NUM_MOLS_PER_POCKET} --sample_steps 100
145
+ ```
146
 
147
+ The output molecules (`vina_docked.pt`) for all 100 test pockets will be saved in the `./logs/${USER}_bfn_sbdd/${EXP_NAME}/${REVISION}/test_outputs/${TIMESTAMP}` folder.
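
Because each run writes to its own `${TIMESTAMP}` folder, finding the most recent result takes a little path handling. The sketch below follows the path pattern stated above; the helper names are hypothetical, and it assumes that timestamp folder names sort chronologically as plain strings, which this README does not confirm.

```python
import os
from pathlib import Path


def output_root(exp_name, revision, user=None, log_root="logs"):
    """Directory that holds one sub-folder per sampling run."""
    user = user or os.environ.get("USER", "unknown")
    return Path(log_root) / f"{user}_bfn_sbdd" / exp_name / revision / "test_outputs"


def latest_result(exp_name, revision, user=None, log_root="logs"):
    """Return the newest vina_docked.pt, or None if no run has produced one.

    Assumes timestamp folder names sort chronologically as plain strings.
    """
    root = output_root(exp_name, revision, user, log_root)
    candidates = sorted(root.glob("*/vina_docked.pt"))
    return candidates[-1] if candidates else None


if __name__ == "__main__":
    print(latest_result("my_exp", "v1"))
```

The returned path can then be loaded with whatever tooling the evaluation step expects.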
148
+
149
+ ### Sampling from a PDB file
150
+ To sample from a whole protein PDB file, we need the corresponding reference ligand to clip the protein pocket (a 10 Å region around the reference position).
151
+
152
+ Below is an example that stores the 10 generated molecules under the `output` folder. The configurations are managed in the `call()` function of `sample_for_pocket.py`.
153
+
154
+ ```bash
155
+ python sample_for_pocket.py ${PDB_PATH} ${SDF_PATH}
156
+ ```
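
To illustrate what clipping a pocket around a reference ligand means, here is a minimal atom-level sketch in pure Python. It is not the repository's implementation: `clip_pocket` is a hypothetical helper, the distance criterion (within the radius of any ligand atom) is an assumption, and the actual code may select whole residues or measure distance differently.

```python
import math


def clip_pocket(protein_atoms, ligand_atoms, radius=10.0):
    """Keep protein atoms within `radius` angstroms of any ligand atom.

    Both arguments are sequences of (x, y, z) coordinate tuples.
    """
    return [
        atom
        for atom in protein_atoms
        if any(math.dist(atom, lig) <= radius for lig in ligand_atoms)
    ]


if __name__ == "__main__":
    protein = [(0.0, 0.0, 0.0), (9.0, 0.0, 0.0), (25.0, 0.0, 0.0)]
    ligand = [(1.0, 0.0, 0.0)]
    # Distances to the ligand atom are 1, 8, and 24, so the last atom is dropped.
    print(clip_pocket(protein, ligand))
```

Without a reference ligand there is no position to measure from, which is why the SDF file is a required argument.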
157
+
158
+ ## Evaluation
159
+ ### Evaluating molecules
160
+ Binding affinity (Vina Score / Min / Dock) and molecular properties (QED, SA) are calculated upon sampling.
161
+
162
+ For PoseCheck (strain energy, clashes) and other conformational results (bond length, bond angle, torsion angle, RMSD), please refer to the `test` folder.
163
+
164
+ ### Evaluating meta files
165
+ We provide samples for all SBDD baselines in the [sample](https://drive.google.com/drive/folders/1A3Mthm9ksbfUnMCe5T2noGsiEV1RfChH?usp=sharing) Google Drive folder.
166
+
167
+ You can download `all_samples.tar.gz` and run `tar xzvf all_samples.tar.gz`, which extracts all the `.pt` files into the `samples` folder for evaluation.
168
+
169
+ <!-- ## Demo
170
+ ### Host our web app demo locally
171
+
172
+ With ``gradio`` and ``gradio_molecule3d`` installed, you can simply run ``python app.py`` to open the demo locally. Port mapping has been set in Makefile if you are using docker. You should also forward this port if you run the docker in an ssh server. We will share a permanent demo link later.
173
+
174
+ Great thanks to @duerrsimon for his kind support in resolving rendering issues! -->
175
+
176
+ ## Citation
177
+
178
+ ```bibtex
179
  @article{qu2024molcraft,
180
  title={MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space},
181
  author={Qu, Yanru and Qiu, Keyue and Song, Yuxuan and Gong, Jingjing and Han, Jiawei and Zheng, Mingyue and Zhou, Hao and Ma, Wei-Ying},
 
189
  journal={ICML 2024},
190
  year={2024}
191
  }
192
+ ```