---
title: AmberPrep
emoji: 🧬
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
---

# AmberPrep

**AmberPrep** is a web-based pipeline for preparing structures and setting up molecular dynamics (MD) simulations with the AMBER force field. It integrates structure completion (ESMFold), structure preparation, force field parameterization, simulation file generation, and PLUMED-based biased MD in a single interface.

---

## Features

| Section | Description |
|--------|-------------|
| **Protein Loading** | Upload PDB files or fetch from RCSB PDB; 3D visualization with NGL |
| **Fill Missing Residues** | Detect missing residues (RCSB annotations), complete with ESMFold, optional trimming and energy minimization of the predicted structure |
| **Structure Preparation** | Remove water/ions/H; add ACE/NME capping; chain and ligand selection; GAFF/GAFF2 parameterization |
| **Ligand Docking** | AutoDock Vina + Meeko; configurable search box; pose selection, with the selected ligand pose used to set up MD simulations |
| **Simulation Parameters** | Force fields (ff14SB, ff19SB), water models (TIP3P, SPCE), box size, temperature, pressure |
| **Simulation Steps** | Restrained minimization, minimization, NVT, NPT, production; each with configurable parameters |
| **Generate Files** | AMBER `.in` files, `prmtop`/`inpcrd`, PBS submission scripts |
| **PLUMED** | Collective variables (PLUMED v2.9), `plumed.dat` editor, and simulation file generation with PLUMED |

---

## Requirements for Custom PDB Files

For **custom PDB files** (uploaded or fetched), ensure:

| Requirement | Description |
|-------------|-------------|
| **Chain IDs** | Chain IDs must be clearly marked in the PDB (column 22). The pipeline uses them for chain selection, missing-residue filling, and structure preparation. |
| **Ligands as HETATM** | All non-protein, non-water, non-ion molecules (e.g., cofactors, drugs) must be in **HETATM** records. The pipeline detects and lists only HETATM entities as ligands. |
| **Standard amino acids** | AmberPrep supports **standard amino acids** only. Non-standard residues (e.g., MSE, HYP, SEC, non-canonical modifications) are not explicitly parameterized; pre-process or replace them before use if needed. |

For RCSB structures, the pipeline parses the header and HETATM as provided; for your own PDBs, apply the above conventions.
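
Before uploading a custom PDB, the conventions in the table can be checked with a short script. The following is a minimal sketch using only the Python standard library; it is not part of AmberPrep, and the function name `check_pdb_conventions` is hypothetical. It flags `ATOM`/`HETATM` records with a blank chain ID in column 22 and `ATOM` records whose residue name is not one of the 20 standard amino acids (which will also flag AMBER-specific names such as `HIE` or `CYX`, so treat the output as a review list rather than a verdict).

```python
# The 20 standard amino acids, by three-letter code.
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def check_pdb_conventions(pdb_text):
    """Return a list of human-readable problems found in a PDB string."""
    problems = []
    for line_no, line in enumerate(pdb_text.splitlines(), start=1):
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        padded = line.ljust(27)           # guard against short lines
        res_name = padded[17:20].strip()  # columns 18-20: residue name
        chain_id = padded[21]             # column 22: chain ID
        if chain_id == " ":
            problems.append(f"line {line_no}: missing chain ID in column 22")
        if record == "ATOM" and res_name not in STANDARD_RESIDUES:
            problems.append(
                f"line {line_no}: residue {res_name} in an ATOM record is not "
                "a standard amino acid"
            )
    return problems


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        for problem in check_pdb_conventions(handle.read()):
            print(problem)
```

Run it as `python check_pdb.py my_structure.pdb` (the script and file names are placeholders); no output means none of the checked conventions were violated.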

---

## Installation

### Option 1: Docker (recommended)

```bash
git clone https://github.com/your-org/AmberPrep.git
cd AmberPrep
docker build -t amberprep .
docker run -p 7860:7860 amberprep
```

Open `http://localhost:7860` in your browser.

### Option 2: Local (Conda + pip)

1. **Conda environment** with AMBER tools, PyMOL, and Python 3.10–3.11:

   ```bash
   conda create -n amberprep python=3.11
   conda activate amberprep
   conda install -c conda-forge ambertools pymol-open-source
   ```

2. **Conda packages for docking** (Vina, Open Babel; Meeko is via pip):

   ```bash
   conda install -c conda-forge autodock-vina openbabel
   ```

3. **Python packages**:

   ```bash
   pip install -r requirements.txt
   # or: pip install flask flask-cors biopython numpy pandas matplotlib seaborn mdanalysis gunicorn requests rdkit meeko vina
   ```

4. **Run the app**:

   ```bash
   python start_web_server.py
   ```

   The app listens on `http://0.0.0.0:7860` by default.

### Option 3: Install from PyPI (when published)

```bash
pip install amberprep
# Requires: AMBER tools, PyMOL, AutoDock Vina, Open Babel (e.g. conda install -c conda-forge ambertools pymol-open-source autodock-vina openbabel)
amberprep
```

See **[PACKAGING.md](PACKAGING.md)** for the dependency list, build instructions, and PyPI release steps.

---

## Usage

### 1. Protein Loading

- **Upload**: Drag-and-drop or choose a `.pdb` / `.ent` file.
- **Fetch**: Enter a 4-character PDB ID (e.g. `1CRN`) to download from RCSB.

After loading, the **Protein Preview** shows: structure ID, atom count, chains, residues, water, ions, ligands, and HETATM count. Use the 3D viewer to inspect the structure.
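
The fetch step and part of the preview can be approximated outside the app. This sketch assumes the public RCSB download URL pattern `https://files.rcsb.org/download/<ID>.pdb`; whether AmberPrep uses the same endpoint internally is not stated here, and the helper names (`rcsb_pdb_url`, `fetch_pdb`, `preview_counts`) are hypothetical.

```python
import re
import urllib.request

RCSB_DOWNLOAD_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"


def rcsb_pdb_url(pdb_id):
    """Validate a 4-character PDB ID and return its RCSB download URL."""
    pdb_id = pdb_id.strip().upper()
    # Assumption: classic IDs are one digit followed by three alphanumerics.
    if not re.fullmatch(r"[0-9][A-Z0-9]{3}", pdb_id):
        raise ValueError(f"not a 4-character PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD_URL.format(pdb_id=pdb_id)


def fetch_pdb(pdb_id, timeout=30):
    """Download the PDB file as text (requires network access)."""
    with urllib.request.urlopen(rcsb_pdb_url(pdb_id), timeout=timeout) as response:
        return response.read().decode("utf-8")


def preview_counts(pdb_text):
    """Count ATOM and HETATM records and collect the chain IDs they use."""
    atoms = hetatms = 0
    chains = set()
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            atoms += 1
        elif line.startswith("HETATM"):
            hetatms += 1
        else:
            continue
        if len(line) > 21 and line[21] != " ":
            chains.add(line[21])  # column 22 holds the chain ID
    return {"atoms": atoms, "hetatm": hetatms, "chains": sorted(chains)}
```

For example, `preview_counts(fetch_pdb("1CRN"))` returns a dictionary with the record counts and chain IDs for that entry.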

---

### 2. Fill Missing Residues

- Click **Analyze Missing Residues** to detect gaps from RCSB metadata.
- **Select chains** to complete with ESMFold.
- **Trim residues** (optional): remove residues from N- or C-terminal edges; internal loops are always filled by ESMFold.
- **Energy minimization** (optional): if you enable ESMFold completion, you can minimize selected chains to resolve clashes before docking. Recommended if receptor preparation (Meeko) fails later.
- **Build Completed Structure** to run ESMFold and (if requested) minimization. Use **Preview Completed Structure** and **View Superimposed Structures** to compare original and completed chains.

> If you use ESMFold in this workflow, please cite [ESM Atlas](https://esmatlas.com/about).
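
To see which residues are reported as missing before running the analysis, the `REMARK 465` block of a PDB file can be read directly. This is a sketch under the assumption that the file carries standard `REMARK 465` records; the app's own detection may rely on a different RCSB source, and the function name is hypothetical. Residues with insertion codes (e.g. `12A`) are skipped for simplicity.

```python
from collections import defaultdict


def missing_residues_from_remark_465(pdb_text):
    """Map chain ID -> list of (residue number, residue name) from REMARK 465."""
    missing = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith("REMARK 465"):
            continue
        fields = line[10:].split()
        # Data rows look like "[model] RES CHAIN SEQ"; drop an optional model number.
        if len(fields) == 4:
            fields = fields[1:]
        if len(fields) != 3:
            continue  # free-text header lines
        res_name, chain_id, seq = fields
        if len(chain_id) != 1:
            continue
        try:
            seq_num = int(seq)
        except ValueError:
            continue  # column header row or a residue with an insertion code
        missing[chain_id].append((seq_num, res_name))
    return dict(missing)
```

The per-chain lists make it easy to tell terminal gaps (candidates for trimming) from internal loops (always filled by ESMFold).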

---

### 3. Structure Preparation

- **Remove**: Water, ions, and hydrogens (options are pre-configured).
- **Add capping**: ACE (N-terminal) and NME (C-terminal).
- **Chains**: Select which protein chains to keep for force field generation.
- **Ligands**:
  - **Preserve ligands** to keep them in the structure.
  - **Select ligands to preserve** (e.g. `GOL-A-1`, `LIZ-A`). Unselected ligands are dropped.
  - **Create separate ligand file** to export selected ligand(s) to a PDB.

Click **Prepare Structure**. The status panel reports original vs prepared atom counts, removed components, added capping, and preserved ligands. Use **View Prepared Structure** and **Download Prepared PDB** as needed.

**Ligand Docking** (nested in this tab):

- Select ligands to dock.
- Set the **search space** (center and size in X, Y, Z) with live 3D visualization.
- **Run Docking** (AutoDock Vina + Meeko). Progress and logs are shown in the docking panel.
- **Select poses** per ligand and **Use selected pose** to write the chosen pose into the structure for AMBER. You can switch modes (e.g. 1–9) and jump by clicking the mode labels.
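
The search space set in the UI corresponds to the box keys of an AutoDock Vina configuration file. The sketch below writes such a file from a center and a size; the keys (`center_x`, `size_x`, `exhaustiveness`, `num_modes`, and so on) are standard Vina options, while the function name and the receptor/ligand file names are hypothetical and not taken from AmberPrep's code.

```python
def vina_config(receptor, ligand, center, size, exhaustiveness=8, num_modes=9):
    """Return the text of a Vina config file for a box given in angstroms."""
    if min(size) <= 0:
        raise ValueError("box dimensions must be positive")
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx:.3f}",
        f"center_y = {cy:.3f}",
        f"center_z = {cz:.3f}",
        f"size_x = {sx:.3f}",
        f"size_y = {sy:.3f}",
        f"size_z = {sz:.3f}",
        f"exhaustiveness = {exhaustiveness}",
        f"num_modes = {num_modes}",  # number of poses (modes) to report
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = vina_config("receptor.pdbqt", "ligand.pdbqt", (12.5, -3.0, 8.25), (20, 20, 20))
    print(text)
```

With `num_modes = 9`, Vina reports up to nine poses, which matches the mode range shown in the pose selector.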

---

### 4. Simulation Parameters

- **Force field**: ff14SB or ff19SB.
- **Water model**: TIP3P or SPCE.
- **Box size** (Å): padding for solvation.
- **Add ions**: to neutralize (and optionally reach a salt concentration).
- **Temperature** and **Pressure** (e.g. 300 K, 1 bar).
- **Time step** and **Cutoff** for non-bonded interactions.

If ligands were preserved, **Ligand force field** (GAFF/GAFF2) is configured here; net charge is computed before `antechamber` runs.
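
These choices end up in a `tleap` script. As a rough illustration for a protein-only system, the sketch below maps the force field and water model to the corresponding `leaprc` files and solvent box names from AmberTools. It is an assumption about what such a script looks like, not the script AmberPrep generates; the function name is hypothetical and ligand (GAFF/GAFF2) handling is omitted.

```python
# leaprc files and solvent box units shipped with AmberTools.
FORCE_FIELDS = {
    "ff14SB": "leaprc.protein.ff14SB",
    "ff19SB": "leaprc.protein.ff19SB",
}
WATER_MODELS = {
    "TIP3P": ("leaprc.water.tip3p", "TIP3PBOX"),
    "SPCE": ("leaprc.water.spce", "SPCBOX"),
}


def tleap_script(pdb_file, force_field="ff14SB", water="TIP3P",
                 box_padding=10.0, prefix="protein"):
    """Return a tleap input that solvates, neutralizes, and saves the system."""
    leaprc_ff = FORCE_FIELDS[force_field]
    leaprc_water, box_unit = WATER_MODELS[water]
    lines = [
        f"source {leaprc_ff}",
        f"source {leaprc_water}",
        f"mol = loadpdb {pdb_file}",
        f"solvatebox mol {box_unit} {box_padding:.1f}",  # padding in angstroms
        "addions mol Na+ 0",  # a count of 0 asks tleap to neutralize
        "addions mol Cl- 0",
        f"saveamberparm mol {prefix}.prmtop {prefix}.inpcrd",
        "quit",
    ]
    return "\n".join(lines) + "\n"
```

Calling `tleap_script("tleap_ready.pdb", "ff19SB", "SPCE", 12.0)` yields a script that can be passed to `tleap -f`.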

---

### 5. Simulation Steps

Enable/disable and set parameters for:

- **Restrained minimization** (steps, force constant)
- **Minimization** (steps, cutoff)
- **NVT heating** (steps, temperature)
- **NPT equilibration** (steps, temperature, pressure)
- **Production** (steps, temperature, pressure)

---

### 6. Generate Files

- **Generate All Files** to create AMBER inputs (`min_restrained.in`, `min.in`, `HeatNPT.in`, `mdin_equi.in`, `mdin_prod.in`), `tleap` scripts, `submit_job.pbs`, and (after `tleap`) `prmtop`/`inpcrd`.
- **Preview Files** to open and **edit** each file (e.g. `min.in`, `submit_job.pbs`) and **Save**; changes are written to the output directory.
- **Preview Solvated Protein** / **Download Solvated Protein** to inspect and download the solvated system.

For **PLUMED-based runs**, go to the **PLUMED** tab to configure CVs and `plumed.dat`, then use **Generate simulation files** there to produce inputs that include PLUMED.
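
The generated inputs are meant to run in sequence, each stage starting from the restart file of the previous one. The sketch below builds that chain of commands from the file names listed above. The `sander` flags (`-O`, `-i`, `-o`, `-p`, `-c`, `-r`, `-x`, `-ref`) are standard, but the output and restart file names and the function itself are hypothetical, and the actual contents of `submit_job.pbs` may differ.

```python
# Stage order, named after the generated .in files.
STAGES = ["min_restrained", "min", "HeatNPT", "mdin_equi", "mdin_prod"]


def sander_commands(prmtop="protein.prmtop", inpcrd="protein.inpcrd", engine="sander"):
    """Return one command per stage, chaining restart files between stages."""
    commands = []
    coords = inpcrd
    for stage in STAGES:
        restart = f"{stage}.rst"
        command = (
            f"{engine} -O -i {stage}.in -o {stage}.out "
            f"-p {prmtop} -c {coords} -r {restart} -x {stage}.nc"
        )
        if stage == "min_restrained":
            # Restrained runs need reference coordinates for the restraints.
            command += f" -ref {coords}"
        commands.append(command)
        coords = restart  # the next stage starts from this restart file
    return commands


if __name__ == "__main__":
    print("\n".join(sander_commands()))
```

Passing `engine="pmemd.cuda"` produces the same chain for a GPU build, assuming one is installed.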

---

### 7. PLUMED

- **Collective Variables**: search and select CVs from the PLUMED v2.9 set; view docs and add/edit lines in `plumed.dat`.
- **Custom PLUMED**: edit `plumed.dat` directly.
- **Generate simulation files**: create AMBER + PLUMED input files. Generated files can be **previewed, edited, and saved** as in the main **Generate Files** tab.

> PLUMED citation: [plumed.org/cite](https://www.plumed.org/cite).
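
For orientation, a very small `plumed.dat` defines a collective variable and prints it. The sketch below generates one that monitors a single interatomic distance; `DISTANCE` and `PRINT` are standard PLUMED actions, while the function name, label, stride, and atom indices are placeholders. It adds no bias, so it only records the CV during the run.

```python
def plumed_distance_input(atom_a, atom_b, label="d1", stride=500, colvar_file="COLVAR"):
    """Return a plumed.dat that tracks the distance between two atoms."""
    if atom_a == atom_b or min(atom_a, atom_b) < 1:
        raise ValueError("atom indices must be distinct and 1-based")
    lines = [
        "# Distance between two atoms (PLUMED atom indices are 1-based)",
        f"{label}: DISTANCE ATOMS={atom_a},{atom_b}",
        f"PRINT ARG={label} FILE={colvar_file} STRIDE={stride}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(plumed_distance_input(120, 2450))
```

The output can be pasted into the **Custom PLUMED** editor and extended with a bias action as needed.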

---

## Pipeline Overview

```
Protein Loading (upload/fetch)
        ↓
Fill Missing Residues (detect → ESMFold → optional trim & minimize)
        ↓
Structure Preparation (clean, cap, chains, ligands) → optional Docking (Vina, apply pose)
        ↓
Simulation Parameters (FF, water, box, T, P, etc.)
        ↓
Simulation Steps (min, NVT, NPT, prod)
        ↓
Generate Files (AMBER .in, tleap, prmtop/inpcrd, PBS)
        ↓
[Optional] PLUMED (CVs, plumed.dat, generate PLUMED-enabled files)
```

---

## Output Layout

Generated files are written under `output/` (or the path set in the app), for example:

- `0_original_input.pdb`: raw input
- `1_protein_no_hydrogens.pdb`: cleaned, capped, chain/ligand selection applied
- `2_protein_with_caps.pdb`, `tleap_ready.pdb`: intermediates
- `4_ligands_corrected_*.pdb`: prepared ligands
- `protein.prmtop`, `protein.inpcrd`: after `tleap`
- `min_restrained.in`, `min.in`, `HeatNPT.in`, `mdin_equi.in`, `mdin_prod.in`, `submit_job.pbs`
- `output/docking/`: receptor, ligands, Vina configs, poses, logs
- `plumed.dat`: when using PLUMED

---

## Dependencies

| Category | Tools / libraries |
|----------|-------------------|
| **Python** | Flask, Flask-CORS, BioPython, NumPy, Pandas, Matplotlib, Seaborn, MDAnalysis, Requests, RDKit, SciPy |
| **AMBER** | AMBER Tools (tleap, antechamber, sander, ambpdb, etc.) |
| **Docking** | Meeko (`mk_prepare_ligand`, `mk_prepare_receptor`), AutoDock Vina, Open Babel |
| **Visualization** | PyMOL (scripted for H removal, structure editing), NGL (in-browser 3D) |
| **Structure completion** | ESMFold (via API or local, depending on deployment) |

---

## Project Structure

```
AmberPrep/
├── start_web_server.py      # Entry point
├── html/
│   ├── index.html           # Main UI
│   └── plumed.html          # PLUMED-focused view (if used)
├── css/
│   ├── styles.css
│   └── plumed.css
├── js/
│   ├── script.js            # Main frontend logic
│   ├── plumed.js            # PLUMED + docking UI
│   └── plumed_cv_docs.js    # CV documentation
├── python/
│   ├── app.py               # Flask backend, API, file generation
│   ├── structure_preparation.py
│   ├── add_caps.py          # ACE/NME capping
│   ├── Fill_missing_residues.py  # ESMFold, trimming, minimization
│   ├── docking.py           # Docking helpers
│   └── docking_utils.py
├── output/                  # Generated files (gitignored in dev)
├── Dockerfile
└── README.md
```

---

## Citation

If you use AmberPrep in your work, please cite:

```bibtex
@software{AmberPrep,
  title = {AmberPrep: Molecular Dynamics and Docking Pipeline},
  author = {Nagar, Hemant},
  year = {2025},
  url = {https://github.com/your-org/AmberPrep}
}
```

**Related software to cite when used:**

- **AMBER**: [ambermd.org](https://ambermd.org)
- **PLUMED**: [plumed.org/cite](https://www.plumed.org/cite)
- **ESMFold / ESM Atlas**: [esmatlas.com/about](https://esmatlas.com/about)
- **AutoDock Vina**: Trott & Olson, *J. Comput. Chem.* (2010)
- **Meeko**: [github.com/forlilab/Meeko](https://github.com/forlilab/Meeko)

---

## Acknowledgments

- **Mohd Ibrahim** (Technical University of Munich) for the protein capping logic (`add_caps.py`).

---

## License

MIT License. See `LICENSE` for details.

---

## Contact

- **Author**: Hemant Nagar  
- **Email**: hn533621@ohio.edu