---
title: AmberMDFlow
emoji: 🧬
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
short_description: Web-based MD pipeline to set up simulations for AMBER
---
# AmberMDFlow
**AmberMDFlow** is a web-based pipeline for preparing structures and setting up molecular dynamics (MD) simulations with the AMBER force field. It integrates structure completion (ESMFold), structure preparation, force field parameterization, simulation file generation, and PLUMED-based biased MD in a single interface. This is a beta release.
---
## Note
- If you plan to dock ligands and have filled missing residues in the protein chain using ESMFold, you should also energy-minimize the structure.
- The **Fill Missing Residues** option works only for PDB files retrieved from the RCSB database, as it relies on the REMARK 465 records to identify missing residues.
- The public ESMFold API is used to predict protein structures from input sequences, and it supports sequences of up to 400 amino acids.
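The REMARK 465 dependency noted above can be illustrated with a short parser. This is a minimal sketch, not AmberMDFlow's own implementation: the function name `parse_remark_465` is hypothetical, and it assumes the usual whitespace-separated layout of RCSB REMARK 465 data lines (optional model number, residue name, chain ID, residue number). Residue numbers that carry an insertion code are skipped here for simplicity.

```python
from collections import defaultdict


def parse_remark_465(pdb_text):
    """Return {chain_id: [(residue_number, residue_name), ...]} from REMARK 465 lines."""
    missing = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith("REMARK 465"):
            continue
        fields = line.split()[2:]  # drop "REMARK" and "465"
        if len(fields) < 3:
            continue
        res_name, chain_id, res_seq = fields[-3:]
        # Explanatory header lines do not end in an integer residue number.
        try:
            res_num = int(res_seq)
        except ValueError:
            continue
        if len(chain_id) != 1 or len(res_name) > 3:
            continue
        missing[chain_id].append((res_num, res_name))
    return dict(missing)
```

A file without REMARK 465 records yields an empty dictionary, which is why gaps in a custom PDB that lacks these records cannot be detected this way.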
---
## Features
| Section | Description |
|--------|-------------|
| **Protein Loading** | Upload PDB files or fetch from RCSB PDB; 3D visualization with NGL |
| **Fill Missing Residues** | Detect missing residues (RCSB annotations), complete them with ESMFold, optionally trim and energy-minimize the predicted structure |
| **Structure Preparation** | Remove water/ions/H; add ACE/NME capping; chain and ligand selection; GAFF/GAFF2 parameterization |
| **Ligand Docking** | AutoDock Vina + Meeko; configurable search box; pose selection, with the selected ligand pose used to set up MD simulations |
| **Simulation Parameters** | Force fields (ff14SB, ff19SB), water models (TIP3P, SPCE), box size, temperature, pressure |
| **Simulation Steps** | Restrained minimization, minimization, NVT, NPT, production; each with configurable parameters |
| **Generate Files** | AMBER `.in` files, `prmtop`/`inpcrd`, PBS submission scripts |
| **PLUMED** | Collective variables (PLUMED v2.9), `plumed.dat` editor, and simulation file generation with PLUMED |
---
## Requirements for Custom PDB Files
For **custom PDB files** (uploaded or fetched), ensure:
| Requirement | Description |
|-------------|-------------|
| **Chain IDs** | Chain IDs (A,B,C..) must be clearly marked in the PDB file. |
| **Ligands as HETATM** | All ligands must be in **HETATM** records with the residue name and chain ID set (e.g. `LIG` and `A`). |
| **Standard amino acids** | AmberMDFlow supports **standard amino acids** only. Non-standard residues and residues with PTMs are currently not supported in the pipeline. |
For RCSB structures, the pipeline parses the header and HETATM as provided; for your own PDBs, apply the above conventions.
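The conventions in the table above can be checked before upload with a small script. The following is a sketch under stated assumptions: `check_custom_pdb` is a hypothetical helper, not part of AmberMDFlow, and it reads the fixed PDB columns for residue name (18-20) and chain ID (22). It only recognizes the 20 canonical residue names, so AMBER-specific names such as `HIE` or `CYX` would also be reported.

```python
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def check_custom_pdb(pdb_text):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    reported = set()
    for line_no, line in enumerate(pdb_text.splitlines(), start=1):
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        res_name = line[17:20].strip()  # columns 18-20
        chain_id = line[21:22].strip()  # column 22
        if not chain_id:
            problems.append(f"line {line_no}: {record} record has no chain ID")
        if not res_name:
            problems.append(f"line {line_no}: {record} record has no residue name")
        elif record == "ATOM" and res_name not in STANDARD_RESIDUES and res_name not in reported:
            # Report each non-standard residue name once.
            reported.add(res_name)
            problems.append(f"line {line_no}: non-standard residue {res_name} in ATOM records")
    return problems
```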
---
## Quick Start
Try AmberMDFlow instantly on Hugging Face Spaces (no installation required):
**[https://huggingface.co/spaces/hemantn/AmberMDFlow](https://huggingface.co/spaces/hemantn/AmberMDFlow)**
---
## Installation
### Prerequisites
AmberMDFlow requires scientific packages that are only available via **conda** (not PyPI). You must install these first:
| Package | Purpose |
|---------|---------|
| `ambertools` | AMBER MD tools (tleap, antechamber, sander) |
| `pymol-open-source` | Structure visualization and editing |
| `autodock-vina` | AutoDock Vina 1.1.2 molecular docking (from bioconda) |
| `openbabel` | Molecule format conversion |
| `rdkit` | Cheminformatics toolkit |
| `gemmi` | Structure file parsing (required by Meeko) |
---
### Option 1: pip install (recommended)
```bash
# Step 1: Create conda environment with required tools
conda create -n ambermdflow python=3.11 -y
conda activate ambermdflow
# Step 2: Install conda-only dependencies
conda install -c conda-forge -c bioconda ambertools pymol-open-source autodock-vina openbabel rdkit gemmi -y
# Step 3: Install AmberMDFlow from Test PyPI
pip install --extra-index-url https://test.pypi.org/simple/ ambermdflow
# Step 4: Run the web app
ambermdflow
```
Open your browser at **http://localhost:7860**
---
### Option 2: Docker (no conda/pip needed)
**Build from source:**
```bash
git clone https://github.com/nagarh/AmberMDFlow.git
cd AmberMDFlow
docker build -t ambermdflow .
docker run -p 7860:7860 ambermdflow
```
Open your browser at **http://localhost:7860**
---
### Troubleshooting
| Issue | Solution |
|-------|----------|
| `ModuleNotFoundError: No module named 'gemmi'` | Run: `conda install -c conda-forge gemmi` |
| `vina: command not found` | Run: `conda install -c conda-forge vina` |
| Port 7860 already in use | Kill the process or edit `start_web_server.py` to use a different port |
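The three issues in the table can be detected up front with a small preflight script. This is a sketch using only the Python standard library; the function names are hypothetical, and the default lists of executables and modules are assumptions drawn from the prerequisites above (`obabel` is the Open Babel command-line tool).

```python
import importlib.util
import shutil
import socket


def missing_executables(names=("tleap", "antechamber", "vina", "obabel")):
    """Return the command-line tools from `names` that are not on PATH."""
    return [name for name in names if shutil.which(name) is None]


def missing_modules(names=("gemmi", "rdkit", "meeko")):
    """Return the top-level Python modules from `names` that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def port_in_use(port, host="127.0.0.1"):
    """Return True if something already accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0
```

Running `missing_executables()`, `missing_modules()`, and `port_in_use(7860)` inside the activated conda environment indicates which of the fixes above applies.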
---
## Usage
### 1. Protein Loading
- **Upload**: Drag-and-drop or choose a `.pdb` file.
- **Fetch**: Enter a 4-character PDB ID (e.g. `1HPV`) to download from RCSB.
After loading, the **Protein Preview** shows: structure ID, atom count, chains, residues, water, ions, ligands, and HETATM count. Use the 3D viewer to inspect the structure.
---
### 2. Fill Missing Residues
- Click **Analyze Missing Residues** to detect gaps from RCSB metadata.
- **Select chains** to complete with ESMFold.
- **Trim residues** (optional): remove residues from N- or C-terminal edges; internal loops are always filled by ESMFold.
- **Energy minimization** (optional): if you enable ESMFold completion, you can minimize selected chains to resolve clashes before docking. Recommended if receptor preparation (Meeko) fails later.
- **Build Completed Structure** to run ESMFold and (if requested) minimization. Use **Preview Completed Structure** and **View Superimposed Structures** to compare original and completed chains.
> If you use ESMFold in this workflow, please cite [ESM Atlas](https://esmatlas.com/about).
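The distinction between terminal and internal gaps in the trimming step can be sketched as follows. This is one possible reading of the behavior described above, not the project's code: the function names are hypothetical, and it assumes that trimming removes the outermost residues of a terminal gap first while internal gaps are always kept for completion.

```python
def classify_missing(missing_numbers, first_observed, last_observed):
    """Split missing residue numbers into N-terminal, internal, and C-terminal groups."""
    return {
        "n_terminal": sorted(n for n in missing_numbers if n < first_observed),
        "internal": sorted(n for n in missing_numbers if first_observed < n < last_observed),
        "c_terminal": sorted(n for n in missing_numbers if n > last_observed),
    }


def apply_trim(groups, trim_n=0, trim_c=0):
    """Return residue numbers still to be modeled after trimming the terminal edges."""
    n_term = groups["n_terminal"][trim_n:]
    keep_c = max(0, len(groups["c_terminal"]) - trim_c)
    c_term = groups["c_terminal"][:keep_c]
    # Internal gaps are never trimmed.
    return sorted(n_term + groups["internal"] + c_term)
```

For example, with missing residues 1-3, 50-51, and 200-201 around an observed range of 4-199, trimming two N-terminal and one C-terminal residue leaves residues 3, 50, 51, and 200 to be modeled.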
---
### 3. Structure Preparation
- **Remove**: Water, ions, and hydrogens (options are pre-configured).
- **Add capping**: ACE (N-terminal) and NME (C-terminal).
- **Chains**: Select which protein chains to keep for force field generation.
- **Ligands**:
  - **Preserve ligands** to keep them in the structure.
  - **Select ligands to preserve** (e.g. `GOL-A-1`, `LIZ-A`). Unselected ligands are dropped.
  - **Create separate ligand file** to export selected ligand(s) to a PDB.
  - **Protonate** ligand using Open Babel.
Click **Prepare Structure**. The status panel reports original vs prepared atom counts, removed components, added capping, and preserved ligands. Use **View Prepared Structure** and **Download Prepared PDB** as needed.
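The cleaning step can be pictured as a filter over PDB records. The sketch below is illustrative only: AmberMDFlow itself scripts PyMOL for hydrogen removal, whereas this hypothetical `clean_pdb` function works on plain text, uses a short and non-exhaustive ion list, and relies on the element column (77-78) being filled in.

```python
WATER_NAMES = {"HOH", "WAT"}
ION_NAMES = {"NA", "CL", "K", "MG", "CA", "ZN"}  # illustrative, not exhaustive


def clean_pdb(pdb_text, keep_ligands=()):
    """Drop water, ions, hydrogens, and any HETATM residue not listed in keep_ligands."""
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            kept.append(line)  # keep header and other non-coordinate lines
            continue
        res_name = line[17:20].strip()        # columns 18-20
        element = line[76:78].strip().upper()  # columns 77-78
        if res_name in WATER_NAMES or res_name in ION_NAMES:
            continue
        if element == "H":
            continue
        if record == "HETATM" and res_name not in keep_ligands:
            continue
        kept.append(line)
    return "\n".join(kept)
```

Calling `clean_pdb(text, keep_ligands=("LIG",))` keeps the protein heavy atoms and the `LIG` residue; with the default empty tuple, all ligands are dropped.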
**Ligand Docking** (nested in this tab):
- Select ligands to dock.
- Set the **search space** (center and size in X, Y, Z) with live 3D visualization.
- **Run Docking** (AutoDock Vina + Meeko). Progress and logs are shown in the docking panel.
- **Select poses** per ligand and **Use selected pose** to write the chosen pose into the structure for AMBER. You can switch modes (e.g. 1-9) and jump by clicking the mode labels.
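The search space set in the docking panel corresponds to the box parameters of an AutoDock Vina configuration file. A minimal sketch of writing such a file is shown here; the helper name `vina_config` and the file names are hypothetical, and the configuration files AmberMDFlow writes may contain different or additional options.

```python
def vina_config(receptor, ligand, center, size, exhaustiveness=8, num_modes=9):
    """Build the text of an AutoDock Vina config file for a box given by center and size (in Å)."""
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx:.3f}",
        f"center_y = {cy:.3f}",
        f"center_z = {cz:.3f}",
        f"size_x = {sx:.3f}",
        f"size_y = {sy:.3f}",
        f"size_z = {sz:.3f}",
        f"exhaustiveness = {exhaustiveness}",
        f"num_modes = {num_modes}",
    ]
    return "\n".join(lines) + "\n"


config_text = vina_config("receptor.pdbqt", "ligand.pdbqt", (1.0, 2.5, -3.0), (20, 20, 20))
print(config_text)
```

With `num_modes = 9`, Vina reports up to nine binding modes, which matches the mode range mentioned above.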
---
### 4. Simulation Parameters
- **Force field**: ff14SB or ff19SB.
- **Water model**: TIP3P or SPCE.
- **Box size** (Å): padding for solvation.
- **Add ions**: to neutralize (and optionally reach a salt concentration).
- **Temperature** and **Pressure** (e.g. 300 K, 1 bar).
- **Time step** and **Cutoff** for non-bonded interactions.
If ligands were preserved, **Ligand force field** (GAFF/GAFF2) is configured here; net charge is computed before `antechamber` runs.
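The parameters chosen in this tab end up in a `tleap` script. The sketch below shows how such a script could be assembled; it is not the script AmberMDFlow generates. The function name is hypothetical, the `leaprc` file and solvent box names are assumed to match a recent AmberTools release, and loading of ligand `mol2`/`frcmod` files is omitted for brevity.

```python
PROTEIN_FF = {"ff14SB": "leaprc.protein.ff14SB", "ff19SB": "leaprc.protein.ff19SB"}
WATER_MODELS = {
    "TIP3P": ("leaprc.water.tip3p", "TIP3PBOX"),
    "SPCE": ("leaprc.water.spce", "SPCBOX"),
}
LIGAND_FF = {"GAFF": "leaprc.gaff", "GAFF2": "leaprc.gaff2"}


def tleap_script(force_field, water_model, box_padding, ligand_ff=None):
    """Return tleap input that solvates, neutralizes, and saves prmtop/inpcrd."""
    water_leaprc, water_box = WATER_MODELS[water_model]
    lines = [f"source {PROTEIN_FF[force_field]}", f"source {water_leaprc}"]
    if ligand_ff is not None:
        lines.append(f"source {LIGAND_FF[ligand_ff]}")
    lines += [
        "mol = loadpdb tleap_ready.pdb",
        f"solvatebox mol {water_box} {box_padding:.1f}",
        "addions mol Na+ 0",  # a count of 0 asks tleap to neutralize the system
        "addions mol Cl- 0",
        "saveamberparm mol protein.prmtop protein.inpcrd",
        "quit",
    ]
    return "\n".join(lines) + "\n"


print(tleap_script("ff14SB", "TIP3P", 10.0, ligand_ff="GAFF2"))
```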
---
### 5. Simulation Steps
Enable/disable and set parameters for:
- **Restrained minimization** (steps, force constant)
- **Minimization** (steps, cutoff)
- **NVT heating** (steps, temperature)
- **NPT equilibration** (steps, temperature, pressure)
- **Production** (steps, temperature, pressure)
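Each step listed above maps to an AMBER input file with a `&cntrl` namelist. The following sketch renders one for restrained minimization; `render_mdin` is a hypothetical helper, and the values (step counts, force constant, backbone restraint mask) are placeholders rather than AmberMDFlow defaults.

```python
def _fmt(value):
    """Quote strings as Fortran namelist literals; leave numbers as they are."""
    if isinstance(value, str):
        return f"'{value}'"
    return str(value)


def render_mdin(title, **cntrl):
    """Render a title line followed by a &cntrl namelist."""
    body = ",\n".join(f"  {key}={_fmt(value)}" for key, value in cntrl.items())
    return f"{title}\n &cntrl\n{body}\n /\n"


min_restrained = render_mdin(
    "Restrained minimization",
    imin=1,                    # run minimization instead of MD
    maxcyc=5000,               # total minimization cycles
    ncyc=2500,                 # steepest-descent cycles before conjugate gradient
    ntb=1,                     # constant-volume periodic boundaries
    cut=10.0,                  # non-bonded cutoff in Å
    ntr=1,                     # enable positional restraints
    restraint_wt=10.0,         # force constant in kcal/mol/Å^2
    restraintmask="@CA,C,N",   # restrain backbone atoms
)
print(min_restrained)
```

The same helper can render the heating, equilibration, and production inputs by swapping in the MD-specific variables (for example `imin=0`, `nstlim`, `dt`, `temp0`).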
---
### 6. Generate Files
- **Generate All Files** to create AMBER inputs (`min_restrained.in`, `min.in`, `HeatNPT.in`, `mdin_equi.in`, `mdin_prod.in`), `tleap` scripts, `submit_job.pbs`, and (after `tleap`) `prmtop`/`inpcrd`.
- **Preview Files** to open and **edit** each file (e.g. `min.in`, `submit_job.pbs`) and **Save**; changes are written to the output directory.
- **Preview Solvated Protein** / **Download Solvated Protein** to inspect and download the solvated system.
For **PLUMED-based runs**, go to the **PLUMED** tab to configure CVs and `plumed.dat`, then use **Generate simulation files** there to produce inputs that include PLUMED.
---
### 7. PLUMED
- **Collective Variables**: search and select CVs from the PLUMED v2.9 set; view docs and add/edit lines in `plumed.dat`.
- **Custom PLUMED**: edit `plumed.dat` directly.
- **Generate simulation files**: create AMBER + PLUMED input files. Generated files can be **previewed, edited, and saved** as in the main **Generate Files** tab.
> PLUMED citation: [plumed.org/cite](https://www.plumed.org/cite).
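As an illustration of what a `plumed.dat` built from a selected collective variable may look like, the sketch below writes a distance CV biased with metadynamics. The helper name, labels, and numeric values are placeholders chosen for illustration; units follow PLUMED defaults unless a `UNITS` line says otherwise.

```python
def plumed_distance_metad(atom_a, atom_b, sigma=0.05, height=1.2, pace=500, stride=100):
    """Return plumed.dat text: one DISTANCE CV, a METAD bias on it, and a PRINT line."""
    return "\n".join([
        "# Distance between two atoms (1-based atom indices)",
        f"d1: DISTANCE ATOMS={atom_a},{atom_b}",
        f"metad: METAD ARG=d1 SIGMA={sigma} HEIGHT={height} PACE={pace} FILE=HILLS",
        f"PRINT ARG=d1,metad.bias STRIDE={stride} FILE=COLVAR",
    ]) + "\n"


print(plumed_distance_metad(10, 250))
```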
---
## Pipeline Overview
```
Protein Loading (upload/fetch)
        ↓
Fill Missing Residues (detect → ESMFold → optional trim & minimize)
        ↓
Structure Preparation (clean, cap, chains, ligands) → optional Docking (Vina, select pose)
        ↓
Simulation Parameters (FF, water, box, T, P, etc.)
        ↓
Simulation Steps (min, NVT, NPT, prod)
        ↓
Generate Files (AMBER .in, tleap, prmtop/inpcrd, PBS)
        ↓
[Optional] PLUMED (CVs, plumed.dat, generate PLUMED-enabled files)
```
---
## Output Layout
Generated files are written under `output/` (or the path set in the app), for example:
- `0_original_input.pdb`: raw input
- `1_protein_no_hydrogens.pdb`: cleaned, capped, chain/ligand selection applied
- `2_protein_with_caps.pdb`, `tleap_ready.pdb`: intermediates
- `4_ligands_corrected_*.pdb`: prepared ligands
- `protein.prmtop`, `protein.inpcrd`: after `tleap`
- `min_restrained.in`, `min.in`, `HeatNPT.in`, `mdin_equi.in`, `mdin_prod.in`, `submit_job.pbs`
- `output/docking/`: receptor, ligands, Vina configs, poses, logs
- `plumed.dat`: when using PLUMED
---
## Multi-user deployment (e.g. Hugging Face Spaces)
When multiple users use the app at the same time (e.g. on Hugging Face Spaces), each user gets an **isolated output folder** so one user's files are not overwritten by another's. The app assigns a session ID when the page loads; all API requests send this ID, and generated files are stored under `output/<session_id>/`. No configuration is required; this works automatically in multi-user and single-user setups.
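The per-session layout can be sketched with the standard library alone. This is an assumption-laden illustration rather than the backend's actual code: the function names and the ID format (a UUID hex string) are hypothetical. The point is that the session ID is validated before it becomes part of a path, so a crafted ID cannot escape the output directory.

```python
import re
import uuid
from pathlib import Path

_SESSION_ID = re.compile(r"[A-Za-z0-9_-]{8,64}")


def new_session_id():
    """Create a random session ID, e.g. when the page loads."""
    return uuid.uuid4().hex


def session_output_dir(base_dir, session_id):
    """Return (and create) the isolated folder base_dir/<session_id>/."""
    if not _SESSION_ID.fullmatch(session_id):
        raise ValueError("invalid session ID")
    path = Path(base_dir) / session_id
    path.mkdir(parents=True, exist_ok=True)
    return path
```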
---
## Dependencies
| Category | Tools / libraries |
|----------|-------------------|
| **Python** | Flask, Flask-CORS, BioPython, NumPy, Pandas, Matplotlib, Seaborn, MDAnalysis, Requests, RDKit, SciPy |
| **AMBER** | AMBER Tools (tleap, antechamber, sander, ambpdb, etc.) |
| **Docking** | Meeko (`mk_prepare_ligand`, `mk_prepare_receptor`), AutoDock Vina, Open Babel |
| **Visualization** | PyMOL (scripted for H removal, structure editing), NGL (in-browser 3D) |
| **Structure completion** | ESMFold (via API or local, depending on deployment) |
---
## Project Structure
```
AmberMDFlow/
├── start_web_server.py           # Entry point
├── html/
│   ├── index.html                # Main UI
│   └── plumed.html               # PLUMED-focused view (if used)
├── css/
│   ├── styles.css
│   └── plumed.css
├── js/
│   ├── script.js                 # Main frontend logic
│   ├── plumed.js                 # PLUMED + docking UI
│   └── plumed_cv_docs.js         # CV documentation
├── python/
│   ├── app.py                    # Flask backend, API, file generation
│   ├── structure_preparation.py
│   ├── add_caps.py               # ACE/NME capping
│   ├── Fill_missing_residues.py  # ESMFold, trimming, minimization
│   ├── docking.py                # Docking helpers
│   └── docking_utils.py
├── output/                       # Generated files (gitignored in dev)
├── Dockerfile
└── README.md
```
---
## Citation
If you use AmberMDFlow in your work, please cite:
```bibtex
@software{AmberMDFlow,
title = {AmberMDFlow: Molecular Dynamics and Docking Pipeline},
author = {Nagar, Hemant},
year = {2025},
url = {https://github.com/nagarh/AmberMDFlow}
}
```
**Related software to cite when used:**
- **AMBER**: [ambermd.org](https://ambermd.org)
- **PLUMED**: [plumed.org/cite](https://www.plumed.org/cite)
- **ESMFold / ESM Atlas**: [esmatlas.com/about](https://esmatlas.com/about)
- **AutoDock Vina**: [autodock-vina/cite](https://autodock-vina.readthedocs.io/en/latest/citations.html)
- **Meeko**: [github.com/forlilab/Meeko](https://github.com/forlilab/Meeko)
- **MDAnalysis**: [mdanalysis/cite](https://www.mdanalysis.org/pages/citations/)
- **NGL Viewer**: [nglviewer/cite](https://doi.org/10.1093/bioinformatics/bty419)
- **PyMOL**: [pymol/cite](https://www.pymol.org/support.html)
---
## Acknowledgments
- **Mohd Ibrahim** (Technical University of Munich) for the protein capping logic (`add_caps.py`).
---
## License
MIT License. See `LICENSE` for details.
---
## Contact
- **Author**: Hemant Nagar
- **Email**: hn533621@ohio.edu