| title: GSS DiffDock Engine | |
| emoji: 🧬 | |
| colorFrom: purple | |
| colorTo: pink | |
| sdk: gradio | |
| sdk_version: "4.36.1" | |
| python_version: "3.10" | |
| app_file: app.py | |
| pinned: false | |
| # DiffDock API Layer for Window 8 Drug Development | |
| ## Overview | |
| This directory contains a DiffDock molecular docking engine tuned to run on Hugging Face's **free CPU Basic tier** (2 vCPUs). It predicts protein-ligand binding poses and returns a confidence score for each prediction, for use in drug development analysis. | |
| ## Architecture | |
| - **Platform**: Hugging Face Spaces (Gradio SDK) | |
| - **Hardware**: CPU Basic (Free Tier - 2 vCPUs) | |
| - **Framework**: DiffDock neural network for molecular docking | |
| - **API**: RESTful endpoint for Cloudflare Worker integration | |
| ## Files | |
| ### 1. `packages.txt` | |
| System-level dependencies installed before Python setup: | |
| - `unzip` - Archive extraction | |
| - `wget` - File downloads | |
| - `libgl1-mesa-glx` - OpenGL support for molecular visualization | |
| ### 2. `requirements.txt` | |
| Python dependencies optimized for CPU execution: | |
| - **PyTorch 2.2.1** (CPU-only build) | |
| - **torch-geometric 2.5.2** - Graph neural networks | |
| - **biopython 1.83** - Biological computation | |
| - **rdkit 2023.9.5** - Chemical informatics | |
| - **gradio 4.19.2** - Web interface and API (the Space metadata declares `sdk_version: "4.36.1"`; the two versions should agree) | |
| - **pandas 2.2.1** - Data manipulation | |
| - **pyyaml 6.0.1** - Configuration parsing | |
| - **scipy 1.12.0** - Scientific computing | |
| - **networkx 3.2.1** - Graph algorithms | |
| ### 3. `app.py` | |
| Main application with three key components: | |
| #### CPU Optimization | |
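| ```python | |
| import os | |
| # Set the OpenMP/MKL limits before importing torch so the runtimes pick them up. | |
| os.environ["OMP_NUM_THREADS"] = "2" | |
| os.environ["MKL_NUM_THREADS"] = "2" | |
| import torch | |
| torch.set_num_threads(2)  # cap PyTorch's intra-op thread pool | |
| ``` | |
| Caps PyTorch, OpenMP, and MKL at two threads to match the 2 vCPUs of the free tier. | |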
| #### Automated Setup | |
| - Clones DiffDock repository | |
| - Downloads pre-trained weights from Zenodo | |
| - Configures inference pipeline | |
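The setup steps above could be sketched roughly as follows. This is a minimal sketch, not the code in `app.py`: the function names, the `weights.zip` file name, and the extraction target are assumptions, and the Zenodo link is left as a parameter because this README does not state it. `wget` and `unzip` are the tools listed in `packages.txt`. Pipeline configuration is not covered here.

```python
import subprocess
from pathlib import Path

DIFFDOCK_REPO = "https://github.com/gcorso/DiffDock"  # from the Support section


def build_setup_commands(workdir: str, weights_url: str) -> list:
    """Return the first-time setup commands, in the order they should run.

    weights_url is a parameter on purpose: the exact Zenodo link is not
    given in this README, so the caller has to supply it.
    """
    root = Path(workdir)
    archive = root / "weights.zip"  # hypothetical file name
    return [
        ["git", "clone", "--depth", "1", DIFFDOCK_REPO, str(root / "DiffDock")],
        ["wget", "-q", "-O", str(archive), weights_url],
        # Assumption: the archive unpacks into the cloned repository.
        ["unzip", "-q", "-o", str(archive), "-d", str(root / "DiffDock")],
    ]


def run_setup(workdir: str, weights_url: str) -> None:
    """Run the setup once; do nothing if the repository is already present."""
    if (Path(workdir) / "DiffDock").exists():
        return
    Path(workdir).mkdir(parents=True, exist_ok=True)
    for cmd in build_setup_commands(workdir, weights_url):
        subprocess.run(cmd, check=True)  # stop at the first failing step
```

Keeping the command list separate from the code that runs it makes the setup easy to inspect in the build logs before anything is downloaded.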
| #### API Endpoint | |
| - **Function**: `run_diffdock_inference(protein_pdb_content, ligand_smiles_string)` | |
| - **Input**: | |
|   - Protein structure (PDB format) | |
|   - Ligand molecule (SMILES string) | |
| - **Output**: JSON with confidence score | |
| - **API Name**: `execute_diffdock_prediction` | |
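One way the endpoint function might validate its inputs and shape its output is sketched below. The third parameter, `predict_confidence`, is a hypothetical stand-in for the real DiffDock call, and the `error` key is an assumption; only the function name, the two input arguments, and the three success fields come from this README.

```python
from typing import Callable


def run_diffdock_inference(
    protein_pdb_content: str,
    ligand_smiles_string: str,
    predict_confidence: Callable[[str, str], float],
) -> dict:
    """Validate inputs, run the model, and shape the JSON response.

    predict_confidence is a hypothetical stand-in for the real DiffDock call.
    """
    # Cheap sanity checks so bad requests fail before the slow model runs.
    has_atoms = any(
        line.startswith(("ATOM", "HETATM"))
        for line in protein_pdb_content.splitlines()
    )
    if not has_atoms:
        return {"success": False, "error": "protein input has no ATOM records"}
    if not ligand_smiles_string.strip():
        return {"success": False, "error": "ligand SMILES string is empty"}

    try:
        score = predict_confidence(protein_pdb_content, ligand_smiles_string)
    except Exception as exc:  # report the failure instead of crashing the app
        return {"success": False, "error": str(exc)}

    return {
        "success": True,
        "diffdock_confidence_score": float(score),
        "hardware_allocation": "HF_FREE_CPU_TIER",
    }
```

In the Space itself, a two-argument wrapper around a function like this would be what Gradio exposes under the API name `execute_diffdock_prediction`.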
| ## Deployment Steps | |
| ### 1. Create Hugging Face Space | |
| 1. Go to https://huggingface.co/spaces | |
| 2. Click **"Create a New Space"** | |
| 3. Name: `gss-diffdock-engine` (or your preferred name) | |
| 4. SDK: **Gradio** | |
| 5. Hardware: **CPU Basic** (Free) | |
| 6. Visibility: Public or Private | |
| ### 2. Upload Files | |
| Upload these three files to your Space repository: | |
| - `packages.txt` | |
| - `requirements.txt` | |
| - `app.py` | |
| ### 3. Wait for Build | |
| Hugging Face will: | |
| 1. Install system packages (1-2 minutes) | |
| 2. Install Python dependencies (3-5 minutes) | |
| 3. Clone DiffDock and download weights (5-10 minutes) | |
| 4. Start the application | |
| Total build time: **10-15 minutes** | |
| ### 4. Verify Deployment | |
| Once status shows **"Running"**: | |
| - The Space URL will be active | |
| - API endpoint will be available at: `https://YOUR-USERNAME-gss-diffdock-engine.hf.space/api/execute_diffdock_prediction` | |
| ## API Usage | |
| ### Request Format | |
| ```bash | |
| curl -X POST "https://YOUR-USERNAME-gss-diffdock-engine.hf.space/api/execute_diffdock_prediction" \ | |
|   -H "Content-Type: application/json" \ | |
|   -d '{ | |
|     "data": [ | |
|       "PROTEIN_PDB_CONTENT_HERE", | |
|       "LIGAND_SMILES_STRING_HERE" | |
|     ] | |
|   }' | |
| ``` | |
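The same request can be built from Python using only the standard library. This is a sketch under assumptions: the helper names, the optional `token` parameter (for a private Space), and the 300-second timeout are not part of this README; the path and payload shape mirror the curl example.

```python
import json
import urllib.request
from typing import Optional


def build_prediction_request(
    base_url: str, pdb_text: str, smiles: str, token: Optional[str] = None
) -> urllib.request.Request:
    """Build the POST request from the curl example without sending it."""
    headers = {"Content-Type": "application/json"}
    if token:  # assumption: a private Space needs a Hugging Face access token
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps({"data": [pdb_text, smiles]}).encode("utf-8")
    url = base_url.rstrip("/") + "/api/execute_diffdock_prediction"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send(request: urllib.request.Request, timeout_s: float = 300.0) -> dict:
    """Send the request and decode the JSON reply (the timeout is an assumption)."""
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return json.load(resp)
```

A caller would pass the Space URL, for example `build_prediction_request("https://YOUR-USERNAME-gss-diffdock-engine.hf.space", pdb_text, "CCO")`, and hand the result to `send`.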
| ### Response Format | |
| ```json | |
| { | |
|   "data": [{ | |
|     "success": true, | |
|     "diffdock_confidence_score": 0.85, | |
|     "hardware_allocation": "HF_FREE_CPU_TIER" | |
|   }] | |
| } | |
| ``` | |
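A client might unpack this response as sketched below. The function name and the error messages are hypothetical; the field names are the ones shown in the response above.

```python
import json


def extract_confidence(response_text: str) -> float:
    """Return diffdock_confidence_score from a response body, or raise ValueError."""
    payload = json.loads(response_text)
    results = payload.get("data") or []
    if not results or not isinstance(results[0], dict):
        raise ValueError("response has no result object in 'data'")
    result = results[0]
    if not result.get("success"):
        raise ValueError("prediction was not successful")
    return float(result["diffdock_confidence_score"])
```

Raising on `success: false` keeps failed predictions from being stored or displayed as if they were scores.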
| ## Performance Optimizations | |
| ### Memory Management | |
| - **Inference steps**: Limited to 10 (vs default 20) | |
| - **Samples per complex**: 1 (vs default 40) | |
| - **Cleanup**: Automatic removal of temporary files | |
| ### CPU Constraints | |
| - Thread count capped at 2 | |
| - Single pose generation | |
| - Aggressive memory cleanup | |
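The reduced settings and the temporary-file cleanup could be wired together roughly like this. It is a sketch, not the code in `app.py`: the helper names and the `runner` parameter are hypothetical, and the flag names follow the DiffDock inference script as I understand it. The ligand flag in particular has differed between DiffDock releases, so it is left as a parameter to check against the cloned version.

```python
import sys
import tempfile
from pathlib import Path


def build_inference_command(
    protein_path: str,
    smiles: str,
    out_dir: str,
    ligand_flag: str = "--ligand_description",  # assumption: version-dependent
) -> list:
    """Build the DiffDock command line with the reduced CPU-tier settings."""
    return [
        sys.executable, "-m", "inference",
        "--protein_path", protein_path,
        ligand_flag, smiles,
        "--out_dir", out_dir,
        "--inference_steps", "10",     # README: limited to 10 (vs default 20)
        "--samples_per_complex", "1",  # README: single pose
    ]


def run_in_scratch_dir(pdb_text: str, smiles: str, runner):
    """Write the protein to a temporary directory, run, then clean up.

    runner is a hypothetical callable that receives the command list, e.g.
    lambda cmd: subprocess.run(cmd, cwd="DiffDock", check=True).
    """
    with tempfile.TemporaryDirectory() as scratch:
        protein = Path(scratch) / "protein.pdb"
        protein.write_text(pdb_text)
        cmd = build_inference_command(
            str(protein), smiles, str(Path(scratch) / "out")
        )
        # Everything under scratch is deleted when this block exits,
        # even if the runner raises.
        return runner(cmd)
```

Because the scratch directory is removed on exit, the runner has to read any result it needs from the output directory before it returns.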
| ## Integration with Cloudflare Worker | |
| The next step is to create a Cloudflare Worker handler that: | |
| 1. Receives drug development requests from Window 8 | |
| 2. Formats protein/ligand data | |
| 3. Calls this Hugging Face API | |
| 4. Stores results in D1 database | |
| 5. Returns predictions to frontend | |
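Step 4 can be prototyped locally before the Worker exists, since D1 is SQLite-compatible. The sketch below uses Python's built-in `sqlite3` module as a stand-in; the table name, column names, and helper function are all hypothetical, not a schema this project has defined.

```python
import sqlite3

# Hypothetical schema for stored predictions; names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS docking_predictions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    ligand_smiles TEXT NOT NULL,
    confidence_score REAL,
    success INTEGER NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def store_prediction(conn: sqlite3.Connection, smiles: str, result: dict) -> int:
    """Insert one API result object and return the new row id."""
    cur = conn.execute(
        "INSERT INTO docking_predictions (ligand_smiles, confidence_score, success) "
        "VALUES (?, ?, ?)",
        (
            smiles,
            result.get("diffdock_confidence_score"),  # NULL when the run failed
            int(bool(result.get("success"))),
        ),
    )
    conn.commit()
    return cur.lastrowid
```

A quick local check is to open `sqlite3.connect(":memory:")`, run `conn.executescript(SCHEMA)`, and store one result object from the Response Format section.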
| ## Troubleshooting | |
| ### Build Failures | |
| - Check logs for missing dependencies | |
| - Verify file names are exact (case-sensitive) | |
| - Ensure no extra whitespace in files | |
| ### Timeout Errors | |
| - Inference is already limited to 10 steps to keep runtime down | |
| - Consider upgrading to a paid hardware tier for faster processing | |
| ### Memory Issues | |
| - The current configuration is tuned for the 16 GB RAM limit of the CPU Basic tier | |
| - Reduce inference steps if needed | |
| ## Next Steps | |
| 1. ✅ Deploy to Hugging Face Spaces | |
| 2. ⏳ Create Cloudflare Worker integration | |
| 3. ⏳ Add D1 database schema for drug predictions | |
| 4. ⏳ Build Window 8 frontend interface | |
| 5. ⏳ Implement result visualization | |
| ## Support | |
| For issues or questions: | |
| - Hugging Face Docs: https://huggingface.co/docs/hub/spaces | |
| - DiffDock Paper: https://arxiv.org/abs/2210.01776 | |
| - DiffDock Repo: https://github.com/gcorso/DiffDock | |
| --- | |
| **Gaston Software Solutions LLP** | |
| Window 8: Drug Development & Molecular Docking Engine |