---
title: GSS DiffDock Engine
emoji: 🧬
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: "4.36.1"
python_version: "3.10"
app_file: app.py
pinned: false
---

# DiffDock API Layer for Window 8 Drug Development

## Overview

This directory contains the optimized DiffDock molecular docking engine designed to run on Hugging Face's **free CPU Basic tier** (2 vCPUs). It docks a ligand against a protein structure and returns a confidence score for the predicted pose, for use in drug development analysis.

## Architecture

- **Platform**: Hugging Face Spaces (Gradio SDK)
- **Hardware**: CPU Basic (Free Tier - 2 vCPUs)
- **Framework**: DiffDock neural network for molecular docking
- **API**: RESTful endpoint for Cloudflare Worker integration

## Files

### 1. `packages.txt`

System-level dependencies installed before Python setup:

- `unzip` - Archive extraction
- `wget` - File downloads
- `libgl1-mesa-glx` - OpenGL support for molecular visualization

### 2. `requirements.txt`

Python dependencies optimized for CPU execution:

- **PyTorch 2.2.1** (CPU-only build)
- **torch-geometric 2.5.2** - Graph neural networks
- **biopython 1.83** - Biological computation
- **rdkit 2023.9.5** - Chemical informatics
- **gradio 4.19.2** - Web interface and API
- **pandas 2.2.1** - Data manipulation
- **pyyaml 6.0.1** - Configuration parsing
- **scipy 1.12.0** - Scientific computing
- **networkx 3.2.1** - Graph algorithms

### 3. `app.py`

Main application with three key components:

#### CPU Optimization

```python
torch.set_num_threads(2)
os.environ["OMP_NUM_THREADS"] = "2"
os.environ["MKL_NUM_THREADS"] = "2"
```

Limits thread usage to match the free tier's 2 vCPU allocation.
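The thread caps above can be wrapped in a small helper. The following is a minimal sketch, not the code in `app.py`: the function name `configure_cpu_threads` is hypothetical, and it assumes the environment variables are best set before `torch` is imported so the OpenMP/MKL runtimes see them when they initialize. The `torch` import is guarded so the sketch also runs where PyTorch is not installed.

```python
import os


def configure_cpu_threads(max_threads: int = 2) -> int:
    """Cap OpenMP/MKL thread counts and return the value that was applied."""
    # os.cpu_count() can return None on some platforms; fall back to 1.
    available = os.cpu_count() or 1
    threads = max(1, min(max_threads, available))
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = str(threads)
    return threads


# Apply the cap first, then import torch (assumed ordering, see lead-in).
threads = configure_cpu_threads(2)

try:
    import torch
except ImportError:
    torch = None  # PyTorch not installed; only the environment is configured

if torch is not None:
    # Also cap PyTorch's own intra-op thread pool.
    torch.set_num_threads(threads)
```

On a machine with fewer than two cores the helper falls back to the available count, so the same code can be exercised locally before deploying.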

#### Automated Setup

- Clones DiffDock repository
- Downloads pre-trained weights from Zenodo
- Configures inference pipeline

#### API Endpoint

- **Function**: `run_diffdock_inference(protein_pdb_content, ligand_smiles_string)`
- **Input**:
  - Protein structure (PDB format)
  - Ligand molecule (SMILES string)
- **Output**: JSON with confidence score
- **API Name**: `execute_diffdock_prediction`

## Deployment Steps

### 1. Create Hugging Face Space

1. Go to https://huggingface.co/spaces
2. Click **"Create a New Space"**
3. Name: `gss-diffdock-engine` (or your preferred name)
4. SDK: **Gradio**
5. Hardware: **CPU Basic** (Free)
6. Visibility: Public or Private

### 2. Upload Files

Upload these three files to your Space repository:

- `packages.txt`
- `requirements.txt`
- `app.py`

### 3. Wait for Build

Hugging Face will:

1. Install system packages (1-2 minutes)
2. Install Python dependencies (3-5 minutes)
3. Clone DiffDock and download weights (5-10 minutes)
4. Start the application

Total build time: **10-15 minutes**

### 4. Verify Deployment

Once status shows **"Running"**:

- The Space URL will be active
- API endpoint will be available at: `https://YOUR-USERNAME-gss-diffdock-engine.hf.space/api/execute_diffdock_prediction`

## API Usage

### Request Format

```bash
curl -X POST "https://YOUR-USERNAME-gss-diffdock-engine.hf.space/api/execute_diffdock_prediction" \
  -H "Content-Type: application/json" \
  -d '{
    "data": [
      "PROTEIN_PDB_CONTENT_HERE",
      "LIGAND_SMILES_STRING_HERE"
    ]
  }'
```

### Response Format

```json
{
  "data": [{
    "success": true,
    "diffdock_confidence_score": 0.85,
    "hardware_allocation": "HF_FREE_CPU_TIER"
  }]
}
```

## Performance Optimizations

### Memory Management

- **Inference steps**: Limited to 10 (vs default 20)
- **Samples per complex**: 1 (vs default 40)
- **Cleanup**: Automatic removal of temporary files

### CPU Constraints

- Thread count capped at 2
- Single pose generation
- Aggressive memory cleanup

## Integration with Cloudflare Worker

The next step is to create a Cloudflare Worker handler that:

1. Receives drug development requests from Window 8
2. Formats protein/ligand data
3. Calls this Hugging Face API
4. Stores results in D1 database
5. Returns predictions to frontend

## Troubleshooting

### Build Failures

- Check logs for missing dependencies
- Verify file names are exact (case-sensitive)
- Ensure no extra whitespace in files

### Timeout Errors

- Inference is limited to 10 steps for speed
- Consider upgrading to paid tier for faster processing

### Memory Issues

- Current config optimized for 16GB RAM limit
- Reduce inference steps if needed

## Next Steps

1. ✅ Deploy to Hugging Face Spaces
2. ⏳ Create Cloudflare Worker integration
3. ⏳ Add D1 database schema for drug predictions
4. ⏳ Build Window 8 frontend interface
5. ⏳ Implement result visualization

## Support

For issues or questions:

- Hugging Face Docs: https://huggingface.co/docs/hub/spaces
- DiffDock Paper: https://arxiv.org/abs/2210.01776
- DiffDock Repo: https://github.com/gcorso/DiffDock

---

**Gaston Software Solutions LLP**
Window 8: Drug Development & Molecular Docking Engine
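
## Appendix: Python Client Sketch

The request and response shapes documented under API Usage can also be exercised from Python. This is a minimal sketch using only the standard library, not an official client: the helper names (`build_request`, `parse_response`, `predict`) and the 600-second timeout are illustrative assumptions, and treating a falsy `success` field as an error is a guess about how failures are reported. `YOUR-USERNAME` is the same placeholder used in the curl example.

```python
import json
import urllib.request

# Placeholder URL copied from the curl example; replace YOUR-USERNAME.
API_URL = (
    "https://YOUR-USERNAME-gss-diffdock-engine.hf.space"
    "/api/execute_diffdock_prediction"
)


def build_request(protein_pdb: str, ligand_smiles: str,
                  url: str = API_URL) -> urllib.request.Request:
    """Build the POST request with the {"data": [pdb, smiles]} body."""
    body = json.dumps({"data": [protein_pdb, ligand_smiles]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_response(raw: bytes) -> dict:
    """Return the result object, i.e. the first element of "data"."""
    result = json.loads(raw)["data"][0]
    # Assumption: failures are signalled by a falsy "success" field.
    if not result.get("success"):
        raise RuntimeError(f"DiffDock inference failed: {result}")
    return result


def predict(protein_pdb: str, ligand_smiles: str,
            timeout: float = 600.0) -> dict:
    """Send one docking request and return the parsed result."""
    request = build_request(protein_pdb, ligand_smiles)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_response(response.read())
```

A call would look like `predict(open("protein.pdb").read(), "CCO")["diffdock_confidence_score"]`, where `protein.pdb` is a hypothetical local file. The long timeout reflects that CPU-only inference may be slow; adjust it to whatever the Space actually needs.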