---
title: MCP Modal Protein Folding
emoji: 🧬
colorFrom: gray
colorTo: green
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: MCP server to simulate protein folding on Modal cluster
tags:
  - mcp-server-track
  - Modal
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Stakes

The industry is undergoing a profound transformation due to the development of Large Language Models (LLMs) and the recent advancements that enable them to access external tools. For years, companies have leveraged simulation tools to accelerate product development and reduce its cost. One of the primary challenges in the coming years will be to create agents capable of setting up, running, and processing simulations to further expedite innovation. Engineers will then be able to focus on analysis rather than simulation setup, and concentrate on the most critical aspects of their work.

# Objective

This project represents a first step towards developing AI agents that can perform simulations using existing engineering software. Key domains of application include:

- **CFD** (Computational Fluid Dynamics) simulations
- **Biology** (Protein Folding, Molecular Dynamics, etc.)
- **Neural network applications**

While this project focuses on biomolecule folding, the principles employed can be extended to other domains. Specifically, it uses [Chai-1](https://www.chaidiscovery.com/blog/introducing-chai-1), a multi-modal foundation model for molecular structure prediction that achieves state-of-the-art performance across various benchmarks. Chai-1 enables unified prediction of proteins, small molecules, DNA, RNA, glycosylations, and more.

Industrial computations frequently require substantial resources (large numbers of CPUs and GPUs) and are therefore performed on High-Performance Computing (HPC) clusters.
To this end, this project uses [Modal Labs](https://modal.com/), a serverless platform that offers a straightforward way to run any application on the latest CPU and GPU hardware.

MCP servers are an efficient solution to connect LLMs to real-world engineering applications by providing access to a set of tools. The purpose of this project is to enable users to run biomolecule folding simulations using the Chai-1 model through any LLM chat or with a Gradio interface.

# Benefits

1. **Efficiency**: The MCP server's connection to high-performance computing resources ensures that simulations run quickly and efficiently.
2. **Ease of Use**: Only the necessary parameters are exposed to the user, which simplifies the process of setting up and running complex simulations.
3. **Integration**: The seamless integration between the LLM's chat interface and the MCP server allows for a streamlined workflow, from simulation setup to results analysis.

The following video illustrates a practical use of the MCP server to run a biomolecule folding simulation using the Chai-1 model. In this scenario, Copilot is used in Agent mode with Claude 3.5 Sonnet to leverage the tools provided by the MCP server.

# MCP tools

1. `create_fasta_file`: Create a FASTA file from a biomolecule sequence string with a unique name.
2. `create_json_config`: Create a JSON configuration file from the Gradio interface inputs.
3. `compute_Chai1`: Run a Chai-1 simulation on a Modal Labs server. Returns a DataFrame with the predicted scores: aggregated, pTM and ipTM.
4. `plot_protein`: Plot the 3D structure of a biomolecule using the DataFrame from `compute_Chai1` (used by the Gradio interface).
5. `show_cif_file`: Plot a 3D structure from a CIF file with the Molecule3D library (used by the Gradio interface).

# Result example

The following image shows an example of a protein folding simulation using the Chai-1 model. The simulation was run with the default configuration, and the image is a 3D view from the Gradio interface.
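
To make the first entry of the MCP tools list more concrete, here is a minimal sketch of what a tool like `create_fasta_file` might do. It is not the project's actual implementation: the signature, the `kind` and `output_dir` parameters, the `.fasta` extension, the header layout, and the UUID-based naming are all assumptions made for illustration.

```python
import uuid
from pathlib import Path


def create_fasta_file(sequence: str, name: str = "biomolecule",
                      kind: str = "protein", output_dir: str = ".") -> Path:
    """Write a single-record FASTA file and return its path (illustrative sketch)."""
    # Remove spaces and line breaks that often come with pasted sequences.
    cleaned = "".join(sequence.split())
    if not cleaned:
        raise ValueError("sequence must not be empty")

    # Assumption: a short random suffix keeps file names unique between calls.
    unique_name = f"{name}_{uuid.uuid4().hex[:8]}"
    path = Path(output_dir) / f"{unique_name}.fasta"

    # Assumption: header layout ">kind|name=..."; adapt it to what the model expects.
    path.write_text(f">{kind}|name={unique_name}\n{cleaned}\n")
    return path
```

Calling `create_fasta_file("MKTAYIAKQR", name="demo")` would write a file such as `demo_<8 hex characters>.fasta` in the current directory. Returning the path lets a later step, for example the simulation tool, pick the file up without guessing its name.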

# What's next?

1. Expose additional tools to post-process the results of the simulations. The current post-processing tools are suited for the Gradio interface (e.g., plotting images of the molecule structure from a file).
2. Continue the pipeline by adding software like [OpenMM](https://openmm.org/) or [Gromacs](https://www.gromacs.org/) for molecular dynamics simulations.
3. Run complete simulation plans, including loops over parameters, fully automated by the LLM.

# Contact

For any issues or questions, please contact the developer or refer to the documentation.

# Environment creation with uv

Run the following in a bash shell:

```bash
uv venv
source .venv/bin/activate
uv pip install gradio[mcp] modal gemmi gradio_molecule3d
```

# Connect to Modal

Create an account on the [Modal website](https://modal.com) and run the following in your local terminal:

```bash
python -m modal setup
```

# Run the app

Run in a bash shell:

```bash
gradio app.py
```

# Gradio interface instructions