PhDFlo committed on
Commit
fd2a5f9
·
1 Parent(s): 35ace93

Text description modification

Files changed (2)
  1. app.py +24 -7
  2. introduction_page.md +5 -5
app.py CHANGED
@@ -286,7 +286,7 @@ with gr.Blocks(theme=theme) as demo:
286
  gr.Markdown(
287
  """
288
  # Protein Folding Simulation Interface
289
- This interface provides the tools to fold any FASTA chain based on Chai-1 model. Also, this is a MCP server to provide all the tools to automate the process of folding proteins with LLMs.
290
  """)
291
 
292
  with gr.Tab("Introduction 🔭"):
@@ -297,18 +297,35 @@ with gr.Blocks(theme=theme) as demo:
297
  """
298
  # Stakes
299
 
300
- The industry is being deeply changed by the development of LLMs and the recent possibilities to provide them access to external tools. For years, companies have used simulation tools to accelerate and reduce the cost of product development. One of the main challenges in the coming years will be to create agents that can set up, run, and process simulations to further accelerate innovation.
301
 
302
  # Objective
303
 
304
- This project is a first step in creating AI agents that perform simulations on existing software. Key domains include:
 
305
  - **CFD** (Computational Fluid Dynamics) simulations
306
  - **Biology** (Protein Folding, Molecular Dynamics, etc.)
307
  - **Neural network applications**
308
 
309
- This project focuses on protein folding, but the same principles can be applied to other domains. In particular it uses [Chai-1](https://www.chaidiscovery.com/blog/introducing-chai-1), which is a multi-modal foundation model for molecular structure prediction, performing at state-of-the-art levels across a variety of benchmarks. Chai-1 enables unified prediction of proteins, small molecules, DNA, RNA, glycosylations, and more. Using Chai-1 on Modal is a great example of running folding simulations.
310
 
311
- Industrial computations are often performed on HPC clusters with large resources, so simulations typically run on separate servers. The LLM must be able to access simulation results to provide complete answers to users. To this purpose, [Modal](https://modal.com/), a serverless platform that provides a simple way to run any application with the latest CPU and GPU hardware will be used.
312
 
313
  """
314
  )
@@ -384,7 +401,7 @@ with gr.Blocks(theme=theme) as demo:
384
  with gr.Row():
385
  with gr.Column(scale=1):
386
  inp2 = gr.FileExplorer(root_dir=here / "inputs/config",
387
- value="chai1_quick_inference.json",
388
  label="Configuration file",
389
  file_count='single')
390
 
@@ -422,7 +439,7 @@ with gr.Blocks(theme=theme) as demo:
422
  )
423
 
424
 
425
- with gr.Tab("Show molecule from a CIF file 💻"):
426
 
427
  gr.Markdown(
428
  """
 
286
  gr.Markdown(
287
  """
288
  # Protein Folding Simulation Interface
289
+ This interface provides the tools to fold FASTA chains based on Chai-1 model. Also, this is a MCP server to provide all the tools to automate the process of folding proteins with LLMs.
290
  """)
291
 
292
  with gr.Tab("Introduction 🔭"):
 
297
  """
298
  # Stakes
299
 
300
+ The industry is undergoing a profound transformation due to the development of Large Language Models (LLMs) and the recent advancements that enable them to access external tools.
301
+ For years, companies have leveraged simulation tools to accelerate and reduce the costs of product development.
302
+ One of the primary challenges in the coming years will be to create agents capable of setting up, running, and processing simulations to further expedite innovation.
303
 
304
  # Objective
305
 
306
+ This project represents an initial step towards developing AI agents that can perform simulations using existing engineer softwares. It enables engineers to focus on analysis rather than setup.
307
+ Key domains of application include:
308
  - **CFD** (Computational Fluid Dynamics) simulations
309
  - **Biology** (Protein Folding, Molecular Dynamics, etc.)
310
  - **Neural network applications**
311
 
312
+ While this project focuses on protein folding, the principles employed can be extended to other domains.
313
+ Specifically, it utilizes [Chai-1](https://www.chaidiscovery.com/blog/introducing-chai-1), a multi-modal foundation model for molecular structure prediction that achieves state-of-the-art performance across various benchmarks.
314
+ Chai-1 enables unified prediction of proteins, small molecules, DNA, RNA, glycosylations, and more.
315
 
316
+ Industrial computations are frequently performed on High-Performance Computing (HPC) clusters with substantial resources, necessitating that simulations typically run on separate servers.
317
+ To provide comprehensive answers to users, the LLM must be able to access simulation results. To this end, [Modal](https://modal.com/), a serverless platform that offers a straightforward method to run any application with the latest CPU and GPU hardware, will be used.
318
+
319
+ # Benefits
320
+
321
+ 1. **Efficiency**: The MCP server's connected to high-performance computing capabilities ensure that simulations are run quickly and efficiently.
322
+
323
+ 2. **Ease of Use**: Only provide necessary parameters to the user to simplify the process of setting up and running complex simulations.
324
+
325
+ 3. **Integration**: The seamless integration between the LLM's chat interface and the MCP server allows for a streamlined workflow, from simulation setup to results analysis.
326
+
327
+ The following video illustrates a practical use of the MCP server to run a protein folding simulation using the Chai-1 model.
328
+ In this scenario, Copilot is used in Agent mode with Claude 3.5 Sonnet to leverage the tools provided by the MCP server.
329
 
330
  """
331
  )
 
401
  with gr.Row():
402
  with gr.Column(scale=1):
403
  inp2 = gr.FileExplorer(root_dir=here / "inputs/config",
404
+ value="chai1_default_inference.json",
405
  label="Configuration file",
406
  file_count='single')
407
 
 
439
  )
440
 
441
 
442
+ with gr.Tab("Plot CIF file 💻"):
443
 
444
  gr.Markdown(
445
  """
introduction_page.md CHANGED
@@ -6,14 +6,14 @@ code[class*="language-bash"], pre[class*="language-bash"] {
6
 
7
  ---
8
 
9
- # Instructions
10
 
11
  <div style="background-color:#f5f5f5; border-radius:8px; padding:18px 24px; margin-bottom:24px; border:1px solid #cccccc;">
12
 
13
  ### 1. <span style="color:#e98935;">Create your JSON configuration file (Optional)</span>
14
  <small>Default configuration is available if you skip this step.</small>
15
 
16
- - In the `Configuration 📦` window, set your simulation parameters and generate the JSON config file. You can provide a file name in the dedicated box that will appear in the list of available configuration files. If you don't, a unique identifier will be assigned (e.g., `chai_{run_id}_config.json`).
17
  - **Parameters:**
18
  - <b>Number of diffusion time steps:</b> 1 to 500
19
  - <b>Number of trunk recycles:</b> 1 to 5
@@ -24,7 +24,7 @@ code[class*="language-bash"], pre[class*="language-bash"] {
24
  ### 2. <span style="color:#e98935;">Upload a FASTA file with your molecule sequence (Optional)</span>
25
  <small>Default FASTA files are available if you skip this step.</small>
26
 
27
- - In the `Configuration 📦` window, write your FASTA content and create the file. You can provide a file name in the dedicated box that will appear in the list of available configuration files. If you don't provide a file name a unique identifier will be assigned (e.g., `chai_{run_id}_input.fasta`). Also, if you don't provide a fasta content a default sequence will be written in the file.
28
  - <b style="color:#b91c1c;">Warning:</b> The header must be well formatted for Chai1 to process it.
29
 
30
  **FASTA template:**
@@ -77,7 +77,7 @@ In the `Run folding simulation 🚀` window, refresh the file list by clicking o
77
 
78
  ### 4. <span style="color:#e98935;">Run the simulation</span>
79
 
80
- Press the `Run Simulation` button to start de folding Simulation. Five protein folding simulations will be performed. Unfortunately, this parameter is hard coded in Chai-1. The simulation time is expected to be from 2min to 10min depending on the molecule.
81
 
82
  ### 5. <span style="color:#e98935;">Analyse the results of your simulation</span>
83
 
@@ -85,6 +85,6 @@ To analyse the results of the simulation, two outputs are provided:
85
  - A table showing the score of the 5 folding performed
86
  - Interactive 3D visualization of the molecule
87
 
88
- Finally, you can get to the `Show molecule from a CIF file 💻` window to watch the cif files. This is mainly used to visualize CIF files after using this tool as an MCP server.
89
 
90
  </div>
 
6
 
7
  ---
8
 
9
+ # Gradio interface instructions
10
 
11
  <div style="background-color:#f5f5f5; border-radius:8px; padding:18px 24px; margin-bottom:24px; border:1px solid #cccccc;">
12
 
13
  ### 1. <span style="color:#e98935;">Create your JSON configuration file (Optional)</span>
14
  <small>Default configuration is available if you skip this step.</small>
15
 
16
+ - In the `Configuration 📦` window, set your simulation parameters and generate the JSON config file. You can provide a file name in the dedicated box that will appear in the list of available configuration files. If you don't, a unique identifier will be assigned (e.g., `chai_{unique_id}_config.json`).
17
  - **Parameters:**
18
  - <b>Number of diffusion time steps:</b> 1 to 500
19
  - <b>Number of trunk recycles:</b> 1 to 5
 
24
  ### 2. <span style="color:#e98935;">Upload a FASTA file with your molecule sequence (Optional)</span>
25
  <small>Default FASTA files are available if you skip this step.</small>
26
 
27
+ - In the `Configuration 📦` window, write your FASTA content and create the file. You can provide a file name in the dedicated box that will appear in the list of available configuration files. If you don't provide a file name a unique identifier will be assigned (e.g., `chai_{unique_id}_input.fasta`). Also, if you don't provide a fasta content a default sequence will be written in the file.
28
  - <b style="color:#b91c1c;">Warning:</b> The header must be well formatted for Chai1 to process it.
29
 
30
  **FASTA template:**
 
77
 
78
  ### 4. <span style="color:#e98935;">Run the simulation</span>
79
 
80
+ Press the `Run Simulation` button to start de folding Simulation. Five proteins folding simulations will be performed. This parameter is hard coded in Chai-1. The simulation time is expected to be from 2min to 10min depending on the molecule.
81
 
82
  ### 5. <span style="color:#e98935;">Analyse the results of your simulation</span>
83
 
 
85
  - A table showing the score of the 5 folding performed
86
  - Interactive 3D visualization of the molecule
87
 
88
+ Finally, you can get to the `Plot CIF file 💻` window to watch the cif files. This is mainly used to visualize CIF files after using this tool as an MCP server.
89
 
90
  </div>
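
For readers following the configuration step in `introduction_page.md`, here is a minimal sketch of how such a JSON config file could be generated. It is not taken from `app.py`: the key names `num_diffn_timesteps` and `num_trunk_recycles`, the helper names, and the 8-character identifier are assumptions made for illustration. Only the value ranges (1 to 500, 1 to 5) and the `chai_{unique_id}_config.json` fallback name come from the instructions above.

```python
import json
import uuid
from pathlib import Path

# Ranges quoted in the instructions page. The JSON key names are assumed,
# not confirmed by the diff.
LIMITS = {
    "num_diffn_timesteps": (1, 500),
    "num_trunk_recycles": (1, 5),
}


def build_config(num_diffn_timesteps, num_trunk_recycles):
    """Return a config dict after checking each value against its range."""
    values = {
        "num_diffn_timesteps": num_diffn_timesteps,
        "num_trunk_recycles": num_trunk_recycles,
    }
    for key, value in values.items():
        low, high = LIMITS[key]
        if not low <= value <= high:
            raise ValueError(f"{key} must be between {low} and {high}, got {value}")
    return values


def config_filename(name=None):
    """Use the given name, or fall back to chai_{unique_id}_config.json."""
    if name:
        return name if name.endswith(".json") else f"{name}.json"
    unique_id = uuid.uuid4().hex[:8]  # hypothetical identifier format
    return f"chai_{unique_id}_config.json"


def write_config(directory, config, name=None):
    """Write the config as JSON into the directory and return the file path."""
    path = Path(directory) / config_filename(name)
    path.write_text(json.dumps(config, indent=2))
    return path
```

A call such as `write_config("inputs/config", build_config(200, 3))` would then produce a file that the `gr.FileExplorer` rooted at `inputs/config` could list, assuming that directory exists.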