Update README.md
README.md
CHANGED
|
@@ -27,7 +27,7 @@ This model strikes a balance between reconstruction fidelity and structural loca
|
|
| 27 |
|
| 28 |
- Encoding Structures into Tokens (See [genbio-ai/AIDO.StructureEncoder](https://huggingface.co/genbio-ai/AIDO.StructureEncoder))
|
| 29 |
- Decoding Tokens into Structures (See [genbio-ai/AIDO.StructureDecoder](https://huggingface.co/genbio-ai/AIDO.StructureDecoder))
|
| 30 |
-
- Reconstructing Structures (See [below](#
|
| 31 |
- Structure Prediction (See [this section](https://huggingface.co/genbio-ai/AIDO.Protein2StructureToken-16B/blob/main/README.md#structure-prediction) in genbio-ai/AIDO.Protein2StructureToken-16B)
|
| 32 |
|
| 33 |
## Results
|
|
@@ -42,12 +42,10 @@ This model strikes a balance between reconstruction fidelity and structural loca
|
|
| 42 |
## How to Use
|
| 43 |
Please see `experiments/AIDO.StructureTokenizer` in [Model Generator](https://github.com/genbio-ai/modelgenerator) for more details.
|
| 44 |
|
| 45 |
-
###
|
| 46 |
-
|
| 47 |
-
#### Setup
|
| 48 |
Install [Model Generator](https://github.com/genbio-ai/modelgenerator)
|
| 49 |
|
| 50 |
-
|
| 51 |
|
| 52 |
To reproduce the reconstruction results in the paper, we provide a preprocessed CASP15 dataset at [genbio-ai/sample-structure-dataset](https://huggingface.co/datasets/genbio-ai/sample-structure-dataset). It can be downloaded via:
|
| 53 |
```bash
|
|
@@ -75,7 +73,7 @@ python experiments/AIDO.StructureTokenizer/register_dataset.py \
|
|
| 75 |
|
| 76 |
Replace `folder_path` and `registry_path` in the following steps accordingly.
|
| 77 |
|
| 78 |
-
|
| 79 |
|
| 80 |
If you use the provided CASP15 dataset, you can run the combined encoding and decoding task using the following command:
|
| 81 |
```bash
|
|
@@ -107,7 +105,7 @@ The input and the output can be summarized as follows:
|
|
| 107 |
- Currently, this function only supports single-GPU inference due to the file saving mechanism. We plan to support multi-GPU inference in the future.
|
| 108 |
- The reconstructed structures are aligned to the original structures using the Kabsch algorithm. This makes it easier to visualize and compare the structures.
|
| 109 |
|
| 110 |
-
|
| 111 |
|
| 112 |
We use VS Code + [Protein Viewer Extension](https://marketplace.visualstudio.com/items?itemName=ArianJamasb.protein-viewer) to visualize the protein structures. It's a beginner-friendly tool for VS Code users. You can also use another protein structure viewer (e.g., PyMOL or ChimeraX), but here we focus on this extension.
|
| 113 |
|
|
|
|
| 27 |
|
| 28 |
- Encoding Structures into Tokens (See [genbio-ai/AIDO.StructureEncoder](https://huggingface.co/genbio-ai/AIDO.StructureEncoder))
|
| 29 |
- Decoding Tokens into Structures (See [genbio-ai/AIDO.StructureDecoder](https://huggingface.co/genbio-ai/AIDO.StructureDecoder))
|
| 30 |
+
- Reconstructing Structures (See [below](#how-to-use))
|
| 31 |
- Structure Prediction (See [this section](https://huggingface.co/genbio-ai/AIDO.Protein2StructureToken-16B/blob/main/README.md#structure-prediction) in genbio-ai/AIDO.Protein2StructureToken-16B)
|
| 32 |
|
| 33 |
## Results
|
|
|
|
| 42 |
## How to Use
|
| 43 |
Please see `experiments/AIDO.StructureTokenizer` in [Model Generator](https://github.com/genbio-ai/modelgenerator) for more details.
|
| 44 |
|
| 45 |
+
### Setup
|
|
|
|
|
|
|
| 46 |
Install [Model Generator](https://github.com/genbio-ai/modelgenerator)
|
| 47 |
|
| 48 |
+
### Data Preparation
|
| 49 |
|
| 50 |
To reproduce the reconstruction results in the paper, we provide a preprocessed CASP15 dataset at [genbio-ai/sample-structure-dataset](https://huggingface.co/datasets/genbio-ai/sample-structure-dataset). It can be downloaded via:
|
| 51 |
```bash
|
|
|
|
| 73 |
|
| 74 |
Replace `folder_path` and `registry_path` in the following steps accordingly.
|
| 75 |
|
| 76 |
+
### Running Encoding and Decoding Task
|
| 77 |
|
| 78 |
If you use the provided CASP15 dataset, you can run the combined encoding and decoding task using the following command:
|
| 79 |
```bash
|
|
|
|
| 105 |
- Currently, this function only supports single-GPU inference due to the file saving mechanism. We plan to support multi-GPU inference in the future.
|
| 106 |
- The reconstructed structures are aligned to the original structures using the Kabsch algorithm. This makes it easier to visualize and compare the structures.
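
The Kabsch alignment mentioned in the last note can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the code Model Generator uses: the function names `kabsch_align` and `rmsd` are hypothetical, and both structures are assumed to be given as matched N x 3 coordinate arrays (for example, C-alpha atoms in the same residue order).

```python
import numpy as np


def kabsch_align(mobile: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Rigidly superimpose `mobile` onto `target` (both N x 3); return the moved copy."""
    # Center both point sets on their centroids.
    target_centroid = target.mean(axis=0)
    p = mobile - mobile.mean(axis=0)
    q = target - target_centroid

    # Cross-covariance matrix and its singular value decomposition.
    h = p.T @ q
    u, _, vt = np.linalg.svd(h)

    # Flip the last axis if needed so the result is a proper rotation (det = +1),
    # not a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate the centered coordinates, then move them onto the target centroid.
    return p @ rotation.T + target_centroid


def rmsd(a: np.ndarray, b: np.ndarray) -> float:
    """Root-mean-square deviation between two matched N x 3 coordinate arrays."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))


if __name__ == "__main__":
    # Hypothetical toy data: a random point cloud and a rotated, shifted copy of it.
    rng = np.random.default_rng(0)
    original = rng.normal(size=(10, 3))
    angle = 0.7
    rot_z = np.array([
        [np.cos(angle), -np.sin(angle), 0.0],
        [np.sin(angle), np.cos(angle), 0.0],
        [0.0, 0.0, 1.0],
    ])
    reconstructed = original @ rot_z.T + np.array([1.0, -2.0, 3.0])
    aligned = kabsch_align(reconstructed, original)
    print("RMSD before:", rmsd(reconstructed, original))
    print("RMSD after: ", rmsd(aligned, original))
```

Because the alignment removes only rigid-body rotation and translation, any RMSD that remains after it reflects actual differences between the reconstructed and original coordinates.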
|
| 107 |
|
| 108 |
+
### Visualizing the Reconstructed Structures
|
| 109 |
|
| 110 |
We use VS Code + [Protein Viewer Extension](https://marketplace.visualstudio.com/items?itemName=ArianJamasb.protein-viewer) to visualize the protein structures. It's a beginner-friendly tool for VS Code users. You can also use another protein structure viewer (e.g., PyMOL or ChimeraX), but here we focus on this extension.
|
| 111 |
|