Update README.md
README.md
CHANGED
|
@@ -8,6 +8,27 @@ license: apache-2.0
|
|
| 8 |
[Github](https://github.com/Lyu6PosHao/ProLLaMA) for more information
|
| 9 |
|
| 10 |
ProLLaMA is based on Llama-2-7b, so please follow the license of Llama2.
|
| 11 |
# Quick usage:
|
| 12 |
```bash
|
| 13 |
# you can replace the model_path with your local path
|
|
@@ -109,22 +130,12 @@ if __name__ == '__main__':
|
|
| 109 |
print("All the outputs have been saved in",args.output_file)
|
| 110 |
```
|
| 111 |
|
| 112 |
-
#
|
| 113 |
-
The instructions which you input to the model should follow the following format:
|
| 114 |
-
```text
|
| 115 |
-
[Generate by superfamily] Superfamily=<xxx>
|
| 116 |
-
or
|
| 117 |
-
[Determine superfamily] Seq=<yyy>
|
| 118 |
-
```
|
| 119 |
-
Here are some examples of the input:
|
| 120 |
-
```text
|
| 121 |
-
[Generate by superfamily] Superfamily=<Ankyrin repeat-containing domain superfamily>
|
| 122 |
-
```
|
| 123 |
-
```
|
| 124 |
-
#You can also specify the first few amino acids of the protein sequence:
|
| 125 |
-
[Generate by superfamily] Superfamily=<Ankyrin repeat-containing domain superfamily> Seq=<MKRVL
|
| 126 |
-
```
|
| 127 |
-
```
|
| 128 |
-
[Determine superfamily] Seq=<MAPGGMPREFPSFVRTLPEADLGYPALRGWVLQGERGCVLYWEAVTEVALPEHCHAECWGVVVDGRMELMVDGYTRVYTRGDLYVVPPQARHRARVFPGFRGVEHLSDPDLLPVRKR>
|
| 129 |
```
|
| 130 |
-
|
| 8 |
[Github](https://github.com/Lyu6PosHao/ProLLaMA) for more information
|
| 9 |
|
| 10 |
ProLLaMA is based on Llama-2-7b, so please follow the license of Llama2.
|
| 11 |
+
|
| 12 |
+
# Input Format:
|
| 13 |
+
Instructions given to the model must follow one of these two formats:
|
| 14 |
+
```text
|
| 15 |
+
[Generate by superfamily] Superfamily=<xxx>
|
| 16 |
+
or
|
| 17 |
+
[Determine superfamily] Seq=<yyy>
|
| 18 |
+
```
|
| 19 |
+
Here are some examples of the input:
|
| 20 |
+
```text
|
| 21 |
+
[Generate by superfamily] Superfamily=<Ankyrin repeat-containing domain superfamily>
|
| 22 |
+
```
|
| 23 |
+
```
|
| 24 |
+
# You can also specify the first few amino acids of the protein sequence:
|
| 25 |
+
[Generate by superfamily] Superfamily=<Ankyrin repeat-containing domain superfamily> Seq=<MKRVL
|
| 26 |
+
```
|
| 27 |
+
```
|
| 28 |
+
[Determine superfamily] Seq=<MAPGGMPREFPSFVRTLPEADLGYPALRGWVLQGERGCVLYWEAVTEVALPEHCHAECWGVVVDGRMELMVDGYTRVYTRGDLYVVPPQARHRARVFPGFRGVEHLSDPDLLPVRKR>
|
| 29 |
+
```
|
| 30 |
+
**See [superfamilies.txt](https://github.com/Lyu6PosHao/ProLLaMA/blob/main/superfamilies.txt) for the full list of supported superfamilies.**
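
The two instruction formats above can also be assembled programmatically. The following is a minimal sketch, not part of the ProLLaMA codebase: the helper names `generate_instruction` and `determine_instruction` are hypothetical, and the sketch assumes that a sequence prefix is left without a closing `>` (as in the `Seq=<MKRVL` example) so that the model continues the sequence.

```python
from typing import Optional


def generate_instruction(superfamily: str, seq_prefix: Optional[str] = None) -> str:
    """Build a '[Generate by superfamily]' instruction (hypothetical helper).

    If seq_prefix is given, it is appended as an unclosed 'Seq=<...' field,
    mirroring the README example. Leaving the bracket open is an assumption
    about how the model is expected to continue the sequence.
    """
    text = f"[Generate by superfamily] Superfamily=<{superfamily}>"
    if seq_prefix:
        text += f" Seq=<{seq_prefix}"
    return text


def determine_instruction(seq: str) -> str:
    """Build a '[Determine superfamily]' instruction (hypothetical helper)."""
    return f"[Determine superfamily] Seq=<{seq}>"


if __name__ == "__main__":
    print(generate_instruction("Ankyrin repeat-containing domain superfamily"))
    print(generate_instruction("Ankyrin repeat-containing domain superfamily", "MKRVL"))
    print(determine_instruction("MKRVL"))
```

The resulting strings can then be passed as input to the script shown under Quick usage.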
|
| 31 |
+
|
| 32 |
# Quick usage:
|
| 33 |
```bash
|
| 34 |
# you can replace the model_path with your local path
|
|
| 130 |
print("All the outputs have been saved in",args.output_file)
|
| 131 |
```
|
| 132 |
|
| 133 |
+
# Citation:
|
| 134 |
```
|
| 135 |
+
@article{lv2024prollama,
|
| 136 |
+
title={ProLLaMA: A Protein Large Language Model for Multi-Task Protein Language Processing},
|
| 137 |
+
author={Lv, Liuzhenghao and Lin, Zongying and Li, Hao and Liu, Yuyang and Cui, Jiaxi and Chen, Calvin Yu-Chian and Yuan, Li and Tian, Yonghong},
|
| 138 |
+
journal={arXiv preprint arXiv:2402.16445},
|
| 139 |
+
year={2024}
|
| 140 |
+
}
|
| 141 |
+
```
|