nuriamimbreropelegri committed on
Commit
d4218c2
·
verified ·
1 Parent(s): 8b5718b

Update README.md

  1. README.md +5 -8
README.md CHANGED
@@ -60,13 +60,6 @@ However, since we are tokenizing the reaction SMILES on a character level,
60
  the model has learnt dependencies among molecules and enzyme sequence features, and it can transfer learning from more to less populated
61
  reaction classes.
62
 
63
- ## **Model Performance**
64
-
65
- - **Dataset curation**
66
- We converted the reactions from rxn format to smile string including only left-to-right reactions.
67
- The enzyme sequences were truncated to 1024.
68
- Enzymes catalyzing more than one reaction appear in multiple enzyme-reaction pairs.
69
-
70
 
71
 
72
  ## **How to generate from REXzyme**
@@ -80,6 +73,10 @@ Usually the BLEU score is deployed for translation evaluation,
80
  but this score would enforce a high sequence similarity (thus not *de novo* design, which is what we tend to go for).
81
  We recommend generating many sequences and selecting them by pLDDT, as well as other metrics.
82
 
 
 
 
 
83
  ```python
84
  """Inference on a SMILES txt. Saved as fastas
85
  Previously called generate_comparison"""
@@ -91,7 +88,7 @@ if __name__ == '__main__':
91
  import torch
92
  import json
93
 
94
- parser = argparse.ArgumentParser(description='Mol2Pro inference',
95
  formatter_class=argparse.ArgumentDefaultsHelpFormatter)
96
  parser.add_argument('--input_file', default='../inference/random_smiles2.txt', type=str,
97
  help='File with the input molecule SMILES')
 
60
  the model has learnt dependencies among molecules and enzyme sequence features, and it can transfer learning from more to less populated
61
  reaction classes.
62
 
 
 
 
 
 
 
 
63
 
64
 
65
  ## **How to generate from REXzyme**
 
73
  but this score would enforce a high sequence similarity (thus not *de novo* design, which is what we tend to go for).
74
  We recommend generating many sequences and selecting them by pLDDT, as well as other metrics.
75
 
76
+ Before running the inference script, create a text file containing the desired input reaction SMILES. Note that if the file contains multiple reaction SMILES
77
+ on separate lines, the model will generate sequences for each reaction independently, creating a separate output file for each of them.
78
+
79
+
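As a minimal sketch of preparing such an input file, the one-reaction-per-line layout could be written and checked as follows. The helper names, the file name `reactions.txt`, and the example reaction SMILES are illustrative assumptions and are not part of the REXzyme scripts.

```python
from pathlib import Path


def write_smiles_file(reactions, path):
    """Write one reaction SMILES per line, skipping blank entries.

    Returns the number of reactions written.
    """
    lines = [r.strip() for r in reactions if r.strip()]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def read_smiles_file(path):
    """Return the non-empty lines of the file, one reaction SMILES each."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


# Placeholder reactions for illustration only (hypothetical inputs).
example_reactions = [
    "CCO.CC(=O)O>>CC(=O)OCC.O",
    "CC(=O)OCC.O>>CCO.CC(=O)O",
]

if __name__ == "__main__":
    count = write_smiles_file(example_reactions, "reactions.txt")
    print(f"Wrote {count} reaction SMILES, one per line")
```

The resulting file can then be passed to the inference script through its `--input_file` argument.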
80
  ```python
81
  """Inference on a SMILES txt. Saved as fastas
82
  Previously called generate_comparison"""
 
88
  import torch
89
  import json
90
 
91
+ parser = argparse.ArgumentParser(description='inference',
92
  formatter_class=argparse.ArgumentDefaultsHelpFormatter)
93
  parser.add_argument('--input_file', default='../inference/random_smiles2.txt', type=str,
94
  help='File with the input molecule SMILES')