Commit 1a40f18 by Vaishnav14220 · 1 parent: 7303cbc

Add ORD forward reaction prediction demo app

Files changed (4)
  1. .gitignore +26 -0
  2. README.md +51 -6
  3. app.py +143 -0
  4. requirements.txt +6 -0
.gitignore ADDED
@@ -0,0 +1,26 @@
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ .env
+ .venv
+ env/
+ venv/
+ ENV/
+ .DS_Store
README.md CHANGED
@@ -1,13 +1,58 @@
  ---
- title: Demo Space Forward
- emoji: 📉
- colorFrom: gray
- colorTo: blue
+ title: ORD Forward Reaction Prediction
+ emoji: 🧪
+ colorFrom: blue
+ colorTo: green
  sdk: gradio
  sdk_version: 5.49.1
  app_file: app.py
  pinned: false
- license: mit
+ license: apache-2.0
+ models:
+ - smitathkr1/ord-forward-t5
+ datasets:
+ - smitathkr1/ord-reactions
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # ORD Forward Reaction Prediction - T5 Model
+
+ This is a demo space for testing the `smitathkr1/ord-forward-t5` model, which predicts chemical reaction products from reactants.
+
+ ## Model Details
+
+ - **Model**: T5-based forward reaction prediction model
+ - **Training Data**: 2.3M reactions from the Open Reaction Database (ORD)
+ - **Training Status**: 5 epochs completed
+ - **Dataset**: [smitathkr1/ord-reactions](https://huggingface.co/datasets/smitathkr1/ord-reactions)
+
+ ## Usage
+
+ Enter reactants in SMILES format (e.g., `CC(C)N1CCN(C)CC1.Brc1ccccc1`) and the model will predict the product.
+
+ ### Input Format
+ - Reactants should be in SMILES notation
+ - Multiple reactants can be separated by '.'
+ - Example: `amine.aryl_halide` → `product`
+
+ ### Parameters
+ - **Max Length**: Maximum length of the generated product SMILES
+ - **Num Beams**: Number of beams for beam search (higher = more thorough search)
+ - **Temperature**: Sampling temperature (higher = more diverse outputs)
+
+ ## Examples
+
+ The model was trained on various reaction types from the ORD database, including:
+ - Buchwald-Hartwig amination reactions
+ - Cross-coupling reactions
+ - And many other organic reactions
+
+ ## Citation
+
+ If you use this model, please cite:
+ - The Open Reaction Database: https://open-reaction-database.org/
+ - Model: https://huggingface.co/smitathkr1/ord-forward-t5
+ - Dataset: https://huggingface.co/datasets/smitathkr1/ord-reactions
+
+ ## License
+
+ Apache 2.0
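The input format described in the README (reactant SMILES joined by `.`) can be sketched in a few lines. This is a minimal illustration and not code from the commit: `split_reactants` is a hypothetical helper, and it only rejects empty components and embedded whitespace. It does not validate SMILES chemistry, which would need a cheminformatics toolkit.

```python
def split_reactants(reactants_smiles: str) -> list[str]:
    """Split a dot-separated reactant string into individual SMILES.

    Hypothetical helper for illustration; not part of app.py.
    """
    text = reactants_smiles.strip()
    if not text:
        raise ValueError("no reactants given")
    parts = text.split(".")
    # An empty component means a leading, trailing, or doubled '.'.
    if any(not part for part in parts):
        raise ValueError("empty reactant in input")
    # SMILES strings do not contain whitespace.
    if any(any(ch.isspace() for ch in part) for part in parts):
        raise ValueError("whitespace inside a reactant")
    return parts


print(split_reactants("CC(C)N1CCN(C)CC1.Brc1ccccc1"))
```

Note that `.` separates disconnected components in SMILES generally, so a salt written as `[Na+].[Cl-]` would also be split into two parts by this sketch.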
app.py ADDED
@@ -0,0 +1,143 @@
+ import gradio as gr
+ from transformers import T5ForConditionalGeneration, T5Tokenizer
+ import torch
+
+ # Load the model and tokenizer
+ model_name = "smitathkr1/ord-forward-t5"
+ print(f"Loading model: {model_name}")
+ tokenizer = T5Tokenizer.from_pretrained(model_name)
+ model = T5ForConditionalGeneration.from_pretrained(model_name)
+
+ # Set device
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+ model.to(device)
+ model.eval()
+
+ def predict_reaction(reactants_smiles, max_length=150, num_beams=5, temperature=1.0):
+     """
+     Predict the product of a chemical reaction given reactants in SMILES format.
+
+     Args:
+         reactants_smiles: SMILES string of reactants (can be separated by '.')
+         max_length: Maximum length of generated sequence
+         num_beams: Number of beams for beam search
+         temperature: Sampling temperature
+
+     Returns:
+         Predicted product SMILES
+     """
+     try:
+         # Prepare input
+         input_text = reactants_smiles.strip()
+
+         # Tokenize input
+         inputs = tokenizer(
+             input_text,
+             return_tensors="pt",
+             max_length=512,
+             truncation=True,
+             padding=True
+         ).to(device)
+
+         # Generate prediction
+         with torch.no_grad():
+             outputs = model.generate(
+                 inputs["input_ids"],
+                 max_length=max_length,
+                 num_beams=num_beams,
+                 temperature=temperature,
+                 early_stopping=True,
+                 do_sample=False
+             )
+
+         # Decode output
+         predicted_product = tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+         return predicted_product
+
+     except Exception as e:
+         return f"Error: {str(e)}"
+
+ # Example inputs from ORD dataset
+ examples = [
+     ["CC(C)N1CCN(C)CC1.Brc1ccccc1"],  # Buchwald-Hartwig amination example
+     ["CCN1CCNCC1.Ic1ccccc1"],  # Another coupling reaction
+     ["CC(=O)N1CCNCC1.Clc1ccccc1"],  # Chloro coupling
+ ]
+
+ # Create Gradio interface
+ iface = gr.Interface(
+     fn=predict_reaction,
+     inputs=[
+         gr.Textbox(
+             label="Reactants (SMILES)",
+             placeholder="Enter reactants in SMILES format (e.g., CC(C)N1CCN(C)CC1.Brc1ccccc1)",
+             lines=2
+         ),
+         gr.Slider(
+             minimum=50,
+             maximum=300,
+             value=150,
+             step=10,
+             label="Max Length"
+         ),
+         gr.Slider(
+             minimum=1,
+             maximum=10,
+             value=5,
+             step=1,
+             label="Num Beams"
+         ),
+         gr.Slider(
+             minimum=0.1,
+             maximum=2.0,
+             value=1.0,
+             step=0.1,
+             label="Temperature"
+         )
+     ],
+     outputs=gr.Textbox(
+         label="Predicted Product (SMILES)",
+         lines=2
+     ),
+     examples=examples,
+     title="🧪 ORD Forward Reaction Prediction - T5 Model",
+     description="""
+     ## Forward Reaction Prediction using T5
+
+     This model predicts chemical reaction products from reactants using a T5 model trained on 2.3M reactions from the Open Reaction Database (ORD).
+
+     **Model:** `smitathkr1/ord-forward-t5` (5 epochs completed)
+
+     **Dataset:** [smitathkr1/ord-reactions](https://huggingface.co/datasets/smitathkr1/ord-reactions)
+
+     ### How to use:
+     1. Enter reactants in SMILES format (separate multiple reactants with '.')
+     2. Adjust generation parameters if needed
+     3. Click Submit to get the predicted product
+
+     ### Example reactions:
+     - Buchwald-Hartwig amination reactions
+     - Various coupling reactions from the ORD database
+     """,
+     article="""
+     ### About the Model
+     This T5 model was trained on 2.3 million reactions from the Open Reaction Database.
+     The training has completed 5 epochs so far.
+
+     ### Citation
+     If you use this model, please cite the Open Reaction Database:
+     - [Open Reaction Database](https://open-reaction-database.org/)
+
+     ### Notes
+     - Input should be valid SMILES strings
+     - The model predicts forward reactions (reactants → products)
+     - Adjust beam search parameters for different prediction strategies
+     """,
+     theme=gr.themes.Soft(),
+     allow_flagging="never"
+ )
+
+ # Launch the app
+ if __name__ == "__main__":
+     iface.launch()
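One detail in `predict_reaction` is worth a second look: it passes `temperature` together with `do_sample=False`. To my understanding, Hugging Face `generate` applies temperature only when sampling is enabled, so the Temperature slider likely has no effect on the beam-search output here. The standard-library sketch below shows the underlying reason for a single decoding step: dividing logits by a positive temperature reshapes the probabilities but does not change which token ranks first. The logits are made-up values, not model output.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits scaled by 1/temperature (temperature > 0)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # illustrative values only

for t in (0.5, 1.0, 2.0):
    probs = softmax(logits, t)
    best = max(range(len(probs)), key=probs.__getitem__)
    print(f"T={t}: top index={best}, probs={[round(p, 3) for p in probs]}")
```

If varied outputs are wanted, one option would be to enable sampling (`do_sample=True`) so that the temperature takes effect; whether that suits this model is a separate question.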
requirements.txt ADDED
@@ -0,0 +1,6 @@
+ transformers>=4.30.0
+ torch>=2.0.0
+ gradio>=4.0.0
+ sentencepiece
+ protobuf
+ accelerate
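To check a local environment against these requirements, a small standard-library sketch follows. `parse_requirement`, `version_tuple`, and `check` are hypothetical helpers written for this illustration. They handle only the bare-name and `name>=version` forms used in this file, and the numeric comparison is a simplification of real version semantics (the `packaging` library covers the full rules).

```python
import itertools
from importlib import metadata

REQUIREMENTS = """\
transformers>=4.30.0
torch>=2.0.0
gradio>=4.0.0
sentencepiece
protobuf
accelerate
"""


def parse_requirement(line):
    """Return (name, minimum_version or None) for 'name' or 'name>=x.y.z'."""
    name, sep, minimum = line.strip().partition(">=")
    return name.strip(), (minimum.strip() if sep else None)


def version_tuple(version):
    """Leading numeric components of a version string, e.g. '2.1.0+cu118' -> (2, 1, 0)."""
    parts = []
    for piece in version.split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check(requirements):
    """Print the installed version of each requirement and whether it meets the minimum."""
    for line in requirements.splitlines():
        if not line.strip():
            continue
        name, minimum = parse_requirement(line)
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            print(f"{name}: not installed")
            continue
        ok = minimum is None or version_tuple(installed) >= version_tuple(minimum)
        print(f"{name}: {installed} ({'ok' if ok else 'below ' + minimum})")


if __name__ == "__main__":
    check(REQUIREMENTS)
```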