Update README.md
README.md
CHANGED
@@ -95,15 +95,15 @@ print(model(inp)) # tensor([[19.1666]], grad_fn=<MulBackward0>)
 ### Training Procedure

 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-We used Open Reaction Database (ORD) dataset for model training.
-The command used for training is the following. For more information, please refer to the paper and GitHub repository.
+We used [Open Reaction Database (ORD) dataset](https://drive.google.com/file/d/1fa2MyLdN1vcA7Rysk8kLQENE92YejS9B/view?usp=drive_link) for model training. In addition, we used palladium-catalyzed Buchwald-Hartwig [C-N cross-coupling reactions dataset](https://yzhang.hpc.nyu.edu/T5Chem/index.html)'s test split to prevent data leakage.
+The command used for training is the following. For more information about data preprocessing and training, please refer to the paper and GitHub repository.

 ```python
 python train.py \
---train_data_path='
---valid_data_path='
---test_data_path='
---CN_test_data_path='
+--train_data_path='../data/preprocessed_ord_train.csv' \
+--valid_data_path='../data/preprocessed_ord_valid.csv' \
+--test_data_path='../data/preprocessed_ord_test.csv' \
+--CN_test_data_path='../data/C_N_yield/MFF_Test1/test.csv' \
 --epochs=100 \
 --batch_size=32 \
 --output_dir='./'