sagawa/ReactionT5v2-yield

sagawa committed on Sep 9, 2024

Commit 38930a7 (verified) · 1 Parent(s): 585159a

Update README.md

Files changed (1)

README.md +6 -6

@@ -95,15 +95,15 @@ print(model(inp)) # tensor([[19.1666]], grad_fn=<MulBackward0>)
 ### Training Procedure
 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-We used Open Reaction Database (ORD) dataset for model training.
-The command used for training is the following. For more information, please refer to the paper and GitHub repository.
 ```python
 python train.py \
-    --train_data_path='/home/acf15718oa/ReactionT5_neword/data/all_ord_reaction_uniq_with_attr20240506_v3_train.csv' \
-    --valid_data_path='/home/acf15718oa/ReactionT5_neword/data/all_ord_reaction_uniq_with_attr20240506_v3_valid.csv' \
-    --test_data_path='/home/acf15718oa/ReactionT5_neword/data/all_ord_reaction_uniq_with_attr20240506_v3_test.csv' \
-    --CN_test_data_path='/home/acf15718oa/ReactionT5_neword/data/C_N_yield/MFF_Test1/test.csv' \
     --epochs=100 \
     --batch_size=32 \
     --output_dir='./'

 ### Training Procedure
 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+We used the [Open Reaction Database (ORD) dataset](https://drive.google.com/file/d/1fa2MyLdN1vcA7Rysk8kLQENE92YejS9B/view?usp=drive_link) for model training. In addition, we used the test split of the palladium-catalyzed Buchwald-Hartwig [C-N cross-coupling reactions dataset](https://yzhang.hpc.nyu.edu/T5Chem/index.html) to prevent data leakage.
+The command used for training is the following. For more information about data preprocessing and training, please refer to the paper and GitHub repository.
 ```python
 python train.py \
+    --train_data_path='../data/preprocessed_ord_train.csv' \
+    --valid_data_path='../data/preprocessed_ord_valid.csv' \
+    --test_data_path='../data/preprocessed_ord_test.csv' \
+    --CN_test_data_path='../data/C_N_yield/MFF_Test1/test.csv' \
     --epochs=100 \
     --batch_size=32 \
     --output_dir='./'