Update README.md
README.md CHANGED

```diff
@@ -18,6 +18,14 @@
 |y | weight decay | 1e-5 |
 |iter | iterations | 570000 |
 
+## Model files
+
+| Filename | Description |
+| ------- | ------- |
+| ckpt.pt | A model file for finetuning |
+| ckpt_base.pt | A model file for generating a syntax tree with error correction in the zero-shot setting |
+| ckpt_finetune.pt | A model finetuned with the syntactic error dataset |
+
 - Note that you can adjust the batch size and accumulation steps based on your GPU memory, but batch size * accumulation steps should equal 128.
 - If you finetune your models with multiple GPUs, you can reduce the accumulation steps accordingly. For example, if you finetune with 2 GPUs, halve the accumulation steps.
```
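The notes above fix the effective batch size at 128, so the accumulation steps follow from the per-GPU batch size and GPU count. A minimal sketch of that arithmetic, assuming the constraint is per-GPU batch size × accumulation steps × number of GPUs = 128 (the function name and signature are illustrative, not from the repository):

```python
def accumulation_steps(per_gpu_batch_size: int, num_gpus: int = 1,
                       effective_batch: int = 128) -> int:
    """Return the gradient-accumulation steps that keep the effective
    batch size (per-GPU batch * accumulation steps * GPUs) fixed."""
    samples_per_step = per_gpu_batch_size * num_gpus
    if effective_batch % samples_per_step != 0:
        raise ValueError("per-GPU batch size * num GPUs must divide 128")
    return effective_batch // samples_per_step

# Doubling the GPU count halves the accumulation steps, as the note says.
print(accumulation_steps(16, num_gpus=1))  # 8
print(accumulation_steps(16, num_gpus=2))  # 4
```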