Update README.md
README.md
CHANGED
@@ -59,9 +59,9 @@ This pretraining data will not be opened to public due to Twitter policy.
 We train the data with 3 epochs and total steps of 296598 for 12 days.
 The following are the results obtained from the training:
 
-| train loss |
-
-| 3.5057 | 3.0559 | 21.2398
+| train loss | eval loss | eval perplexity |
+|------------|------------|-----------------|
+| 3.5057 | 3.0559 | 21.2398 |
 
 ## How to use
 ### Load model and tokenizer