Update README.md
README.md
CHANGED
@@ -50,14 +50,14 @@ https://github.com/ramongougis/WaveletLM
 
 
 ## Training
-- Trained on a single RTX 5090 for 5 epochs
-
+- Trained on a single RTX 5090 for 5 epochs
+- WikiText-103: best PPL of 23.749 with mean PPL of 23.818 across 3 seeds.
+- PG-19: PPL of 27.40 (single seed).
 - VRAM required: 18.3 GB.
 - Time to train: 16 hours 15 minutes.
-- PG-19 run in progress.
 
 ## Generation
-
+- VRAM: 5.0 GB by default, 4.5 GB with `--ptq8` enabled.
 - Can set `compile:false` to save 0.5-1 GB, but it's slower.
 - 28.8 tokens/s. on a 5090 by default.
 - Future enhancements expected to increase speed by up to 120%.
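The generation figures in this change can be put in concrete terms with a little arithmetic. The sketch below is illustrative only: it assumes "increase speed by up to 120%" means a 120% gain over the current 28.8 tokens/s (2.2x), which the README does not spell out, and the function names are hypothetical rather than part of WaveletLM.

```python
# Figures quoted in the README change (hardware: single RTX 5090).
BASELINE_TOKENS_PER_S = 28.8   # default generation speed
MAX_SPEEDUP_PERCENT = 120.0    # stated upper bound, not a measurement
VRAM_DEFAULT_GB = 5.0          # generation VRAM by default
VRAM_PTQ8_GB = 4.5             # generation VRAM with --ptq8 enabled


def projected_tokens_per_s(baseline: float, increase_percent: float) -> float:
    """Speed after a gain of `increase_percent` percent over `baseline`.

    Assumption: the percentage is a gain on top of the baseline,
    so 120% corresponds to a 2.2x multiplier.
    """
    return baseline * (1.0 + increase_percent / 100.0)


def vram_saving_percent(default_gb: float, reduced_gb: float) -> float:
    """Relative VRAM saving of the reduced setting, in percent."""
    return (default_gb - reduced_gb) / default_gb * 100.0


if __name__ == "__main__":
    speed = projected_tokens_per_s(BASELINE_TOKENS_PER_S, MAX_SPEEDUP_PERCENT)
    saving = vram_saving_percent(VRAM_DEFAULT_GB, VRAM_PTQ8_GB)
    print(f"Projected upper bound: {speed:.2f} tokens/s")
    print(f"VRAM saved with --ptq8: {saving:.1f}%")
```

Under that reading, the upper bound works out to roughly 63.4 tokens/s, and `--ptq8` trims generation VRAM by about 10%. If the 120% figure is instead meant as the final speed relative to today's, the ceiling would be closer to 34.6 tokens/s.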