Ramon Gougis committed on
Update README.md
README.md
CHANGED
@@ -51,7 +51,18 @@ https://github.com/ramongougis/WaveletLM
 
 
 ## Training
-Trained on a single RTX 5090 for 5 epochs on WikiText-103
+- Trained on a single RTX 5090 for 5 epochs on WikiText-103.
+- Best PPL of 23.749 with mean PPL of 23.818 across 3 seeds.
+- VRAM required: 18.3 GB.
+- Time to train: 16 hours 15 minutes.
+
+## Generation
+- VRAM required: 5.0 GB by default, 4.5 GB with PTQ enabled.
+- Can set `compile:false` to save 0.5-1 GB, but it's slower.
+- 28.8 tokens/s. on a 5090 by default.
+- Future enhancements expected to lower VRAM required and increase tokens/s by up to 120%.
+- WikiText-103 weights available now.
+- PG-19 1-epoch run is in progress.
 
 See <a href="https://github.com/ramongougis/WaveletLM/blob/main/runs.md">runs.md</a> for the full training history.
 
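
The figures added in this diff can be sanity-checked with a little arithmetic. The sketch below is not part of WaveletLM. The function names are hypothetical, and it assumes "increase tokens/s by up to 120%" means a relative increase over the 28.8 tokens/s baseline (so an upper bound of 2.2x), which the README does not spell out.

```python
# Back-of-the-envelope checks on the numbers quoted in the README diff.
# All helper names are illustrative; none come from the WaveletLM repository.

BASELINE_TOKENS_PER_S = 28.8  # generation speed on an RTX 5090, default settings
MAX_INCREASE_PCT = 120.0      # assumption: a relative increase, not "120% of baseline"
BEST_PPL = 23.749             # best perplexity among the seeds
MEAN_PPL = 23.818             # mean perplexity across all seeds
NUM_SEEDS = 3


def projected_tokens_per_s(baseline: float, increase_pct: float) -> float:
    """Upper-bound throughput after a relative increase of `increase_pct` percent."""
    return baseline * (1.0 + increase_pct / 100.0)


def mean_of_remaining_seeds(best: float, mean: float, n: int) -> float:
    """Mean perplexity of the n - 1 seeds other than the best one."""
    return (mean * n - best) / (n - 1)


def seconds_to_generate(n_tokens: int, tokens_per_s: float) -> float:
    """Wall-clock seconds needed to emit `n_tokens` at a steady rate."""
    return n_tokens / tokens_per_s


if __name__ == "__main__":
    upper = projected_tokens_per_s(BASELINE_TOKENS_PER_S, MAX_INCREASE_PCT)
    others = mean_of_remaining_seeds(BEST_PPL, MEAN_PPL, NUM_SEEDS)
    print(f"projected upper bound: {upper:.2f} tokens/s")
    print(f"mean PPL of the other two seeds: {others:.4f}")
    print(f"1000 tokens at baseline: {seconds_to_generate(1000, BASELINE_TOKENS_PER_S):.1f} s")
```

Under that reading, the projected ceiling is about 63.4 tokens/s, and the two non-best seeds average a perplexity of roughly 23.85, which suggests the spread across seeds is small.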