Ramon Gougis committed on
Commit
48c99a4
·
verified ·
1 Parent(s): 31e4e3d

Update README.md

  1. README.md +12 -1
README.md CHANGED
@@ -51,7 +51,18 @@ https://github.com/ramongougis/WaveletLM
  ![WaveletLM architecture](https://raw.githubusercontent.com/ramongougis/WaveletLM/main/assets/waveletlm-architecture.svg)
 
  ## Training
- - Trained on a single RTX 5090 for 5 epochs on WikiText-103 (best of 3 seeds: 1337, 42, 7). Best validation loss: 3.16. PG-19 weights also included (1-epoch run; longer training planned post-release).
+ - Trained on a single RTX 5090 for 5 epochs on WikiText-103.
+ - Best PPL of 23.749 with mean PPL of 23.818 across 3 seeds.
+ - VRAM required: 18.3 GB.
+ - Time to train: 16 hours 15 minutes.
+
+ ## Generation
+ - VRAM required: 5.0 GB by default, 4.5 GB with PTQ enabled.
+ - Can set `compile:false` to save 0.5-1 GB, but it's slower.
+ - 28.8 tokens/s. on a 5090 by default.
+ - Future enhancements expected to lower VRAM required and increase tokens/s by up to 120%.
+ - WikiText-103 weights available now.
+ - PG-19 1-epoch run is in progress.
 
  See <a href="https://github.com/ramongougis/WaveletLM/blob/main/runs.md">runs.md</a> for the full training history.
 
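
The removed line reports a validation loss while the added lines report perplexity (PPL), so the two figures are not directly comparable as written. The sketch below shows one way to relate them, along with the arithmetic behind the training-time and throughput bullets. It rests on assumptions the README does not state: that PPL is the exponential of the mean per-token cross-entropy in nats, and that "up to 120%" means a relative increase over the 28.8 tokens/s baseline. All function names are hypothetical helpers, not part of WaveletLM.

```python
import math


def perplexity_from_loss(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity.

    Assumption: the reported loss uses the natural logarithm.
    """
    return math.exp(loss_nats)


def loss_from_perplexity(ppl: float) -> float:
    """Inverse of perplexity_from_loss: perplexity back to loss in nats."""
    return math.log(ppl)


def projected_tokens_per_s(baseline: float, percent_increase: float) -> float:
    """Throughput after a relative increase given in percent.

    Assumption: "increase by up to 120%" is relative to the baseline,
    so the upper bound is baseline * 2.2.
    """
    return baseline * (1.0 + percent_increase / 100.0)


def hours_per_epoch(hours: int, minutes: int, epochs: int) -> float:
    """Average wall-clock hours per epoch for a run of the given length."""
    return (hours + minutes / 60.0) / epochs


if __name__ == "__main__":
    # Figures taken from the diff above.
    print(f"PPL for loss 3.16:       {perplexity_from_loss(3.16):.3f}")
    print(f"Loss for PPL 23.749:     {loss_from_perplexity(23.749):.4f}")
    print(f"Upper-bound tokens/s:    {projected_tokens_per_s(28.8, 120):.2f}")
    print(f"Hours per epoch (5 ep.): {hours_per_epoch(16, 15, 5):.2f}")
```

Under these assumptions the old loss of 3.16 maps to a PPL of roughly 23.6, close to the newly reported 23.749. The small gap could come from rounding of the loss or from a different evaluation setup; the README does not say which.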