Update README.md
README.md
CHANGED
@@ -40,5 +40,6 @@ All notebooks pull pre-tokenized data from prism-lab/wikitext-103-prism-32k-seq4
 All models use tied embeddings (input embeddings = output projection weights). Checkpoint files contain duplicated weights for compatibility. Evaluation scripts redefine model classes with proper weight tying before loading.
 
 ## Additional Notes
-Phase exploration on embeddings will be replaced soon with stronger version which includes statistical
+Phase exploration on embeddings will be replaced soon with stronger version which includes statistical analysis.
+
 Causal ablation codes and models will be added soon.
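
The README context line in this diff says that the models use tied embeddings, that checkpoints store the shared weights twice, and that the evaluation scripts redefine the model classes with weight tying before loading. A minimal sketch of that pattern follows. It assumes PyTorch, which the diff does not confirm, and `TinyTiedLM`, its layer names, and the in-memory `checkpoint` dict are hypothetical stand-ins rather than the repository's actual classes or files.

```python
import torch
import torch.nn as nn


class TinyTiedLM(nn.Module):
    """Hypothetical toy model, not one of the repository's architectures."""

    def __init__(self, vocab_size: int = 32, d_model: int = 8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.lm_head = nn.Linear(d_model, vocab_size, bias=False)
        # Tie the weights: the output projection reuses the embedding Parameter.
        self.lm_head.weight = self.embed.weight

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.lm_head(self.embed(token_ids))


# Stand-in for a checkpoint that stores the shared matrix under both keys.
shared = torch.randn(32, 8)
checkpoint = {"embed.weight": shared, "lm_head.weight": shared.clone()}

model = TinyTiedLM()
# Both entries are copied in place into the same underlying Parameter,
# so the tie set up in __init__ survives loading.
model.load_state_dict(checkpoint)

is_tied = model.lm_head.weight is model.embed.weight
logits = model(torch.tensor([[1, 2, 3]]))
```

Because the class ties the weights before `load_state_dict` runs, the duplicated entries land in one shared tensor. If the two stored copies ever differed, whichever key is copied last would determine the shared values.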