Update README.md
README.md
CHANGED
@@ -6,7 +6,7 @@ This is a repo specifically dedicated to analyzing embedding spaces for perfect
 We're going to get to the bottom of why embeddings are so symmetrically sample-sturdy and can be assessed with CV deterministically on random,
 and with that compare the deconstructive nature of collapsing embeddings into more unified compacted spaces for MHA heads.
 
-The bands show; D16 and
+The bands show; D16 and D36 are directly CV sample-capable for volume validity, 64 dim being the utmost upper bounds for distance before
 full degredation and breakdown using LINALG.det
 
 Generally the rule of thumb for MHA head counts is 64 dims per head and scale from there. I'm thinking there may be a direct causal response