Update README.md
|
|
|
|
base_model:
- AbstractPhil/geolip-bertenstein
---

# GEOLIP CaptionBERT-8192-fingerprinted

The next iteration will require a fingerprint expanded specifically to match the alignment of the data and the teachers at training time.

Differentiating between what is learned and what is retained, expert to expert, will let this fingerprint preserve the student model's integrity, which should allow cross-entropy training without complete geometric collapse or rapid overfitting.
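One way to read this: fine-tuning with a plain cross-entropy head tends to drag every embedding toward the task classes and flatten the distilled geometry. A minimal sketch of the preservation idea — the function name, the anchor mechanism, and the weighting are illustrative assumptions, not this repo's actual code:

```python
import torch
import torch.nn.functional as F

def anchored_ce_loss(logits, labels, emb, anchor_emb, lam=0.1):
    """Cross-entropy plus a penalty that keeps the current embeddings
    near their fingerprint anchors (reference embeddings).

    logits:     (batch, classes) head outputs.
    labels:     (batch,) target class ids.
    emb:        (batch, dim) current student embeddings.
    anchor_emb: (batch, dim) anchor embeddings for the same inputs,
                detached so only the student moves toward them.
    lam:        weight of the geometry-preservation term (assumed).
    """
    ce = F.cross_entropy(logits, labels)
    # Cosine penalty: 0 when an embedding still points at its anchor.
    keep = (1.0 - F.cosine_similarity(emb, anchor_emb.detach(), dim=-1)).mean()
    return ce + lam * keep
```

With `lam = 0` this reduces to plain cross-entropy; raising `lam` trades task fit against keeping the distilled geometry intact.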

As it stands, this model is too rigid to train heads on, but I will improve it directly today and instill a core memory of geometry.

This geometry will be ever-learning: whenever the core model trains against any experts, the fingerprint bank must train as well. This geometry houses the entire internalized, geometrically anchored embedding-fingerprint spectrum, and it will likely evolve over the coming hours until the functional prototype comes to full fruition. Wish me luck as I design the reusable compact mechanism.
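The "bank must train as well" idea can be sketched as a set of per-expert anchor vectors that drift by exponential moving average every time the core model takes a step against that expert, rather than staying frozen. The class, its shapes, and the EMA rule below are all assumptions for illustration:

```python
import torch

class FingerprintBank:
    """Hypothetical per-expert anchor store that drifts with the student.

    One anchor vector per (expert, slot); whenever the student trains
    against an expert, that expert's touched anchors are pulled toward
    the fresh embeddings by EMA instead of remaining fixed.
    """

    def __init__(self, num_experts, num_slots, dim, momentum=0.99):
        self.anchors = torch.zeros(num_experts, num_slots, dim)
        self.momentum = momentum

    @torch.no_grad()
    def update(self, expert_id, slot_ids, new_embs):
        # EMA: anchor <- m * anchor + (1 - m) * fresh embedding.
        old = self.anchors[expert_id, slot_ids]
        self.anchors[expert_id, slot_ids] = (
            self.momentum * old + (1.0 - self.momentum) * new_embs
        )

# Toy usage: expert 2 trains on two slots; only those anchors move.
bank = FingerprintBank(num_experts=5, num_slots=128, dim=64)
bank.update(expert_id=2, slot_ids=torch.tensor([0, 1]),
            new_embs=torch.ones(2, 64))
```

The high momentum makes the bank a slow follower of the student, which is one simple way to keep the fingerprint stable while still letting it learn.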
# GEOLIP CaptionBERT-8192

A 26M-parameter caption encoder whose embedding space is the geometric intersection of five independently trained language models. Trained from scratch via consensus distillation — no pretrained weights, no expert models at inference.
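One plausible reading of consensus distillation is training the student to match the normalized centroid of the five teachers' embeddings, with gradients flowing only into the student. The sketch below rests on that assumption; the function, shapes, and shared projection dim are illustrative, not taken from this repo:

```python
import torch
import torch.nn.functional as F

def consensus_distill_loss(student_emb, teacher_embs):
    """Pull the student embedding toward the teachers' consensus.

    student_emb:  (batch, dim) embeddings from the student encoder.
    teacher_embs: list of (batch, dim) teacher embeddings, assumed
                  already projected into one shared dimension.
    """
    # Consensus = normalized centroid of the teacher embeddings.
    consensus = torch.stack(teacher_embs).mean(dim=0)
    consensus = F.normalize(consensus, dim=-1)
    student = F.normalize(student_emb, dim=-1)
    # Cosine distance between student and consensus direction.
    return (1.0 - (student * consensus).sum(dim=-1)).mean()

# Toy usage with five random "teachers".
torch.manual_seed(0)
s = torch.randn(4, 64, requires_grad=True)
teachers = [torch.randn(4, 64) for _ in range(5)]
loss = consensus_distill_loss(s, teachers)
loss.backward()  # gradients reach the student only; teachers are frozen
```

Because only the averaged target is needed, the teachers can be dropped after training, matching the "no expert models at inference" property above.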