Update README.md
README.md
CHANGED
|
@@ -22,7 +22,11 @@ base_model:
|
|
| 22 |
|
| 23 |
This will be the real prototype; fingerprinting was the earlier idea, and the full upcoming prototype is ready for training.
|
| 24 |
|
| 25 |
-
|
| 26 |
|
| 27 |
This marks the first use of a new prototype object dubbed AnchorBank, which is designed specifically to house the necessary implications that the model is distilled with,
|
| 28 |
while specifically aligning the expectation of those distillation valuations into the bank itself.
|
|
@@ -30,6 +34,11 @@ while specifically aligning the expectation of those distillation valuations int
|
|
| 30 |
This allows the model to POTENTIALLY solve nth-token lookup without a head, so adding a head will allow finetuning. If successful, the anchor bank will contain
|
| 31 |
all the knowledge the model requires to geometrically represent its data in expanded structures, if the losses and training process are correctly aligned to the task.
|
| 32 |
|
| 33 |
# GEOLIP CaptionBERT-8192-fingerprinted
|
| 34 |
|
| 35 |
The next iteration will require an expanded fingerprinting axis-based relational bank, specifically for the alignment of the data and the teachers at training time.
|
|
|
|
| 22 |
|
| 23 |
This will be the real prototype; fingerprinting was the earlier idea, and the full upcoming prototype is ready for training.
|
| 24 |
|
| 25 |
+
https://huggingface.co/AbstractPhil/geolip-axis-prototype
|
| 26 |
+
|
| 27 |
+
The example code and prototype axis modulators are present there as they are, and they will be utilized throughout upcoming experiments.
|
| 28 |
+
|
| 29 |
+
For CaptionBERT, upcoming checkpoints will be pushed once the process succeeds; roughly 1 hour per epoch for 5 epochs or so should be more than enough.
|
| 30 |
|
| 31 |
This marks the first use of a new prototype object dubbed AnchorBank, which is designed specifically to house the necessary implications that the model is distilled with,
|
| 32 |
while specifically aligning the expectation of those distillation valuations into the bank itself.
|
|
|
|
| 34 |
This allows the model to POTENTIALLY solve nth-token lookup without a head, so adding a head will allow finetuning. If successful, the anchor bank will contain
|
| 35 |
all the knowledge the model requires to geometrically represent its data in expanded structures, if the losses and training process are correctly aligned to the task.
|
| 36 |
|
| 37 |
+
**HOPEFULLY** after this refit, the structure will be capable of NIL-head token prediction. If not, I'll work with a different small LLM project and then
|
| 38 |
+
determine the potential utility of direct integration of the two on a MOE pipeline instead of a full collective behavioral implication.
|
| 39 |
+
|
| 40 |
+
If that goes well, the MOE can be adapted into collective behavior if the systems align correctly, but that's a different process.
|
| 41 |
+
|
| 42 |
# GEOLIP CaptionBERT-8192-fingerprinted
|
| 43 |
|
| 44 |
The next iteration will require an expanded fingerprinting axis-based relational bank, specifically for the alignment of the data and the teachers at training time.
|