base_model:
- AbstractPhil/geolip-bertenstein
---

# Newest: Prepping 12M Conceptual Captions BERT extractions, aka 36M full extractions

The dataset will be stored as .pt chunks: they load directly to VRAM almost instantly in Colab, and the system iterates over them faster than conventional dataloaders.
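A minimal sketch of what such a chunk workflow could look like, assuming PyTorch; the chunk contents, keys, and file name here are illustrative, not this repo's actual layout:

```python
import torch

# Hypothetical chunk layout: each .pt file holds a dict of pre-extracted
# embeddings plus labels (names are illustrative, not the repo's schema).
chunk = {
    "embeddings": torch.randn(4, 768),  # 4 example embedding vectors
    "labels": torch.arange(4),
}
torch.save(chunk, "chunk_000.pt")

# Loading the whole chunk straight onto the training device in one call
# skips the per-sample collation a DataLoader would otherwise perform.
device = "cuda" if torch.cuda.is_available() else "cpu"
loaded = torch.load("chunk_000.pt", map_location=device)
print(loaded["embeddings"].shape)
```

On a CUDA runtime, `map_location="cuda"` materializes the tensors on the GPU as part of deserialization, which is where the near-instant load comes from.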

I'll be running the full 12M set on all three caption types, no exceptions: short LLaVA, long LLaVA, and the original captions.

After the 36M-sample, five-expert dataset training completes, the core model will be ready.

It's legitimately wild watching the system sit at 100% validation accuracy, but reaching that requires additional complexity, so accuracy isn't the measure to analyze.
The recall problem is solved, but the internal structure's geometric system still needs to align with the larger spectrum of rigidity that the smooth-manifold deviations require for full cohesion, which means more data.

36 million samples over roughly 10 epochs should be a fair assessment. Hopefully that isn't too much data.

Saturating the internals of the anchor and the subsystem will allow for more complex processes and easy alignment with pieces of the data. After that, it will be quite fast to sample the most accurate captions and begin forming ViT associations, which will enable full next-token prediction capacity thanks to the internal similarity mechanisms and the solidity of the anchor bank, formed in steel.

# 2 additional epochs, 1M samples run