# Newest: Prepping 12M Conceptual Captions BERT extractions, aka 36M extractions × 5 models

So around 180,000,000 total samples.
The dataset is going to be stored as .pt chunks because they load directly into VRAM nearly instantly in Colab, and the system operates on them faster than going through dataloaders.
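A minimal sketch of the chunk workflow described above, assuming each .pt file holds a dict of pre-stacked tensors (the file name, chunk size, and key names here are illustrative, not the repo's actual layout):

```python
import torch

# Hypothetical chunk: a dict of stacked BERT extractions for many samples.
# Real chunks would be far larger; this is a small stand-in.
chunk = {
    "embeddings": torch.randn(1000, 768),
    "input_ids": torch.randint(0, 30522, (1000, 128)),
}
torch.save(chunk, "chunk_0000.pt")

# Load the whole chunk straight onto the GPU in one call
# (falls back to CPU when no GPU is available, e.g. outside Colab).
device = "cuda" if torch.cuda.is_available() else "cpu"
loaded = torch.load("chunk_0000.pt", map_location=device)

# Tensors arrive already on `device`, so there are no per-batch
# host-to-device copies; plain slicing yields minibatches with no DataLoader.
batch = loaded["embeddings"][:256]
print(batch.shape, batch.device)
```

The design tradeoff is that one `torch.load` moves an entire chunk into (V)RAM up front, trading memory for throughput, whereas a `DataLoader` pays per-batch collation and transfer costs.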