Update README.md
pipeline_tag: zero-shot-classification
---

# Updated - Spark works.

max-vit-goliath-spark is essentially a 300k-parameter ViT that reaches nearly the same accuracy as the larger model, and the features it produces are surprisingly robust and useful.

```python
'pentachora_spark': PentachoraConfig(
    dim=64, depth=5, heads=4, mlp_ratio=4.0,
    preserve_structure_until_layer=2,
    dropout_rate=0.0, drop_path_rate=0.0
),
```
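For context, here is a minimal stand-in for the config object above, plus a back-of-the-envelope check that these hyperparameters land in the stated ~300k-parameter range. The class definition and the arithmetic are assumptions for illustration, not the model's actual code:

```python
from dataclasses import dataclass

# Hypothetical stand-in for PentachoraConfig (field names taken from the
# snippet above; the real class in the training code may differ).
@dataclass
class PentachoraConfig:
    dim: int
    depth: int
    heads: int
    mlp_ratio: float
    preserve_structure_until_layer: int
    dropout_rate: float = 0.0
    drop_path_rate: float = 0.0

cfg = PentachoraConfig(dim=64, depth=5, heads=4, mlp_ratio=4.0,
                       preserve_structure_until_layer=2)

# Rough parameter count per transformer block, ignoring biases, norms,
# embeddings, and the classifier head:
attn = 4 * cfg.dim * cfg.dim                      # Q, K, V, output projections
mlp = int(2 * cfg.mlp_ratio * cfg.dim * cfg.dim)  # two MLP projections
total_blocks = cfg.depth * (attn + mlp)
print(total_blocks)  # 245760 -> embeddings and head plausibly bring this near 300k
```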

A 64-dim vocabulary is effectively carrying the entire ViT. It uses a particularly effective geometric attention.

The output produces effective image feature representations in geometric format.
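As a purely illustrative sketch of how such geometric feature vectors could drive the zero-shot-classification pipeline tag: score an image embedding against one anchor vector per class label by cosine similarity. The names, shapes, and random vectors below are stand-ins, not the model's real API; only the 64-dim size comes from the config above:

```python
import numpy as np

rng = np.random.default_rng(0)
class_anchors = rng.normal(size=(100, 64))  # e.g. one anchor per CIFAR-100 class
image_feature = rng.normal(size=64)         # would come from the model's output

def cosine_scores(feat, anchors):
    # Normalize both sides, then a matrix-vector product gives cosine similarity.
    feat = feat / np.linalg.norm(feat)
    anchors = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return anchors @ feat

scores = cosine_scores(image_feature, class_anchors)
predicted_class = int(np.argmax(scores))
```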

![](https://cdn-uploads.huggingface.co/production/uploads/64f99b207b2a96765fd78f2d/EQXth2WUbXomlLnyK4kWS.png)

```
Final Results:
Best Validation Accuracy: 54.15%
Final Train Loss: 2.1262
Final Val Loss: 3.6396
```
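One hedged reading of the numbers above: the spread between train and validation loss is sizeable, which may indicate the small model is overfitting CIFAR-100:

```python
# Train/validation loss gap, using the figures from the results block above.
train_loss, val_loss = 2.1262, 3.6396
gap = round(val_loss - train_loss, 4)
print(gap)  # 1.5134
```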

# Original post

Currently it's only a pickled early version at about 50% accuracy.

This one is a 12-layer, 8-head variation of max-vit-goliath that trained on a geometric vocab with CIFAR-100 using a specialized 5D format. It's WORKING - somewhat, but it's definitely nothing to write home about yet.