AbstractPhil committed
Commit e9e26aa · verified · 1 Parent(s): 1a69aa3

Update README.md

Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -4,6 +4,14 @@ datasets:
 - AbstractPhil/geometric-vocab
 pipeline_tag: zero-shot-classification
 ---
+# Formulas have been purified
+
+The newest vit_zana_nano training run has shown a very clean curve.
+
+I've begun training a much deeper zana, dubbed vit_zana_shaper. This model is 32 layers deep with an MLP ratio of 1 and 2 attention heads, resting at about 3.5 million params or so.
+
+Let's see how she fares.
+
 # I've had an epiphany. We don't NEED transformer layers in their current form.
 
 David's architecture already solved this need with high-efficiency multi-stage geometric mathematics.
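As a rough sanity check on the quoted size, the "32 layers, MLP ratio 1, 2 heads, ~3.5M params" description pins down everything except the hidden width. The sketch below estimates per-block parameters for a standard pre-norm transformer block; the hidden width `d=128` is an assumption (not stated in the commit) chosen because it lands near the quoted total:

```python
def transformer_param_estimate(depth: int, d: int, mlp_ratio: int) -> int:
    """Rough parameter count for `depth` pre-norm transformer blocks.

    Counts attention (QKV + output projection), the MLP, and the two
    LayerNorms, all with biases. Embeddings and the head are omitted,
    so this is a lower bound on the full model size.
    """
    attn = 4 * d * d + 4 * d                    # qkv (3d^2 + 3d) + out proj (d^2 + d)
    hidden = mlp_ratio * d
    mlp = d * hidden + hidden + hidden * d + d  # fc1 + fc2, with biases
    norms = 2 * 2 * d                           # two LayerNorms, weight + bias each
    return depth * (attn + mlp + norms)

# Hypothetical config matching the README's description; d=128 is assumed.
total = transformer_param_estimate(depth=32, d=128, mlp_ratio=1)
print(f"{total:,}")  # 3,186,688
```

The blocks alone come to roughly 3.2M; patch embedding, positional embeddings, and the classification head plausibly account for the remaining gap up to the quoted ~3.5M. Note the head count (2) does not affect the total, since QKV projection size is independent of how the width is split across heads.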