AbstractPhil committed on
Commit
fa8374e
·
verified ·
1 Parent(s): 6710926

Update README.md

Files changed (1)
  1. README.md +8 -2
README.md CHANGED
@@ -20,18 +20,24 @@ base_model:
 
  # Newest: NLI head preliminary
 
- So the outcome is surprisingly good. Around 75% accuracy give or take for NLI with the prototype conv5d that I'm working with.
+ So the outcome is surprisingly good. Around 75-76% accuracy give or take for NLI with the prototype conv5d that I'm working with.
 
  It's not a true conv5d, more an accumulator meant to encapsulate the necessary behavioral implications of every stretch that implicates the potential for a conv5d.
  This structure when paired with structural geometry defeats the MLP in terms of overfitting.
 
  MLP managed to overfit the model at around 71% or so, to nearly 95% training accuracy.
 
- Conv5d managed to preserve the bank's geometry instead of completely collapsing it into noise, allowing around 75% with 80% training accuracy.
+ Conv5d managed to preserve the bank's geometry instead of completely collapsing it into noise, allowing around 75-76% with 80% training accuracy.
 
  So the problem is still present. Without the bank the model reached around 64% or so, with the anchor bank the model reaches around 75% with geometric solidity but it's not
  enough to say the NLI head works yet.
 
+ As you can see either way, the model does eventually begin to overfit in it's current state. It's simply too small and predominantly distilled,
+ which means it will have problems no matter which task I attempt to teach this model.
+
+
+ ![image](https://cdn-uploads.huggingface.co/production/uploads/630cf55b15433862cfc9556f/Yudr9ZF_bWx-8xAbSqGDr.png)
+
  HOWEVER, it's enough to say that it can with more training. This is enough to continue for me.
 
  If I were to say unlock the model's weights and train ALL FIVE EXPERTS, this would be an arbitrary task. The system would learn it instantly.
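
The overfitting comparison in the README text can be made concrete as a train-minus-held-out gap. The sketch below is only an illustration built from the approximate figures quoted in the diff. It assumes the non-training numbers are held-out accuracy, takes 75.5 as the midpoint of the quoted 75-76%, and uses hypothetical names (`reported`, `generalization_gap`) that do not come from the repository.

```python
# Approximate figures quoted in the README text (percent accuracy).
# Names and layout are illustrative assumptions, not the author's code.
reported = {
    "mlp_head": {"train": 95.0, "val": 71.0},
    "conv5d_head": {"train": 80.0, "val": 75.5},  # midpoint of the quoted 75-76%
}


def generalization_gap(train_acc: float, val_acc: float) -> float:
    """Return training accuracy minus held-out accuracy, in percentage points."""
    return train_acc - val_acc


gaps = {
    name: generalization_gap(m["train"], m["val"])
    for name, m in reported.items()
}

for name, gap in gaps.items():
    print(f"{name}: gap = {gap:.1f} points")
```

On these rough numbers the MLP head shows a much larger gap than the conv5d accumulator, which is consistent with the README's claim that the conv5d structure overfits less, though both heads still show some gap.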