Kquant03 committed on
Commit a1f8505 · 1 Parent(s): 8007862

Update README.md

Files changed (1)
  1. README.md +6 -1
README.md CHANGED
@@ -43,4 +43,9 @@ If all our tokens are sent to just a few popular experts, that will make trainin
 
 
  ## "Wait...but you called this a frankenMoE?"
- The difference between MoE and "frankenMoE" lies in the fact that the router layer in a model like the one on this repo is not trained simultaneously. There are rumors about someone developing a way for us to unscuff these frankenMoE models by training the router layer simultaneously. For now, frankenMoE remains psychotic...at least...until now.
+ The difference between MoE and "frankenMoE" lies in the fact that the router layer in a model like the one on this repo is not trained simultaneously. There are rumors about someone developing a way for us to unscuff these frankenMoE models by training the router layer simultaneously. For now, frankenMoE remains psychotic...at least...until now.
+
+ This model is probably the highest performing model on the site, but considering even I, the person who created it, only have 12 gigs of VRAM...only the truly insane will even be capable of controlling the Earth Render.
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6589d7e6586088fd2784a12c/xayrIkbnNRJ4WJbdhhRsP.png)
+ ## this response took about 2 and a half hours lol...