Insert [your mom so fat] joke here
Wow this model is huge. Nice work though, I love the outputs. Looking forward to running this sometime in 2035.
Maybe some brave soul will distill this down to ~70B?
I thought the whole purpose was to make this big?
Is that a challenge?
I can cut it down for whatever task you want. Just tell me the category. You can see our other models on my profile
Cool, I assume your method is dataset calibration, similar to the REAP method?
Creative writing is the main focus here.
It's called unstructured sparsity.
Yes, we can do writing. Don't know the ETA though
We must replicate this on Kimi K2.5 and name it Kimi K3.5
lol >_>
But I don't have nearly enough compute for that, just a 5090, a good CPU, and loads of storage.
Nobodexistsontheinternet should make it.
You probably can if you do it sharded
Hmmm
It looks like he just merged models; I'm pruning them.