phi0112358's Collections

LLM: MoEs

GGUFs, conventional and k-quants – both without imatrix. This should be faster for CPU inference. Right now DeepSeek MoEs (Mixture of Experts)