A suite of multilingual MoE models with highly sparse architectures
- AIDC-AI/Marco-Nano-Base
  Text Generation • 8B • Updated • 264 • 13
- AIDC-AI/Marco-Mini-Base
  Text Generation • 17B • Updated • 275 • 6
- AIDC-AI/Marco-Mini-Global-Base
  Text Generation • 17B • Updated • 230 • 6
- AIDC-AI/Marco-Nano-Instruct
  Text Generation • 8B • Updated • 2.23k • 34