North-Mini-Code-1.0-43B-a5b

North-Mini-Code-1.0-43B-a5b is a 168-expert, top-14 sparse MoE expansion of CohereLabs/North-Mini-Code-1.0 for coding and agentic software-engineering workflows.

This upload is staged in an unfused Hugging Face-style MXFP4 layout. Expert weights are stored as per-expert tensors, each with an MXFP4 block tensor and a scale tensor, rather than as vLLM's fused w13/w2 tensors.
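To make the block-and-scale pairing concrete, here is a minimal NumPy sketch of MXFP4 dequantization as defined by the OCP Microscaling format: 32 FP4 (E2M1) values share one 8-bit power-of-two scale. The tensor shapes, the nibble order (first element in the low nibble), and the function name dequantize_mxfp4 are assumptions for illustration and have not been checked against this checkpoint.

```python
import numpy as np

# The 16 FP4 E2M1 code points; the high bit of each nibble is the sign.
FP4_VALUES = np.array(
    [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0,
     -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0],
    dtype=np.float32,
)


def dequantize_mxfp4(blocks: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Hypothetical helper.

    blocks: uint8, shape [..., n_blocks, 16], two FP4 codes per byte.
    scales: uint8, shape [..., n_blocks], E8M0 exponents with bias 127.
    Returns float32 with shape [..., n_blocks * 32].
    """
    lo = blocks & 0x0F   # assumed: even-indexed element lives in the low nibble
    hi = blocks >> 4     # assumed: odd-indexed element lives in the high nibble
    codes = np.stack([lo, hi], axis=-1).reshape(*blocks.shape[:-1], -1)
    values = FP4_VALUES[codes]                    # [..., n_blocks, 32]
    exponent = scales.astype(np.int32) - 127      # the NaN code 255 is not handled here
    out = np.ldexp(values, exponent[..., None])   # values * 2**exponent
    return out.reshape(*out.shape[:-2], -1)


# One block whose first byte packs the codes 1 (0.5) and 2 (1.0), scaled by 2**1.
demo_blocks = np.zeros((1, 1, 16), dtype=np.uint8)
demo_blocks[0, 0, 0] = 0x21
demo_scales = np.array([[128]], dtype=np.uint8)
demo = dequantize_mxfp4(demo_blocks, demo_scales)
```

A fused export would concatenate these per-expert tensors before serving; the arithmetic per block stays the same.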

Model Details

  • Architecture: Cohere2 MoE causal language model
  • Base model: CohereLabs/North-Mini-Code-1.0
  • Experts: 168 total, 14 active per token
  • Quantization: MXFP4 MoE expert weights
  • Layout: unfused per-expert MXFP4 safetensors index
  • Intended use: code generation, terminal workflows, and tool-use experiments
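As an illustration of what "168 total, 14 active per token" means, the sketch below picks the 14 highest-scoring experts for each token and normalizes their weights with a softmax over the selected logits. The normalization order and the function name route are assumptions; the actual Cohere2 MoE router may differ.

```python
import numpy as np

NUM_EXPERTS = 168  # from the model card
TOP_K = 14         # from the model card


def route(router_logits: np.ndarray, top_k: int = TOP_K):
    """Hypothetical router: [tokens, experts] -> (expert ids, weights), each [tokens, top_k]."""
    # Indices of the top_k largest logits per token (not sorted within the top_k).
    idx = np.argpartition(router_logits, -top_k, axis=-1)[:, -top_k:]
    top = np.take_along_axis(router_logits, idx, axis=-1)
    # Numerically stable softmax over the selected logits only (assumed ordering).
    top = top - top.max(axis=-1, keepdims=True)
    weights = np.exp(top)
    weights /= weights.sum(axis=-1, keepdims=True)
    return idx, weights


rng = np.random.default_rng(0)
logits = rng.standard_normal((4, NUM_EXPERTS))
ids, weights = route(logits)
```

With these numbers each token touches 14 of 168 experts, or one twelfth of the expert weights in a layer.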

Runtime Note

This layout is closer to the original Hugging Face checkpoint semantics than a vLLM-fused export. A generic runtime can load it only if it understands the per-expert MXFP4 weight_blocks and weight_scales tensors. For native vLLM serving, use a validated runtime conversion or a vLLM build that supports this unfused MXFP4 layout.
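One way to check which layout a download uses before choosing a runtime is to count tensor names in the safetensors index. The sketch below assumes the standard model.safetensors.index.json file with a weight_map key; the full tensor names in the example and the w13/w2 substring check are hypothetical and may not match a real fused export.

```python
import json
from collections import Counter
from pathlib import Path


def classify_tensor_names(names) -> Counter:
    """Hypothetical helper: bucket tensor names by their layout suffix."""
    counts = Counter()
    for name in names:
        if name.endswith("weight_blocks"):
            counts["weight_blocks"] += 1
        elif name.endswith("weight_scales"):
            counts["weight_scales"] += 1
        elif "w13" in name or "w2" in name:
            counts["fused"] += 1  # assumed naming for fused tensors
        else:
            counts["other"] += 1
    return counts


# Illustrative names only; real names in the checkpoint may differ.
example = classify_tensor_names([
    "model.layers.0.mlp.experts.0.gate_proj.weight_blocks",
    "model.layers.0.mlp.experts.0.gate_proj.weight_scales",
    "model.layers.0.self_attn.q_proj.weight",
])

index_path = Path("model.safetensors.index.json")
if index_path.exists():
    weight_map = json.loads(index_path.read_text())["weight_map"]
    print(classify_tensor_names(weight_map))
```

In an unfused upload the weight_blocks and weight_scales counts should match and the fused count should be zero.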

License

Released under the Apache 2.0 license, following the base model.

Safetensors: model size 41B params; tensor types BF16 and U8