
NousCoder-14B-SFT-Tools

SFT fine-tune of NousResearch/Hermes-3-Llama-3.1-8B.

At a glance

| Field | Value |
| --- | --- |
| Base model | NousResearch/Hermes-3-Llama-3.1-8B |
| Format | SFT |
| Total params | 14B |
| Active / token | |
| Experts / layer | |
| Layers | |
| Hidden size | |
| Context | |
| On-disk size | 1 GB |

Which variant should I pick?

| Variant | Format | Link |
| --- | --- | --- |
| NousCoder-14B-SFT | SFT | link |
| NousCoder-14B-SFT-Tools (this) | SFT | link |
| NousCoder-14B-Tools | Tools | link |

License & citation

The license is inherited from the base model, NousResearch/Hermes-3-Llama-3.1-8B.

@misc{lasby2025reap,
  title  = {REAP the Experts: Why Pruning Prevails for One-Shot MoE Compression},
  author = {Mike Lasby and Ivan Lazarevich and Nish Sinnadurai and Sean Lie and Yani Ioannou and Vithursan Thangarasa},
  year   = {2025},
  eprint = {2510.13999},
  archivePrefix = {arXiv}
}

Sponsors

Made possible by NVIDIA · TNG Technology · Lambda · Prime Intellect · Hot Aisle.
