0xSero's picture
Standardize model card (template rollout)
31405cf verified
metadata
base_model:
  - NousResearch/Hermes-3-Llama-3.1-8B
license: mit
pipeline_tag: text-generation
base_model_relation: finetune
library_name: transformers
tags:
  - nouscoder
  - sft

Support this work → · X · GitHub · REAP paper · Cerebras REAP

NousCoder-14B-SFT-Tools

SFT fine-tune of NousResearch/Hermes-3-Llama-3.1-8B.

At a glance

Base model NousResearch/Hermes-3-Llama-3.1-8B
Format SFT
Total params 14B
Active / token
Experts / layer
Layers
Hidden size
Context
On-disk size 1 GB

Which variant should I pick?

Variant Format Link
NousCoder-14B-SFT SFT link
NousCoder-14B-SFT-Tools (this) SFT link
NousCoder-14B-Tools Tools link

License & citation

License inherited from the base model.

@misc{lasby2025reap,
  title  = {REAP the Experts: Why Pruning Prevails for One-Shot MoE Compression},
  author = {Mike Lasby and Ivan Lazarevich and Nish Sinnadurai and Sean Lie and Yani Ioannou and Vithursan Thangarasa},
  year   = {2025}, eprint = {2510.13999}, archivePrefix = {arXiv}
}

Sponsors

Made possible by NVIDIA · TNG Technology · Lambda · Prime Intellect · Hot Aisle.