Standardize model card (template rollout)

31405cf verified 3 days ago

1.75 kB

base_model:
  - NousResearch/Hermes-3-Llama-3.1-8B
license: mit
pipeline_tag: text-generation
base_model_relation: finetune
library_name: transformers
tags:
  - nouscoder
  - sft

Support this work → · X · GitHub · REAP paper · Cerebras REAP

NousCoder-14B-SFT-Tools

SFT fine-tune of NousResearch/Hermes-3-Llama-3.1-8B.

At a glance


Base model	NousResearch/Hermes-3-Llama-3.1-8B
Format	SFT
Total params	14B
Active / token	—
Experts / layer	—
Layers	—
Hidden size	—
Context	—
On-disk size	1 GB

Which variant should I pick?

Variant	Format	Link
`NousCoder-14B-SFT`	SFT	link
`NousCoder-14B-SFT-Tools` (this)	SFT	link
`NousCoder-14B-Tools`	Tools	link

License & citation

License inherited from the base model.

@misc{lasby2025reap,
  title  = {REAP the Experts: Why Pruning Prevails for One-Shot MoE Compression},
  author = {Mike Lasby and Ivan Lazarevich and Nish Sinnadurai and Sean Lie and Yani Ioannou and Vithursan Thangarasa},
  year   = {2025}, eprint = {2510.13999}, archivePrefix = {arXiv}
}

0xSero
/

NousCoder-14B-SFT-Tools

NousCoder-14B-SFT-Tools

At a glance

Which variant should I pick?

License & citation

Sponsors