0xSero committed on
Commit b6e280b · verified · 1 Parent(s): 26d6276

Standardize model card (template rollout)

Files changed (1)
  1. README.md +45 -24
README.md CHANGED
@@ -1,14 +1,43 @@
  ---
  library_name: transformers
- tags: []
- base_model: NousResearch/Hermes-3-Llama-3.1-8B
+ tags:
+ - nouscoder
+ - tools
+ base_model:
+ - NousResearch/Hermes-3-Llama-3.1-8B
+ license: mit
+ pipeline_tag: text-generation
+ base_model_relation: finetune
  ---
+
  > [!TIP]
- > Support this work: **[donate.sybilsolutions.ai](https://donate.sybilsolutions.ai)**
- >
- > REAP surfaces: [GLM](https://huggingface.co/spaces/0xSero/reap-glm-family) | [MiniMax](https://huggingface.co/spaces/0xSero/reap-minimax-family) | [Qwen](https://huggingface.co/spaces/0xSero/reap-qwen-family) | [Gemma](https://huggingface.co/spaces/0xSero/reap-gemma-family) | [Paper](https://arxiv.org/abs/2510.13999) | [Code](https://github.com/CerebrasResearch/reap) | [PR17](https://github.com/CerebrasResearch/reap/pull/17) | [Cerebras Collection](https://huggingface.co/collections/cerebras/cerebras-reap)
+ > **[Support this work ](https://donate.sybilsolutions.ai)** · [X](https://x.com/0xsero) · [GitHub](https://github.com/0xsero) · [REAP paper](https://arxiv.org/abs/2510.13999) · [Cerebras REAP](https://huggingface.co/collections/cerebras/cerebras-reap)
+
+ # NousCoder-14B-Tools
+
+ Tools fine-tune of [NousResearch/Hermes-3-Llama-3.1-8B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B).
+
+ ## At a glance
+
+ | | |
+ |---|---|
+ | Base model | [NousResearch/Hermes-3-Llama-3.1-8B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B) |
+ | Format | Tools |
+ | Total params | **14B** |
+ | Active / token | — |
+ | Experts / layer | — |
+ | Layers | — |
+ | Hidden size | — |
+ | Context | — |
+ | On-disk size | 1 GB |
 
- # Model Card for Model ID
+ ## Which variant should I pick?
+
+ | Variant | Format | Link |
+ |---|---|---|
+ | `NousCoder-14B-SFT` | SFT | [link](https://huggingface.co/0xSero/NousCoder-14B-SFT) |
+ | `NousCoder-14B-SFT-Tools` | SFT | [link](https://huggingface.co/0xSero/NousCoder-14B-SFT-Tools) |
+ | `NousCoder-14B-Tools` **(this)** | Tools | [link](https://huggingface.co/0xSero/NousCoder-14B-Tools) |
 
  <!-- Provide a quick summary of what the model is/does. -->
 
@@ -203,24 +232,16 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 
  [More Information Needed]
 
- ## Support
-
- If this work is useful, support Sybil Solutions here: [https://donate.sybilsolutions.ai](https://donate.sybilsolutions.ai)
-
+ ## License & citation
+ License inherited from the base model.
 
- <!-- SERO_MANAGED_TOP_LINKS_START -->
- ## Support and links
- - Donate: https://donate.sybilsolutions.ai
- - X: https://x.com/0xsero
- - GitHub: https://github.com/0xsero
- <!-- SERO_MANAGED_TOP_LINKS_END -->
+ ```bibtex
+ @misc{lasby2025reap,
+ title = {REAP the Experts: Why Pruning Prevails for One-Shot MoE Compression},
+ author = {Mike Lasby and Ivan Lazarevich and Nish Sinnadurai and Sean Lie and Yani Ioannou and Vithursan Thangarasa},
+ year = {2025}, eprint = {2510.13999}, archivePrefix = {arXiv}
+ }
+ ```
 
  ## Sponsors
-
- Thank you for the kind sponsors, wouldn't be possible without them:
-
- - Nvidia
- - TNG Technology
- - Lambda
- - Prime Intellect
- - HotAisle
+ Made possible by **NVIDIA · TNG Technology · Lambda · Prime Intellect · Hot Aisle**.
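
The metadata block this commit adds at the top of README.md is a flat YAML mapping: scalar `key: value` pairs plus two single-level lists (`tags`, `base_model`). As a rough illustration of how a reviewer might read those fields back when checking a template rollout, here is a minimal sketch using only the Python standard library. The names `FRONT_MATTER` and `parse_front_matter` are hypothetical and not part of the commit, and the parser handles only the subset of YAML seen in this diff; real tooling would more likely use a full YAML library.

```python
# Front matter copied from the added lines of the first hunk of this commit.
FRONT_MATTER = """\
---
library_name: transformers
tags:
- nouscoder
- tools
base_model:
- NousResearch/Hermes-3-Llama-3.1-8B
license: mit
pipeline_tag: text-generation
base_model_relation: finetune
---
"""


def parse_front_matter(text):
    """Parse a flat YAML subset: `key: value` scalars and `key:` + `- item` lists.

    Hypothetical helper for illustration only; it does not handle nesting,
    quoting, comments, or multi-line values.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("front matter must start with '---'")

    meta = {}
    current_key = None  # key of the list currently being filled, if any
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            return meta  # closing delimiter reached
        if not stripped:
            continue
        if stripped.startswith("- "):
            if current_key is None:
                raise ValueError("list item without a preceding key")
            meta[current_key].append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            raise ValueError(f"unrecognized line: {line!r}")
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value      # scalar entry
            current_key = None
        else:
            meta[key] = []         # start of a list entry
            current_key = key
    raise ValueError("front matter is not closed with '---'")


meta = parse_front_matter(FRONT_MATTER)
```

With `meta` in hand, a check such as `meta["base_model"]` or `meta["license"]` can be compared against the prose and tables further down the card.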