kk497055 committed on
Commit 32d75d7 · verified · 1 parent: ef5ca9f

Update README.md

Files changed (1)
  1. README.md +92 -28
README.md CHANGED
@@ -1,53 +1,117 @@
  ---
  license: apache-2.0
  tags:
- - chimera
- - moe
- - mixture-of-experts
- - gguf
- - klyrone
- language:
- - en
  pipeline_tag: text-generation
  ---

  # Chimera 47B

- **Chimera** is a Mixture-of-Experts language model developed by [Klyrone F.Z.E](https://klyrone.com), built using our proprietary **Amalgamation of Experts (AoE)** technique. Chimera features 8 specialized expert networks with top-2 routing, delivering strong instruction-following and reasoning capabilities with an efficient 12.9B active parameter footprint.

- ## Model Details

  | | |
  |---|---|
- | **Architecture** | Mixture of Experts (MoE) - 8 experts, top-2 routing |
- | **Total Parameters** | 46.7B |
- | **Active Parameters** | 12.9B per token |
- | **Context Length** | 32,768 tokens |
- | **Quantization** | Q5_K_M (GGUF) |
- | **Developed by** | Klyrone F.Z.E. |

- ## Key Features

- - **Efficient MoE Architecture** - Only 12.9B parameters active per forward pass despite 46.7B total, enabling fast inference
- - **Specialized Expert Networks** - 8 expert FFN modules with learned routing for task-adaptive computation
- - **Instruction-Tuned Experts** - Expert networks optimized for instruction following, code generation, and reasoning
- - **Long Context** - Supports up to 32K token context windows with RoPE positional encoding

- ## Amalgamation of Experts (AoE)

- Chimera is built using our **AoE** technique, a novel approach to constructing high-quality MoE models by strategically assembling expert networks. AoE enables the creation of models that combine specialized capabilities from multiple training paradigms into a unified, coherent architecture.

  ## Usage

- ### With llama.cpp

  ```bash
- ./llama-cli -m Chimera-8x7B-Q5_K_M.gguf -p "Your prompt here" -n 500 -ngl 99
  ```

- ## Intended Use

- Chimera is designed for general-purpose text generation including conversational AI, code generation, reasoning, and instruction following.

- ## License

- Apache 2.0
  ---
+ language:
+ - en
  license: apache-2.0
  tags:
+ - mixtral
+ - moe
+ - mixture-of-experts
+ - merged
+ - chimera
+ - klyrone
+ - instruct
+ - text-generation
+ base_model:
+ - mistralai/Mixtral-8x7B-v0.1
+ - mistralai/Mixtral-8x7B-Instruct-v0.1
+ model_type: mixtral
  pipeline_tag: text-generation
+ library_name: transformers
+ inference: true
  ---

  # Chimera 47B

+ **Klyrone F.Z.E.** · March 2026 · Apache 2.0
+
+ Chimera 47B is a 46.7B parameter Mixture-of-Experts language model built using Klyrone's MoE assembly framework. It delivers instruction-following, code generation, and reasoning at 154 tokens/second on H200 hardware, with only 12.9B parameters active per token.
+
+ A technical paper detailing the methodology is forthcoming.
+
+ ---
+
+ ## Key Numbers

  | | |
  |---|---|
+ | Total Parameters | 46.7 B |
+ | Active / Token | 12.9 B |
+ | Architecture | MoE · 8 experts · top-2 routing |
+ | Context Length | 32,768 tokens |
+ | Generation Speed | 154 t/s · H200 |
+ | Prompt Processing | 878 t/s · H200 |
+ | Quantization | Q5_K_M · 5.69 BPW |
+ | File Size | 33.2 GB GGUF |
+ | License | Apache 2.0 |
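The quantization and file-size rows of the table are mutually consistent; a quick arithmetic check (decimal gigabytes assumed):

```python
# Sanity-check the model card: the GGUF file size follows from the
# total parameter count and the quant's bits-per-weight (BPW).
total_params = 46.7e9   # total parameters (from the table)
bpw = 5.69              # Q5_K_M bits per weight (from the table)

size_bytes = total_params * bpw / 8   # 8 bits per byte
size_gb = size_bytes / 1e9            # decimal gigabytes, as on the card

print(f"{size_gb:.1f} GB")  # → 33.2 GB
```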
+
+ ---
+
+ ## Capabilities

+ - ✅ Instruction following — multi-turn conversational coherence
+ - ✅ Code generation — correct, edge-case-aware output
+ - ✅ Creative writing — long-form prose and poetry
+ - ✅ Factual reasoning — physics, mathematics, general knowledge
+ - ✅ Consumer-grade deployment — fits accessible GPU budgets at Q5_K_M

+ > Formal benchmark results (MMLU, HellaSwag, ARC-Challenge, GSM8K) in progress.
+
+ ---

+ ## About the Approach

+ Klyrone's MoE assembly framework constructs high-performance models by composing expert sub-networks from compatible source models without full retraining. The approach preserves routing coherence while inheriting specialized capabilities from donor models.
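For readers new to MoE, the top-2 routing named on this card works roughly as follows; a minimal sketch in plain Python with made-up router logits (not Chimera's actual router code):

```python
import math

def top2_route(logits):
    """Pick the two highest-scoring experts for a token and
    softmax-normalize their weights (Mixtral-style top-2 routing)."""
    top2 = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:2]
    exps = [math.exp(logits[i]) for i in top2]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top2, exps)]

# One token's router logits over 8 experts (illustrative values).
logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.2]
routes = top2_route(logits)

# Only 2 of the 8 expert FFNs run for this token, and their outputs
# are mixed with these weights — which is why only ~12.9B of the
# 46.7B parameters are active per token.
print([i for i, _ in routes])  # → [1, 4]
assert abs(sum(w for _, w in routes) - 1.0) < 1e-9
```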
+
+ Full methodology will be published via arXiv. For enterprise licensing or research collaboration, contact **research@klyrone.com**.
+
+ ---

  ## Usage

+ ### llama.cpp
+
  ```bash
+ ./llama-cli \
+   -m Chimera-47B-Q5_K_M.gguf \
+   -p "You are a helpful assistant." \
+   --ctx-size 32768 \
+   -n 512
  ```
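The same GGUF can also be served over an OpenAI-compatible HTTP API with llama.cpp's `llama-server`; a hedged CLI sketch (port and layer-offload values are placeholders, not from the card):

```shell
# Launch the server with the quantized model.
./llama-server \
  -m Chimera-47B-Q5_K_M.gguf \
  --ctx-size 32768 \
  -ngl 99 \
  --port 8080

# Then query the OpenAI-style chat endpoint:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
```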

+ ### Transformers
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "klyrone/Chimera",
+     device_map="auto",
+     torch_dtype="auto",
+ )
+ tokenizer = AutoTokenizer.from_pretrained("klyrone/Chimera")
+ ```
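Since the card lists Mixtral-8x7B-Instruct among the base models, the chat format presumably follows Mixtral's `[INST]` convention; a hedged sketch of that prompt layout (an assumption — in practice, prefer `tokenizer.apply_chat_template`, which carries the repo's actual template):

```python
def mixtral_prompt(user_message: str) -> str:
    """Build a single-turn Mixtral-style instruct prompt.
    Assumption: Chimera inherits this template from its
    Mixtral-8x7B-Instruct base; verify against the tokenizer config."""
    return f"<s>[INST] {user_message} [/INST]"

prompt = mixtral_prompt("Explain top-2 MoE routing in one sentence.")
print(prompt)
# → <s>[INST] Explain top-2 MoE routing in one sentence. [/INST]
```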
+
+ ---
+
+ ## Limitations

+ - Router fine-tuning has not yet been applied; a short gate re-alignment is expected to yield marginal quality gains
+ - No independent safety evaluation conducted — not recommended for unsupervised public-facing deployment
+ - Benchmark results pending publication

+ ---
+
+ ## Citation
+
+ ```bibtex
+ @misc{chimera47b2026,
+   title        = {Chimera 47B},
+   author       = {{Klyrone F.Z.E.}},
+   year         = {2026},
+   howpublished = {\url{https://huggingface.co/klyrone/Chimera}}
+ }
+ ```
+
+ ---

+ *Chimera 47B · Klyrone F.Z.E. · Apache 2.0 · A technical paper on the methodology is forthcoming.*