kk497055 committed on
Commit a33bdc5 · verified · 1 parent: f744e49

Add model card

Files changed (1): README.md +53 -0
README.md ADDED
@@ -0,0 +1,53 @@
+ ---
+ license: apache-2.0
+ tags:
+ - chimera
+ - moe
+ - mixture-of-experts
+ - gguf
+ - klyrone
+ language:
+ - en
+ pipeline_tag: text-generation
+ ---
+
+ # Chimera 8x7B
+
+ **Chimera** is a Mixture-of-Experts language model developed by [Klyrone Tech](https://huggingface.co/kk497055), built using our proprietary **Amalgamation of Experts (AoE)** technique. Chimera features 8 specialized expert networks with top-2 routing, delivering strong instruction-following and reasoning capabilities with an efficient 12.9B active-parameter footprint.
+
+ ## Model Details
+
+ | | |
+ |---|---|
+ | **Architecture** | Mixture of Experts (MoE), 8 experts, top-2 routing |
+ | **Total Parameters** | 46.7B |
+ | **Active Parameters** | 12.9B per token |
+ | **Context Length** | 32,768 tokens |
+ | **Quantization** | Q5_K_M (GGUF) |
+ | **Developed by** | Klyrone Tech |
+
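The total and active counts in the table are consistent with a Mixtral-style layout: shared weights (attention, embeddings, router) plus 8 expert FFNs, of which only the top 2 run per token. A quick sanity check (the per-expert and shared sizes below are inferred from the table, not published figures):

```python
# Back-of-the-envelope check of the parameter counts above.
# Assumes: total  = shared + 8 * expert
#          active = shared + 2 * expert   (top-2 routing)
total_params = 46.7   # billions, all 8 experts counted
active_params = 12.9  # billions, shared weights + 2 experts

# Subtracting the two equations: total - active = 6 * expert
expert = (total_params - active_params) / 6   # per-expert FFN size
shared = total_params - 8 * expert            # shared (non-expert) weights

print(f"per-expert FFN ~ {expert:.2f}B, shared ~ {shared:.2f}B")
assert abs((shared + 2 * expert) - active_params) < 1e-9
```

This gives roughly 5.6B parameters per expert FFN and about 1.6B shared, matching the 12.9B active figure.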
+ ## Key Features
+
+ - **Efficient MoE Architecture** - Only 12.9B parameters active per forward pass despite 46.7B total, enabling fast inference
+ - **Specialized Expert Networks** - 8 expert FFN modules with learned routing for task-adaptive computation
+ - **Instruction-Tuned Experts** - Expert networks optimized for instruction following, code generation, and reasoning
+ - **Long Context** - Supports context windows of up to 32,768 tokens with RoPE positional encoding
+
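As a sketch of what top-2 routing means in practice, a learned router scores all 8 experts for each token, only the 2 highest-scoring expert FFNs actually run, and their outputs are blended with renormalized softmax weights. The dimensions and stand-in expert matrices below are illustrative only, not Chimera's actual architecture or weights:

```python
import numpy as np

# Toy top-2 MoE routing sketch (illustrative; not Chimera's real weights).
rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2

# Stand-in "expert FFNs": a single weight matrix each.
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.02  # learned gate

def moe_forward(x):
    """Route one token vector x through its top-2 experts."""
    logits = x @ router                      # one score per expert
    top = np.argsort(logits)[-top_k:]        # indices of the 2 best experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                             # renormalized softmax over the top-2
    # Only the selected experts execute: 2/8 of the expert compute per token.
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))

token = rng.standard_normal(d_model)
out = moe_forward(token)
print(out.shape)  # (16,)
```

This is why active parameters (12.9B) are far below total parameters (46.7B): the unselected 6 experts contribute nothing to a given token's forward pass.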
+ ## Amalgamation of Experts (AoE)
+
+ Chimera is built using our **AoE** technique, a novel approach to constructing high-quality MoE models by strategically assembling expert networks. AoE enables models that combine specialized capabilities from multiple training paradigms into a unified, coherent architecture.
+
+ ## Usage
+
+ ### With llama.cpp
+ ```bash
+ ./llama-cli -m Chimera-8x7B-Q5_K_M.gguf -p "Your prompt here" -n 500 -ngl 99
+ ```
+
+ ## Intended Use
+
+ Chimera is designed for general-purpose text generation, including conversational AI, code generation, reasoning, and instruction following.
+
+ ## License
+
+ Apache 2.0