sarel committed on
Commit 4860051 · verified · 1 Parent(s): 77936e2

Update README.md

Files changed (1): README.md (+3, −1)
README.md CHANGED
@@ -16,6 +16,8 @@ base_model: nvidia/nemotron-3-nano-30b-base
  pipeline_tag: text-generation
  ---
 
+ ![image](https://cdn-uploads.huggingface.co/production/uploads/60a75f5523ce37179774a20b/hJ8gE5j7w3Frnf4xgGcjq.png)
+
  # 🛡️ HEBATRON: Hebrew-Specialized Mamba2-MoE
 
  HEBATRON is a state-of-the-art, high-performance language model specialized for the Hebrew language. Developed through a collaboration between **PwC Israel**, **MAFAT**, and **AWS**, it introduces a unique hybrid architecture combining **Mamba2** and **Mixture-of-Experts (MoE)**.
@@ -33,7 +35,7 @@ HEBATRON is designed to handle the structural and morphological complexities of
  | **Architecture** | Hybrid Mamba2 (SSM) + Sparse MoE |
  | **Total Parameters** | 31.6B |
  | **Active Parameters** | ~3B per token |
- | **Context Window** | 65,536 (64k) tokens |
+ | **Context Window** | 8096 tokens |
  | **Hardware** | NVIDIA Blackwell (B300) & H200 GPUs |
  | **Precision** | FP8 Mixed-Precision |
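
The substantive change in this commit is the advertised context window (65,536 → 8096 tokens). A minimal sketch for checking what a published checkpoint actually reports, assuming the model is available on the Hub under some repo id (the `sarel/HEBATRON` id below is a hypothetical placeholder, and the config field that stores the context length varies by architecture; Mamba2/SSM hybrids do not always expose `max_position_embeddings`):

```python
# Minimal sketch: read the advertised context length from a Hub checkpoint.
from transformers import AutoConfig

REPO_ID = "sarel/HEBATRON"  # hypothetical placeholder, not a confirmed repo id

config = AutoConfig.from_pretrained(REPO_ID, trust_remote_code=True)

# Different architectures store the context length under different names,
# so probe the common candidates instead of assuming one field exists.
for field in ("max_position_embeddings", "n_positions", "max_seq_len"):
    value = getattr(config, field, None)
    if value is not None:
        print(f"{field} = {value}")
        break
else:
    print("No standard context-length field found; inspect config.to_dict()")
```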
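The spec table's "Total" vs. "Active" parameter counts follow from sparse MoE routing: per token, only the shared weights plus the top-k selected experts are evaluated. A back-of-envelope sketch using only the two numbers from the table; the shared/expert split and the routing configuration below are hypothetical, chosen only to show how such a split could reproduce the ~3B figure:

```python
# Back-of-envelope check of the spec table's parameter counts.
total_params  = 31.6e9  # "Total Parameters" from the table
active_params = 3.0e9   # "Active Parameters" (~3B per token)

print(f"Active fraction per token: {active_params / total_params:.1%}")  # ~9.5%

# With top-k expert routing, active ≈ shared + (k / num_experts) * expert
# weights. The split below is purely hypothetical, for illustration only.
shared_params = 1.0e9                  # hypothetical non-expert (SSM) weights
expert_params = total_params - shared_params
k, num_experts = 2, 32                 # hypothetical routing configuration

approx_active = shared_params + (k / num_experts) * expert_params
print(f"Implied active parameters (top-{k} of {num_experts} experts): "
      f"{approx_active / 1e9:.1f}B")   # ~2.9B, close to the table's ~3B
```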