shibatch committed on
Commit
6ec8a17
·
verified ·
1 Parent(s): ad0e5ec

Upload README.md with huggingface_hub

  1. README.md +1 -1
README.md CHANGED
@@ -17,7 +17,7 @@ This repository provides an ultra-lightweight Mixtral model variant (a Mixture-o
 
  Following extensive long-context scaling evaluations, this asset has been calibrated to a **4,096 token context window (4k)** with an adjusted **RoPE base frequency (`rope_theta`) of 15,000.0** to prevent numerical saturation under FP32 precision boundaries while maintaining sharp localized attention coordinates.
 
- It is designed specifically for debugging custom inference engines (such as `vulformer`), and native tensor compilers against MoE-specific runtime features. These include Gating network weight allocation, token distribution/gathering (Scatter/Gather loops), and the weighted addition combining multiple independent expert outputs.
+ It is designed specifically for debugging custom inference engines, and native tensor compilers against MoE-specific runtime features. These include Gating network weight allocation, token distribution/gathering (Scatter/Gather loops), and the weighted addition combining multiple independent expert outputs.
 
  ---
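
The three MoE runtime features named in the changed line (gating weight allocation, scatter/gather of tokens, and the weighted addition of expert outputs) can be illustrated with a small reference routine to compare an engine's output against. This is a minimal sketch in plain Python, not code from this repository. The function name `moe_forward`, its arguments, and the default `top_k=2` are hypothetical. It also assumes the common Mixtral-style convention of taking a softmax over all router logits, keeping the top-k experts, and renormalising their weights to sum to 1.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(tokens, router_logits, experts, top_k=2):
    """Reference MoE layer (hypothetical helper, for illustration only).

    tokens:        list of token vectors (lists of floats)
    router_logits: per-token list of per-expert logits
    experts:       list of callables, each mapping a vector to a vector
    """
    num_experts = len(experts)

    # 1. Gating: choose top_k experts per token and renormalise their weights.
    #    assignments[e] collects (token_index, weight) pairs routed to expert e.
    assignments = [[] for _ in range(num_experts)]
    for t, logits in enumerate(router_logits):
        probs = softmax(logits)
        chosen = sorted(range(num_experts), key=lambda e: probs[e], reverse=True)[:top_k]
        norm = sum(probs[e] for e in chosen)
        for e in chosen:
            assignments[e].append((t, probs[e] / norm))

    # 2. Scatter: each expert processes only the tokens routed to it.
    # 3. Gather + weighted addition: accumulate w * expert(token) per token.
    out = [[0.0] * len(tok) for tok in tokens]
    for e, routed in enumerate(assignments):
        for t, w in routed:
            y = experts[e](tokens[t])
            for i, v in enumerate(y):
                out[t][i] += w * v
    return out
```

Looping expert by expert rather than token by token mirrors the scatter/gather structure most engines use, so intermediate values such as the per-expert token lists can be checked as well as the final output.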