KronosLabs committed on
Commit c2b5b6a · verified · 1 Parent(s): 4bddfeb

Update README.md

Files changed (1)
  1. README.md +4 -0
README.md CHANGED
@@ -13,8 +13,12 @@ It was trained on closed source datasets containing synthetically generated and
 The model contains the following layout:
 
 Size: 80B Parameters, A3B (3 Billion Active)
+
 Depth: 48 layers
+
 Hybrid layout: 12 * (3 * (Gated DeltaNet -> MoE) -> 1 * (Gated Attention -> MoE))
+
 MoE: 512 experts, 10 activated experts, 1 shared expert
+
 Context Length: 262,144 native
 
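
The layout lines touched by this change can be cross-checked with a short sketch. It expands the "Hybrid layout" expression into a flat list and confirms that it matches the stated depth of 48. The constant and function names are illustrative and do not come from the model's code. The sketch assumes that each "(mixer -> MoE)" pair counts as one layer, and that the shared expert runs for every token alongside the 10 routed experts.

```python
# Illustrative sketch only: names here are hypothetical, not from the model's code.
# Assumption: each "(mixer -> MoE)" pair in the README counts as one layer.

BLOCK_REPEATS = 12        # outer "12 * (...)"
DELTANET_PER_BLOCK = 3    # "3 * (Gated DeltaNet -> MoE)"
ATTENTION_PER_BLOCK = 1   # "1 * (Gated Attention -> MoE)"

ROUTED_EXPERTS_ACTIVE = 10  # "10 activated experts"
SHARED_EXPERTS = 1          # "1 shared expert"


def build_layout():
    """Expand the hybrid layout expression into a flat list of layer names."""
    block = (
        ["Gated DeltaNet -> MoE"] * DELTANET_PER_BLOCK
        + ["Gated Attention -> MoE"] * ATTENTION_PER_BLOCK
    )
    return block * BLOCK_REPEATS


layers = build_layout()
depth = len(layers)
attention_layers = sum(1 for name in layers if name.startswith("Gated Attention"))
deltanet_layers = depth - attention_layers

# Assumption: the shared expert is always active in addition to the routed ones.
experts_per_token = ROUTED_EXPERTS_ACTIVE + SHARED_EXPERTS

print(f"depth={depth}, deltanet={deltanet_layers}, attention={attention_layers}")
print(f"experts evaluated per token per MoE layer: {experts_per_token}")
```

Under these assumptions the expansion gives 12 * (3 + 1) = 48 layers, which agrees with the "Depth: 48 layers" line, with three Gated DeltaNet layers for every Gated Attention layer.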