Charlie81/LoRE

Charlie81 committed on Jul 12, 2025

Commit 2b4d259

1 Parent(s): e785830

change to constant strategy

Files changed (1)

myolmoe/modeling_myolmoe.py CHANGED

@@ -65,8 +65,8 @@ class OlmoeConfig(PretrainedConfig):
         small_expert_intermediate_ratio=64,
         small_expert_count=64,
         small_expert_sparsity_coef=0.1,
-        small_expert_strategy="increment",  # NEW
-        max_small_expert_count=64,         # NEW: total possible small experts
+        small_expert_strategy="constant",  # increment
+        max_small_expert_count=64,
         **kwargs,
     ):
         self.vocab_size = vocab_size
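
Note on the effect of this change: only the default value of `small_expert_strategy` moves from "increment" to "constant", so callers that pass the keyword explicitly are unaffected. The sketch below illustrates that with a hypothetical stand-in class, `SmallExpertSettings`, which copies the five keyword defaults visible in the diff. It is not the real `OlmoeConfig`, and the validation against `KNOWN_STRATEGIES` is an assumption added for illustration; the diff does not show what each strategy does or whether other values are accepted.

```python
from dataclasses import dataclass

# Assumption: only the two values that appear in this diff are listed here.
# The real model code may accept other strategies.
KNOWN_STRATEGIES = ("constant", "increment")


@dataclass
class SmallExpertSettings:
    """Hypothetical stand-in for the small-expert keyword arguments of OlmoeConfig."""

    small_expert_intermediate_ratio: int = 64
    small_expert_count: int = 64
    small_expert_sparsity_coef: float = 0.1
    small_expert_strategy: str = "constant"  # default after this commit
    max_small_expert_count: int = 64

    def __post_init__(self):
        # Illustrative check only; not taken from the original file.
        if self.small_expert_strategy not in KNOWN_STRATEGIES:
            raise ValueError(
                f"unknown small_expert_strategy: {self.small_expert_strategy!r}"
            )


# New default after the commit.
default_settings = SmallExpertSettings()

# Opting back into the previous default by passing the keyword explicitly.
legacy_settings = SmallExpertSettings(small_expert_strategy="increment")

print(default_settings.small_expert_strategy)
print(legacy_settings.small_expert_strategy)
```

Configs that were serialized with an explicit `small_expert_strategy` value would presumably keep that value when reloaded; only code relying on the implicit default picks up the new behavior.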