vikramp committed on
Commit 9497137 · verified · 1 Parent(s): 5e7dc60

Upload 12 files

Model:

Tiny_MoE_Config(
    name='Tiny_MoE',
    dtype=jax.numpy.bfloat16,
    param_dtype=jax.numpy.float32,
    block_size=2048,
    vocab_size=49152,
    n_layer=30,
    n_head=9,
    n_kv_head=3,
    n_embed=576,
    n_experts=8,
    mesh=None,
    top_k=2,
    load_factor=10.0,
    expert_weight_priority=False,
    load_balance_loss_coeff=0.01,
    z_loss_coeff=0.0001,
    n_mlp_hidden=1536,
    mlp_bias=False,
    attention_bias=False,
    moe_bias=False,
    ln_epsilon=1e-05,
    glu_activation='silu',
    sdpa_implementation='slow',
    rope_theta=0.0001,
    init_stddev=0.02,
    use_cache=False,
    glu_fc_kernel_sharding=(None,),
    glu_fc_bias_sharding=(None,),
    glu_gate_kernel_sharding=(None,),
    glu_gate_bias_sharding=(None,),
    glu_proj_kernel_sharding=(None,),
    glu_proj_bias_sharding=(None,),
    attn_wq_kernel_sharding=(None,),
    attn_wq_bias_sharding=(None,),
    attn_wkv_kernel_sharding=(None,),
    attn_wkv_bias_sharding=(None,),
    attn_wproj_kernel_sharding=(None,),
    attn_wproj_bias_sharding=(None,),
    embed_partition_spec=(None,),
    rmsnorm_partition_spec=(None,),
)
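
The attention fields in the config imply a grouped-query layout. The sketch below derives the per-head shapes from n_embed, n_head, and n_kv_head. It assumes the heads split the embedding width evenly, which is the usual convention but is not confirmed by this commit; the names head_dim, queries_per_kv, and kv_width are illustrative only.

```python
# Values copied from the Tiny_MoE_Config dump above.
n_embed, n_head, n_kv_head = 576, 9, 3

# Assumption: the embedding is split evenly across query heads.
head_dim, remainder = divmod(n_embed, n_head)
assert remainder == 0, "n_embed must be divisible by n_head"

# Grouped-query attention: several query heads share one key/value head.
assert n_head % n_kv_head == 0, "n_head must be divisible by n_kv_head"
queries_per_kv = n_head // n_kv_head

# Width of the key projection (and of the value projection) per token.
kv_width = n_kv_head * head_dim

print(head_dim, queries_per_kv, kv_width)
```

Under that assumption each head is 64 wide and three query heads share each key/value head.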
Parameter Count: 413,275,968
MOE (Sharded) Parameter Count: 318,504,960
Replicated Parameter Count: 94,771,008
Active Parameter Count: 174,397,248.0
% Active Parameters: 42.20
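
The counts above are mutually consistent if the active count is taken as the replicated parameters plus the top_k / n_experts share of the expert parameters. The sketch below reproduces the reported figures under that assumption; it is a consistency check on the printed numbers, not the training code's own accounting.

```python
# Figures copied from the commit message above.
total = 413_275_968
moe_sharded = 318_504_960
replicated = 94_771_008
n_experts, top_k = 8, 2

# The sharded (expert) and replicated parts should add up to the total.
assert moe_sharded + replicated == total

# Assumed accounting: every token uses all replicated parameters but only
# top_k of the n_experts experts. True division yields a float, which would
# explain the trailing ".0" in the reported active count.
active = replicated + moe_sharded * top_k / n_experts
pct_active = 100 * active / total

print(f"Active Parameter Count: {active:,}")
print(f"% Active Parameters: {pct_active:.2f}")
```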


Dataset:

https://huggingface.co/datasets/HuggingFaceTB/smollm-corpus

run_20250826_isander_dingo/checkpoint-61796.pt/_CHECKPOINT_METADATA ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:faa231a4b7f2bfde63b1bdf6b6407a97eaf178ada149cc5913f276c65058d41d
+ size 262
run_20250826_isander_dingo/checkpoint-61796.pt/_METADATA ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:98aebf8504f85b582606151da8cd48ad62a66f3297649fb0ed755b85d9941005
+ size 90611
run_20250826_isander_dingo/checkpoint-61796.pt/_sharding ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0791ae7bc69eb4a767aeebb1afa4b5c9a444d972a39459de906df07c19848b60
+ size 74440
run_20250826_isander_dingo/checkpoint-61796.pt/array_metadatas/process_0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d5edad4ecd6176cb98a973dabf0aa6ea404656cd5ddadeaf56d46b2e3a45dc46
+ size 36120
run_20250826_isander_dingo/checkpoint-61796.pt/d/c56ed386b7a53157c0dc7189c0af1e35 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5353971c881a5fc7b3c3a219bbe78880c55750bbc34670f19b5b8bc33ac49178
+ size 139227
run_20250826_isander_dingo/checkpoint-61796.pt/manifest.ocdbt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0f5e131205c37e0fd4ee235ce1ec4b118397153599f49e66f5cfcaa6b05b5600
+ size 120
run_20250826_isander_dingo/checkpoint-61796.pt/ocdbt.process_0/d/8498c3a155842619dd4839be55ccbec0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:98167ddeb42c81a1f5ca27e24a28b775b1cbf9d360bb7240fbca4b02eabd56ec
+ size 774043074
run_20250826_isander_dingo/checkpoint-61796.pt/ocdbt.process_0/d/95cf84baf91161a195594ef8368ff61b ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3cbe80ea3e43f138222cf8f78630f3304e28b4aaf9fa98f3f51cd3df1c9881c1
+ size 791
run_20250826_isander_dingo/checkpoint-61796.pt/ocdbt.process_0/d/c58e0b3b76eb0acd14c081b50d7cda41 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5fcf0be9a8a51755d1975c2ef265dc447a879ed7b82ba54427ce86b519226f11
+ size 754954129
run_20250826_isander_dingo/checkpoint-61796.pt/ocdbt.process_0/d/c5aa49e6919e24cd3565479bb5c7dfa4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:88097147aee865f0b17530f3203e1db7f8c1577494ea7943b9bb9c6413f5fab7
+ size 29155
run_20250826_isander_dingo/checkpoint-61796.pt/ocdbt.process_0/d/f331034c9eefdb605002a7430f33878a ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:90f96f9b66dc77141bebeb6b277d8e8bd5ddc16bde0400dc28251d8dd8e86e8b
+ size 200
run_20250826_isander_dingo/checkpoint-61796.pt/ocdbt.process_0/manifest.ocdbt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bb538ed60d39f06271ca07625fceae9544c8549a0596448e6a8c67f7e976ce50
+ size 323
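
Every file added in this commit is stored as a Git LFS pointer: three "key value" lines giving the spec version, the object's SHA-256 digest, and its size in bytes. The sketch below parses one such pointer. The helper parse_lfs_pointer is a hypothetical name introduced here for illustration, not part of any Git LFS tooling, and it handles only the three keys seen in this diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is a key, one space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content for the last file in this diff (diff "+" markers removed).
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:bb538ed60d39f06271ca07625fceae9544c8549a0596448e6a8c67f7e976ce50
size 323
"""

info = parse_lfs_pointer(pointer)
print(info["algo"], info["size"])
```

The size field is the byte length of the real object, so the two large entries under ocdbt.process_0/d/ (774043074 and 754954129 bytes) hold the bulk of the checkpoint data.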