Deep5201 committed on
Commit b0bc912 · verified · 1 Parent(s): 4998f9e

Add bf16 models for Gemma3 representative blocks

- `embed_scale.onnx`: Scaling of already-looked-up token embeddings (embedding lookup itself is external).
- `layer_block_0.onnx`: Representative full decoder block: input/pre-attention RMSNorm, self-attention, post-attention RMSNorm, first residual add, pre-feedforward RMSNorm, feedforward/MLP, post-feedforward RMSNorm, and final residual add. The full model has 18 such blocks.
- `final_norm.onnx`: Final RMSNorm layer before output projection.
- `lm_head.onnx`: Output projection from hidden states to vocabulary logits via a single large MatMul.
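
For orientation, the operations these four files cover can be sketched in NumPy. This is a minimal reference sketch, not the exported graphs: the `sqrt(hidden_size)` embedding scale and the `(1 + weight)` RMSNorm form are assumed from the usual Gemma convention, and the parameter names, the `attention` and `mlp` callables, and the toy shapes are all hypothetical.

```python
import numpy as np

def embed_scale(embeddings, hidden_size):
    # Assumed Gemma convention: scale looked-up embeddings by sqrt(hidden_size).
    return embeddings * np.sqrt(np.float32(hidden_size))

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm over the last axis; the (1 + weight) form is an assumption.
    variance = np.mean(np.square(x), axis=-1, keepdims=True)
    return x / np.sqrt(variance + eps) * (1.0 + weight)

def decoder_block(x, params, attention, mlp):
    # Op order follows the layer_block_0.onnx description above.
    h = rms_norm(x, params["input_norm"])
    h = attention(h)
    h = rms_norm(h, params["post_attn_norm"])
    x = x + h                                  # first residual add
    h = rms_norm(x, params["pre_ffn_norm"])
    h = mlp(h)
    h = rms_norm(h, params["post_ffn_norm"])
    return x + h                               # final residual add

def lm_head(hidden, vocab_matrix):
    # hidden: (seq, hidden); vocab_matrix: (vocab, hidden) -> (seq, vocab)
    return hidden @ vocab_matrix.T

# Toy shapes only; identity callables stand in for attention and the MLP.
seq_len, hidden_size, vocab_size = 2, 4, 8
zeros = np.zeros(hidden_size, dtype=np.float32)
params = {name: zeros for name in
          ("input_norm", "post_attn_norm", "pre_ffn_norm", "post_ffn_norm")}
tokens = np.ones((seq_len, hidden_size), dtype=np.float32)

x = embed_scale(tokens, hidden_size)
x = decoder_block(x, params, attention=lambda h: h, mlp=lambda h: h)
x = rms_norm(x, zeros)  # stands in for final_norm.onnx
logits = lm_head(x, np.ones((vocab_size, hidden_size), dtype=np.float32))
```

In the full model the block would run 18 times between the embedding scale and the final norm; the sketch calls it once to mirror the single representative block in this commit.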

Files changed (4)
  1. embed_scale.onnx +3 -0
  2. final_norm.onnx +3 -0
  3. layer_block_0.onnx +3 -0
  4. lm_head.onnx +3 -0
embed_scale.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fdccabff02e1c611b5aae92cf936d81a4f7685445837d0e66526ccdd671f3a27
+ size 492
final_norm.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ae38665f3d0c99d8c91ee394a5c92b456d0f54d44b0e863747eebddf1eba6d1e
+ size 2727
layer_block_0.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f948f80272fdb7f3cb085d307926f2abf1c459ee3d0e4f39e6591eb7d3b4d2fb
+ size 11177991
lm_head.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:25d652bcdd5dc75962e432606de879cdc44e67dc4b3ad36fb3ddff5245139f60
+ size 335544758
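
Each added file is a Git LFS pointer (version, oid, size) rather than the ONNX payload itself. A minimal sketch of reading such a pointer and sanity-checking the `lm_head.onnx` size against a bf16 weight matrix follows. The helper `parse_lfs_pointer` is hypothetical, and the 262144 x 640 shape is an assumption about the underlying Gemma3 configuration, not something stated in this commit.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "oid_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),  # payload size in bytes
    }

# Pointer contents for lm_head.onnx, copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:25d652bcdd5dc75962e432606de879cdc44e67dc4b3ad36fb3ddff5245139f60
size 335544758
"""
info = parse_lfs_pointer(pointer)

# Assumed shape (not stated in the commit): vocab 262144, hidden size 640.
VOCAB, HIDDEN, BF16_BYTES = 262144, 640, 2
weight_bytes = VOCAB * HIDDEN * BF16_BYTES
overhead = info["size"] - weight_bytes  # bytes left over for graph metadata

print(f"weights: {weight_bytes} bytes, remaining overhead: {overhead} bytes")
```

Under that assumed shape, the file size is almost entirely the projection weights at 2 bytes per element, with a few hundred bytes left for the ONNX graph definition.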