Add bf16 models for Gemma3 representative blocks
- `embed_scale.onnx`: Scaling of already-looked-up token embeddings (the embedding lookup itself is external).
- `layer_block_0.onnx`: Representative full decoder block: input/pre-attention RMSNorm, self-attention, post-attention RMSNorm, first residual add, pre-feedforward RMSNorm, feedforward/MLP, post-feedforward RMSNorm, and final residual add. The full model has 18 such blocks.
- `final_norm.onnx`: Final RMSNorm layer before output projection.
- `lm_head.onnx`: Output projection to logits via a large MatMul.
- embed_scale.onnx +3 -0
- final_norm.onnx +3 -0
- layer_block_0.onnx +3 -0
- lm_head.onnx +3 -0
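The component order described in the commit message can be sketched in plain Python for a single token vector. This is only an illustration under stated assumptions, not the exported graphs: `attention` and `mlp` are hypothetical stand-in callables, the `sqrt(hidden)` embedding scale factor and the plain multiplicative RMSNorm gain are assumptions (the commit does not state either), and all function and key names are made up for this sketch.

```python
import math
from typing import Callable, Dict, List

Vector = List[float]


def rms_norm(x: Vector, gain: Vector, eps: float = 1e-6) -> Vector:
    """Divide by the root-mean-square of x, then apply a per-channel gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * g for v, g in zip(x, gain)]


def residual_add(a: Vector, b: Vector) -> Vector:
    return [u + v for u, v in zip(a, b)]


def decoder_block(
    x: Vector,
    norms: Dict[str, Vector],
    attention: Callable[[Vector], Vector],  # hypothetical stand-in
    mlp: Callable[[Vector], Vector],        # hypothetical stand-in
) -> Vector:
    """One block in the order listed for layer_block_0.onnx."""
    h = rms_norm(x, norms["input"])            # input/pre-attention RMSNorm
    h = attention(h)                           # self-attention
    h = rms_norm(h, norms["post_attention"])   # post-attention RMSNorm
    x = residual_add(x, h)                     # first residual add
    h = rms_norm(x, norms["pre_feedforward"])  # pre-feedforward RMSNorm
    h = mlp(h)                                 # feedforward/MLP
    h = rms_norm(h, norms["post_feedforward"]) # post-feedforward RMSNorm
    return residual_add(x, h)                  # final residual add


def forward(
    embedding: Vector,
    block_norms: List[Dict[str, Vector]],
    final_gain: Vector,
    lm_head: List[Vector],
    attention: Callable[[Vector], Vector],
    mlp: Callable[[Vector], Vector],
) -> Vector:
    """embed_scale -> N decoder blocks -> final_norm -> lm_head."""
    # Assumption: the scale factor is sqrt(hidden size).
    scale = math.sqrt(len(embedding))
    x = [v * scale for v in embedding]
    for norms in block_norms:
        x = decoder_block(x, norms, attention, mlp)
    x = rms_norm(x, final_gain)
    # Output projection: one dot product per vocabulary row.
    return [sum(w * v for w, v in zip(row, x)) for row in lm_head]
```

Real self-attention mixes information across token positions, so the per-vector `attention` callable here only marks where that step sits in the block.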
embed_scale.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fdccabff02e1c611b5aae92cf936d81a4f7685445837d0e66526ccdd671f3a27
+size 492
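Each added file is a Git LFS pointer rather than the ONNX payload itself: three `key value` lines giving the spec version, the object ID, and the payload size in bytes. A minimal sketch of reading such a pointer and checking a downloaded payload against it follows; the function names are hypothetical, and the check assumes the `sha256` hash method shown in these pointers.

```python
import hashlib
from typing import Dict, Union

PointerInfo = Dict[str, Union[str, int]]


def parse_lfs_pointer(text: str) -> PointerInfo:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form '<hash-method>:<hex digest>'.
    method, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_method": method,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(payload: bytes, pointer: PointerInfo) -> bool:
    """Check payload length and SHA-256 digest against the pointer.

    Assumes the pointer's hash method is sha256.
    """
    return (
        len(payload) == pointer["size"]
        and hashlib.sha256(payload).hexdigest() == pointer["digest"]
    )


# Pointer text copied from the embed_scale.onnx diff.
EMBED_SCALE_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:fdccabff02e1c611b5aae92cf936d81a4f7685445837d0e66526ccdd671f3a27
size 492
"""

info = parse_lfs_pointer(EMBED_SCALE_POINTER)
```

Comparing both the size and the digest catches truncated downloads cheaply before hashing matters, and catches corrupted content when the sizes happen to agree.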
final_norm.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ae38665f3d0c99d8c91ee394a5c92b456d0f54d44b0e863747eebddf1eba6d1e
+size 2727
layer_block_0.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f948f80272fdb7f3cb085d307926f2abf1c459ee3d0e4f39e6591eb7d3b4d2fb
+size 11177991
lm_head.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:25d652bcdd5dc75962e432606de879cdc44e67dc4b3ad36fb3ddff5245139f60
+size 335544758
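The pointer sizes give a rough upper bound on how many bf16 values each file can hold, since one bf16 value occupies 2 bytes. The sketch below does that arithmetic. It assumes the weights are stored inline in each `.onnx` file rather than as external data, and the vocabulary size of 262144 and hidden size of 640 are assumptions for illustration only; neither figure appears in this commit.

```python
BYTES_PER_BF16 = 2

# Sizes in bytes, copied from the LFS pointers in this commit.
SIZES = {
    "embed_scale.onnx": 492,
    "final_norm.onnx": 2727,
    "layer_block_0.onnx": 11177991,
    "lm_head.onnx": 335544758,
}


def max_bf16_elements(size_bytes: int) -> int:
    """Upper bound on bf16 values in a file (ignores graph metadata)."""
    return size_bytes // BYTES_PER_BF16


# Assumed shape of the lm_head weight; NOT stated in the commit.
ASSUMED_VOCAB = 262144
ASSUMED_HIDDEN = 640

lm_head_weight_bytes = ASSUMED_VOCAB * ASSUMED_HIDDEN * BYTES_PER_BF16
# Whatever remains would be graph structure and metadata.
lm_head_overhead = SIZES["lm_head.onnx"] - lm_head_weight_bytes

# Rough payload for all decoder blocks, assuming the 18 blocks mentioned
# in the commit message are each about the size of layer_block_0.onnx.
all_blocks_bytes = 18 * SIZES["layer_block_0.onnx"]

for name, size in SIZES.items():
    print(f"{name}: at most {max_bf16_elements(size):,} bf16 values")
print(f"lm_head overhead under the assumed shape: {lm_head_overhead} bytes")
print(f"18 blocks at this size: {all_blocks_bytes:,} bytes")
```

Under those assumed dimensions the weight matrix alone would account for all but a few hundred bytes of `lm_head.onnx`, which is at least consistent with a single large bf16 MatMul weight stored inline.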