Add bf16 models for Gemma3 representative blocks

- `embed_scale.onnx`: Scales token embeddings that have already been looked up (the embedding lookup itself is external to this graph).
- `layer_block_0.onnx`: Representative full decoder block: input/pre-attention RMSNorm, self-attention, post-attention RMSNorm, first residual add, pre-feedforward RMSNorm, feedforward/MLP, post-feedforward RMSNorm, and final residual add. The full model has 18 such blocks.
- `final_norm.onnx`: Final RMSNorm layer before the output projection.
- `lm_head.onnx`: Output projection to logits via a large MatMul.
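
For orientation, the operation order listed for `layer_block_0.onnx` can be sketched in plain Python. This is only an illustration of the ordering, not the exported graph: `attn` and `mlp` are hypothetical placeholder callables, the weight-dictionary keys are made-up names, plain lists of floats stand in for bf16 tensors, and `rms_norm` is a generic RMSNorm that may differ in detail (epsilon, weight offset) from what the ONNX files actually compute.

```python
import math

def rms_norm(vec, weight, eps=1e-6):
    """Generic RMSNorm over one token vector (assumed form, not read from the ONNX graph)."""
    rms = math.sqrt(sum(v * v for v in vec) / len(vec) + eps)
    return [v / rms * w for v, w in zip(vec, weight)]

def norm_tokens(seq, weight):
    """Apply RMSNorm independently to every token vector in the sequence."""
    return [rms_norm(vec, weight) for vec in seq]

def residual_add(a, b):
    """Element-wise sum of two sequences of token vectors."""
    return [[p + q for p, q in zip(u, v)] for u, v in zip(a, b)]

def decoder_block(seq, attn, mlp, w):
    """Order of operations as described for layer_block_0.onnx.

    seq  : list of token vectors
    attn : placeholder callable standing in for self-attention
    mlp  : placeholder callable standing in for the feedforward/MLP
    w    : dict of norm weights (hypothetical key names)
    """
    # Attention half: pre-norm -> attention -> post-norm -> first residual add.
    h = norm_tokens(seq, w["input_norm"])
    h = attn(h)
    h = norm_tokens(h, w["post_attn_norm"])
    seq = residual_add(seq, h)

    # Feedforward half: pre-norm -> MLP -> post-norm -> final residual add.
    h = norm_tokens(seq, w["pre_ffn_norm"])
    h = mlp(h)
    h = norm_tokens(h, w["post_ffn_norm"])
    return residual_add(seq, h)
```

With sublayers that return zeros, both residual branches vanish and the block returns its input unchanged, which is a quick sanity check on the residual wiring.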

Files changed (4)

embed_scale.onnx +3 -0
final_norm.onnx +3 -0
layer_block_0.onnx +3 -0
lm_head.onnx +3 -0

embed_scale.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fdccabff02e1c611b5aae92cf936d81a4f7685445837d0e66526ccdd671f3a27
+size 492

final_norm.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ae38665f3d0c99d8c91ee394a5c92b456d0f54d44b0e863747eebddf1eba6d1e
+size 2727

layer_block_0.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f948f80272fdb7f3cb085d307926f2abf1c459ee3d0e4f39e6591eb7d3b4d2fb
+size 11177991

lm_head.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:25d652bcdd5dc75962e432606de879cdc44e67dc4b3ad36fb3ddff5245139f60
+size 335544758
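
Each added file in this diff is a Git LFS pointer (three `key value` lines: `version`, `oid`, `size`) rather than the ONNX payload itself. A minimal sketch of parsing such a pointer and checking a downloaded payload against it might look like the following. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any Git LFS tooling; the example pointer text is copied from the `embed_scale.onnx` hunk with the diff `+` prefixes kept.

```python
import hashlib

def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        # Tolerate blank lines and a leading '+' from a pasted diff hunk.
        line = line.strip().lstrip("+")
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

def matches_pointer(pointer, data):
    """Return True if the payload bytes have the size and SHA-256 digest the pointer records."""
    if pointer["hash_algo"] != "sha256":
        raise ValueError("only sha256 oids are handled in this sketch")
    return (len(data) == pointer["size"]
            and hashlib.sha256(data).hexdigest() == pointer["digest"])

# Pointer text as it appears in the embed_scale.onnx hunk of this diff.
EXAMPLE = """\
+version https://git-lfs.github.com/spec/v1
+oid sha256:fdccabff02e1c611b5aae92cf936d81a4f7685445837d0e66526ccdd671f3a27
+size 492
"""

pointer = parse_lfs_pointer(EXAMPLE)
print(pointer["hash_algo"], pointer["size"])
```

Calling `matches_pointer(pointer, payload_bytes)` on the bytes of a fetched file is one way to confirm that the download corresponds to the pointer committed here.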