npuw-synth-mha-llama-12L

Synthetic LLM test model: MHA variant (num_kv_heads == num_heads). Companion to npuw-gqa-test-model for isolating GQA-specific code paths on NPU.

Random weights; not intended for evaluating inference quality.

Config

| field             | value    |
|-------------------|----------|
| layers            | 12       |
| hidden_size       | 256      |
| num_heads         | 8        |
| num_kv_heads      | 8 (MHA)  |
| head_dim          | 32       |
| intermediate_size | 1024     |
| vocab_size        | 32000    |
| ffn               | swiglu   |
| norm              | rms      |
| rope              | half     |
| weight            | fp32     |
| position_ids      | 2d       |
| kv_cache          | yes      |
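The config values above are internally consistent, and the MHA condition has a direct cost implication: with num_kv_heads == num_heads, the KV cache is at its largest (GQA would shrink it by the heads/kv-heads ratio). A minimal sketch checking this, using only the numbers from the table (the sizing formula is the standard transformer KV-cache accounting, not something specific to this generator):

```python
# Config of npuw-synth-mha-llama-12L (values copied from the table above).
config = {
    "layers": 12,
    "hidden_size": 256,
    "num_heads": 8,
    "num_kv_heads": 8,  # == num_heads, hence MHA rather than GQA
    "head_dim": 32,
}

# Sanity check: head_dim should equal hidden_size / num_heads.
assert config["head_dim"] == config["hidden_size"] // config["num_heads"]

# Per-token KV-cache elements: 2 (K and V) * layers * num_kv_heads * head_dim.
# With num_kv_heads == num_heads this is the MHA maximum; a GQA variant with
# fewer KV heads would scale it down proportionally.
kv_elems_per_token = (
    2 * config["layers"] * config["num_kv_heads"] * config["head_dim"]
)
print(kv_elems_per_token)  # 6144 elements; at fp32 that is 24 KiB per token
```

For the companion GQA model, only num_kv_heads changes, so the same formula makes the cache-size difference between the two test models explicit.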

Source

Generated with npuw_model_generator_demo from dylanneve1/openvino@gqa-fix:

```shell
npuw_model_generator_demo --type llm \
  -t Meta-Llama-3.1-8B-Instruct \
  -o out -n npuw-synth-mha-llama-12L \
  --num-layers 12
```