npuw-gqa-test-model

Synthetic LLM test model for validating the GQA (grouped-query attention) path in NPUW (Intel NPU plugin). The weights are random, so the model is intended only for plugin pipeline testing, not for inference quality.

Config

field              value
layers             12
hidden_size        256
num_heads          8
num_kv_heads       2 (GQA, n_rep=4)
head_dim           32
intermediate_size  1024
vocab_size         32000
ffn                swiglu
norm               rms
rope               half
weight             fp32
position_ids       2d
kv_cache           yes
tokenizer          Meta-Llama-3.1
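
The GQA figures in the table follow from simple arithmetic, which the sketch below reproduces as a sanity check. It is illustrative only and is not part of the generator: the `GqaConfig` class and its method names are hypothetical, the query-to-KV head mapping assumes the common Llama-style convention (query head i reads KV head i // n_rep), and the per-token cache size assumes one K and one V tensor per layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GqaConfig:
    """Hypothetical container for the attention-related values in the table."""
    layers: int = 12
    hidden_size: int = 256
    num_heads: int = 8
    num_kv_heads: int = 2
    head_dim: int = 32

    @property
    def n_rep(self) -> int:
        # Number of query heads that share one KV head.
        if self.num_heads % self.num_kv_heads != 0:
            raise ValueError("num_heads must be a multiple of num_kv_heads")
        return self.num_heads // self.num_kv_heads

    @property
    def q_proj_dim(self) -> int:
        # Width of the query projection output.
        return self.num_heads * self.head_dim

    @property
    def kv_proj_dim(self) -> int:
        # Width of each of the key and value projection outputs.
        return self.num_kv_heads * self.head_dim

    def kv_head_for(self, q_head: int) -> int:
        # Assumed mapping: consecutive groups of n_rep query heads share a KV head.
        if not 0 <= q_head < self.num_heads:
            raise IndexError("q_head out of range")
        return q_head // self.n_rep

    def kv_cache_elems_per_token(self) -> int:
        # One K and one V vector of width kv_proj_dim per layer (assumption).
        return 2 * self.layers * self.kv_proj_dim


cfg = GqaConfig()
print("n_rep:", cfg.n_rep)
print("q_proj_dim:", cfg.q_proj_dim, "kv_proj_dim:", cfg.kv_proj_dim)
print("query head -> kv head:", [cfg.kv_head_for(h) for h in range(cfg.num_heads)])
print("KV cache elements per token:", cfg.kv_cache_elems_per_token())
```

With these values the query projection width equals hidden_size (8 × 32 = 256), while each of the K and V projections is only 64 wide, which is the reduction the GQA path is meant to exercise.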

Validated backends

CPU, NPU (NPUW plain), NPU (HFA), NPU (Pyramid). The model compiles and runs on all four.

Source

Generated with npuw_model_generator_demo from dylanneve1/openvino@gqa-fix:

npuw_model_generator_demo --type llm \
  -t Meta-Llama-3.1-8B-Instruct \
  -o out -n npuw-synth-gqa-llama-12L \
  --num-kv-heads 2 --num-layers 12