npuw-gqa-test-model

Synthetic LLM test model for validating the GQA (grouped-query attention) path in NPUW (Intel NPU plugin). The weights are random, so the model is only suitable for plugin pipeline testing, not for evaluating inference quality.

Config

field	value
layers	12
hidden_size	256
num_heads	8
num_kv_heads	2 (GQA, n_rep=4)
head_dim	32
intermediate_size	1024
vocab_size	32000
ffn	swiglu
norm	rms
rope	half
weight	fp32
position_ids	2d
kv_cache	yes
tokenizer	Meta-Llama-3.1
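
The table implies a few derived shapes that are handy when checking the GQA path. The sketch below recomputes them from the listed values. The function names are hypothetical, and two points are assumptions rather than facts from the generator: that query heads are grouped contiguously onto KV heads (the usual repeat-interleave convention), and that the KV cache is stored in fp32 like the weights.

```python
# Values copied from the Config table.
NUM_LAYERS = 12
HIDDEN_SIZE = 256
NUM_HEADS = 8
NUM_KV_HEADS = 2
HEAD_DIM = 32
BYTES_PER_FP32 = 4  # assumption: KV cache uses the same precision as the weights


def gqa_summary():
    """Derive the GQA shapes implied by the config (illustrative helper)."""
    if NUM_HEADS % NUM_KV_HEADS != 0:
        raise ValueError("num_heads must be a multiple of num_kv_heads")
    n_rep = NUM_HEADS // NUM_KV_HEADS   # query heads sharing one KV head
    q_width = NUM_HEADS * HEAD_DIM      # output width of the Q projection
    kv_width = NUM_KV_HEADS * HEAD_DIM  # output width of each of K and V
    # K and V entries for one token, summed over all layers.
    kv_bytes_per_token = 2 * kv_width * BYTES_PER_FP32 * NUM_LAYERS
    return {
        "n_rep": n_rep,
        "q_width": q_width,
        "kv_width": kv_width,
        "kv_bytes_per_token": kv_bytes_per_token,
    }


def kv_head_for(query_head):
    """Map a query head index to its KV head (assumes contiguous grouping)."""
    if not 0 <= query_head < NUM_HEADS:
        raise IndexError("query head index out of range")
    return query_head // (NUM_HEADS // NUM_KV_HEADS)


if __name__ == "__main__":
    print(gqa_summary())
    print([kv_head_for(h) for h in range(NUM_HEADS)])
```

Under these assumptions the Q projection width equals hidden_size (8 x 32 = 256), while K and V are each 64 wide, which is the shape mismatch the GQA path has to handle.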

Validated backends

CPU, NPU (NPUW plain), NPU (HFA), NPU (Pyramid). All four compile and run.

Source

Generated with npuw_model_generator_demo from dylanneve1/openvino@gqa-fix:

npuw_model_generator_demo --type llm \
  -t Meta-Llama-3.1-8B-Instruct \
  -o out -n npuw-synth-gqa-llama-12L \
  --num-kv-heads 2 --num-layers 12
