# npuw-gqa-test-model
Synthetic LLM test model for validating the GQA path in NPUW (the OpenVINO NPU plugin). Weights are random — the model has no inference quality and is intended only for exercising the plugin pipeline.
## Config
| Field | Value |
|---|---|
| layers | 12 |
| hidden_size | 256 |
| num_heads | 8 |
| num_kv_heads | 2 (GQA, n_rep=4) |
| head_dim | 32 |
| intermediate_size | 1024 |
| vocab_size | 32000 |
| ffn | SwiGLU |
| norm | RMSNorm |
| rope | half |
| weights | FP32 |
| position_ids | 2D |
| kv_cache | yes |
| tokenizer | Meta-Llama-3.1 |
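The GQA row above means keys and values are computed with only 2 heads, each shared by 4 of the 8 query heads. A minimal sketch of the usual repeat-KV formulation, using the dimensions from the table (names are illustrative, not taken from the generator):

```python
import numpy as np

# Dimensions from the config table above.
num_heads, num_kv_heads, head_dim = 8, 2, 32
n_rep = num_heads // num_kv_heads  # 4 query heads per KV head

seq_len = 5  # arbitrary example length
# K/V are produced with only num_kv_heads heads...
k = np.random.randn(num_kv_heads, seq_len, head_dim)
# ...then each KV head is repeated n_rep times along the head axis
# so the shapes line up with the 8 query heads before attention.
k_expanded = np.repeat(k, n_rep, axis=0)
assert k_expanded.shape == (num_heads, seq_len, head_dim)
```

This head-axis broadcast is the "GQA path" the model is meant to exercise in the plugin.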
## Validated backends

CPU, NPU (NPUW plain), NPU (HFA), NPU (Pyramid). All four configurations compile and run.
## Source

Generated with `npuw_model_generator_demo` from `dylanneve1/openvino@gqa-fix`:

```shell
npuw_model_generator_demo --type llm \
  -t Meta-Llama-3.1-8B-Instruct \
  -o out -n npuw-synth-gqa-llama-12L \
  --num-kv-heads 2 --num-layers 12
```