# npuw-synth-mha-llama-12L
Synthetic LLM test model, MHA variant (`num_kv_heads == num_heads`). Companion to npuw-gqa-test-model, used for isolating GQA-specific code paths on NPU.

Weights are random; this model is not intended for inference quality.
## Config
| field | value |
|---|---|
| layers | 12 |
| hidden_size | 256 |
| num_heads | 8 |
| num_kv_heads | 8 (MHA) |
| head_dim | 32 |
| intermediate_size | 1024 |
| vocab_size | 32000 |
| ffn | swiglu |
| norm | rms |
| rope | half |
| weight | fp32 |
| position_ids | 2d |
| kv_cache | yes |
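Because `num_kv_heads == num_heads`, the KV cache holds one K/V head per query head. A minimal sketch of the resulting per-layer KV-cache tensor shape, contrasted with a hypothetical GQA setting (the `num_kv_heads=2` value below is illustrative only, not part of this model's config); the `[batch, num_kv_heads, seq_len, head_dim]` layout is a typical convention, assumed here:

```python
def kv_cache_shape(batch, seq_len, num_kv_heads, head_dim):
    # Shape of one K (or V) cache tensor for a single layer,
    # assuming the common [batch, num_kv_heads, seq_len, head_dim] layout.
    return (batch, num_kv_heads, seq_len, head_dim)

# Values from the config table above.
num_heads, head_dim, hidden_size = 8, 32, 256
assert num_heads * head_dim == hidden_size  # holds for this model

# MHA (this model): every query head has its own K/V head.
mha = kv_cache_shape(1, 128, num_kv_heads=num_heads, head_dim=head_dim)
# A GQA companion would shrink only the KV-head axis, e.g. num_kv_heads=2:
gqa = kv_cache_shape(1, 128, num_kv_heads=2, head_dim=head_dim)
print(mha)  # (1, 8, 128, 32)
print(gqa)  # (1, 2, 128, 32)
```

This is the shape difference the GQA companion model exercises; everything else in the two configs stays identical.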
## Source
Generated with `npuw_model_generator_demo` from dylanneve1/openvino@gqa-fix:

```shell
npuw_model_generator_demo --type llm \
  -t Meta-Llama-3.1-8B-Instruct \
  -o out -n npuw-synth-mha-llama-12L \
  --num-layers 12
```