---
license: apache-2.0
tags:
- vllm-omni
- audiox
- test-fixture
---
# AudioX random / test fixture
A tiny **random-init** bundle of [vLLM-Omni](https://github.com/vllm-project/vllm-omni)'s
`AudioXPipeline`. Used by the L1/L2 `core_model` CI tests
(`tests/e2e/offline_inference/test_audiox_model.py`,
`tests/e2e/online_serving/test_audiox_online.py`) so they can verify the full
pipeline (load → forward → trim → return numpy WAV) end-to-end without paying
the cost of the real ~11 GB checkpoint.
It follows the same `config.json` schema as
[`zhangj1an/AudioX`](https://huggingface.co/zhangj1an/AudioX), but with much
smaller transformer dimensions:
- `embed_dim`: 1536 → 384
- `depth`: 24 → 4
- `num_heads`: 24 → 6
- `gate_type_config.num_experts_per_modality`: 64 → 16
- `gate_type_config.num_fusion_layers`: 8 → 2
- `sample_size`: 485100 → 483328 (still gives `latent_len = sample_size // 2048 = 236`,
  matching the transformer's RoPE precompute)
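
The arithmetic behind the `sample_size` change can be checked with a short sketch. The numbers are copied from the list above; the dictionary layout and the helper names `latent_len` and `head_dim` are illustrative and not part of vLLM-Omni. The head-dimension check assumes the usual multi-head convention `head_dim = embed_dim // num_heads`, which this README does not state.

```python
# Values copied from the list above; the dict layout is illustrative only.
FULL = {"embed_dim": 1536, "num_heads": 24, "sample_size": 485100}
TINY = {"embed_dim": 384, "num_heads": 6, "sample_size": 483328}

DOWNSAMPLE = 2048  # divisor from latent_len = sample_size // 2048


def latent_len(sample_size: int) -> int:
    """Number of latent frames for a given number of audio samples."""
    return sample_size // DOWNSAMPLE


def head_dim(cfg: dict) -> int:
    """Per-head width, ASSUMING head_dim = embed_dim // num_heads."""
    return cfg["embed_dim"] // cfg["num_heads"]


# Both sample sizes floor to the same latent length.
print(latent_len(FULL["sample_size"]), latent_len(TINY["sample_size"]))  # 236 236

# The fixture's sample_size is an exact multiple of 2048 (236 * 2048).
print(TINY["sample_size"] % DOWNSAMPLE)  # 0

# Under the assumption above, both configs share the same head width.
print(head_dim(FULL), head_dim(TINY))  # 64 64
```

If the head-dimension assumption holds, the fixture keeps a per-head width of 64, so only the model's width and depth shrink while the latent length stays the same.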

All weights are random fp16 values, generated by running `AudioXPipeline.__init__`
with the small config and dumping the resulting `state_dict()` under the bundle's
legacy naming convention. **Do not use for actual generation**: outputs are noise.