zhangj1an/audiox_random

	---
	license: apache-2.0
	tags:
	- vllm-omni
	- audiox
	- test-fixture
	---

	# AudioX random / test fixture

	A tiny, randomly initialized bundle for [vLLM-Omni](https://github.com/vllm-project/vllm-omni)'s
	`AudioXPipeline`. The L1/L2 `core_model` CI tests
	(`tests/e2e/offline_inference/test_audiox_model.py`,
	`tests/e2e/online_serving/test_audiox_online.py`) use it to exercise the full
	pipeline (load → forward → trim → return numpy WAV) end-to-end without
	the cost of the real ~11 GB checkpoint.

	It follows the same `config.json` schema as
	[`zhangj1an/AudioX`](https://huggingface.co/zhangj1an/AudioX), but with much
	smaller transformer dimensions:

	- `embed_dim`: 1536 → 384
	- `depth`: 24 → 4
	- `num_heads`: 24 → 6
	- `gate_type_config.num_experts_per_modality`: 64 → 16
	- `gate_type_config.num_fusion_layers`: 8 → 2
	- `sample_size`: 485100 → 483328 (exactly 236 × 2048, so `latent_len = sample_size // 2048` is still 236,
	matching the transformer's RoPE precompute)
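
The arithmetic behind the list above can be checked with a short sketch. The values are copied from the list; the helper names `latent_len` and `head_dim` are hypothetical and not part of vLLM-Omni. The per-head width of 64 is derived from the listed numbers rather than stated by this card, and the shrink factor assumes that transformer block weights scale roughly with `depth * embed_dim**2`, which ignores the gating and any non-transformer components.

```python
# Values copied from the list above: (full AudioX, this fixture).
DIMS = {
    "embed_dim": (1536, 384),
    "depth": (24, 4),
    "num_heads": (24, 6),
    "sample_size": (485100, 483328),
}

# Divisor taken from the `sample_size // 2048` expression above.
DOWNSAMPLE = 2048


def latent_len(sample_size: int) -> int:
    """Number of latent frames for a given number of audio samples."""
    return sample_size // DOWNSAMPLE


def head_dim(embed_dim: int, num_heads: int) -> int:
    """Per-head width; embed_dim must split evenly across the heads."""
    if embed_dim % num_heads != 0:
        raise ValueError("embed_dim must be divisible by num_heads")
    return embed_dim // num_heads


full = {name: pair[0] for name, pair in DIMS.items()}
small = {name: pair[1] for name, pair in DIMS.items()}

# Both sample sizes floor to the same latent length.
assert latent_len(full["sample_size"]) == 236
assert latent_len(small["sample_size"]) == 236

# The fixture's sample_size is an exact multiple of the divisor.
assert small["sample_size"] == 236 * DOWNSAMPLE

# Derived observation: the per-head width is the same in both configs.
assert head_dim(full["embed_dim"], full["num_heads"]) == 64
assert head_dim(small["embed_dim"], small["num_heads"]) == 64

# Rough scaling assumption: block weights grow like depth * embed_dim**2.
shrink = (full["depth"] * full["embed_dim"] ** 2) // (
    small["depth"] * small["embed_dim"] ** 2
)
print(f"approximate transformer shrink factor: {shrink}x")
```

Under that scaling assumption the transformer stack is about 96 times smaller than the full model's, while the latent length the RoPE precompute depends on is unchanged.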

	All weights are random fp16 values, generated by instantiating `AudioXPipeline`
	with the small config and saving its `state_dict()` under the bundle's legacy
	naming convention. Do not use this bundle for actual generation: its outputs are noise.