Qwen2-Audio Selected-Layer Codec Compression (q3 + Zstd)
This repository contains a custom compressed artifact derived from Qwen/Qwen2-Audio-7B-Instruct.
Method
Selected-layer codec compression over MLP/feed-forward style linear layers:
- mlp
- feed_forward
- up_proj
- down_proj
- gate_proj
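As a rough illustration of how layer selection by these names might work, the sketch below filters module names by keyword. The substring-matching rule, the helper names `is_selected` and `select_layers`, and the example module names are all assumptions made for illustration; the actual export code may select layers differently.

```python
# Keywords taken from the list above. Matching them as substrings of the
# module name is an ASSUMPTION, not the confirmed export logic.
TARGET_KEYWORDS = ("mlp", "feed_forward", "up_proj", "down_proj", "gate_proj")


def is_selected(module_name: str) -> bool:
    """Return True if the module name contains any target keyword."""
    return any(keyword in module_name for keyword in TARGET_KEYWORDS)


def select_layers(module_names):
    """Keep only the module names that match a target keyword."""
    return [name for name in module_names if is_selected(name)]


# Hypothetical module names in the style of a decoder block.
example_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.mlp.gate_proj",
    "model.layers.0.mlp.up_proj",
    "model.layers.0.mlp.down_proj",
    "model.layers.0.input_layernorm",
]

selected = select_layers(example_names)
for name in selected:
    print(name)
```

In a real pipeline the filter would presumably also check that the matched module is a linear layer before replacing it, since only linear layers are compressed.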
Compression settings:
- Quantization: q3
- Compression: Zstd
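The two settings above can be pictured as a two-stage codec: quantize the weights to a small number of levels, then entropy-compress the resulting codes. The sketch below assumes that q3 means 3-bit (8-level) uniform quantization with one scale per tensor, which this card does not confirm. The function names and the byte layout are hypothetical, and the code falls back to `zlib` when the third-party `zstandard` package is not installed.

```python
import struct
import zlib

try:
    import zstandard  # third-party Zstd bindings, used if installed

    def _compress(data: bytes) -> bytes:
        return zstandard.ZstdCompressor(level=3).compress(data)

    def _decompress(data: bytes) -> bytes:
        return zstandard.ZstdDecompressor().decompress(data)
except ImportError:
    # Fallback so the sketch still runs; this is zlib, NOT Zstd.
    _compress = zlib.compress
    _decompress = zlib.decompress

LEVELS = 8  # ASSUMPTION: "q3" = 3 bits = 8 quantization levels
HEADER = "<Idd"  # hypothetical layout: code count, minimum value, step size


def quantize_q3(weights):
    """Map floats to integer codes 0..7 using a uniform per-tensor grid."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (LEVELS - 1) or 1.0  # avoid a zero step for constant input
    codes = bytes(min(LEVELS - 1, max(0, round((w - lo) / scale))) for w in weights)
    return codes, lo, scale


def dequantize_q3(codes, lo, scale):
    """Map integer codes back to approximate float values."""
    return [lo + c * scale for c in codes]


def encode(weights):
    """Quantize, then compress the codes (stored one code per byte)."""
    codes, lo, scale = quantize_q3(weights)
    return struct.pack(HEADER, len(codes), lo, scale) + _compress(codes)


def decode(blob):
    """Decompress the codes and dequantize them back to floats."""
    count, lo, scale = struct.unpack_from(HEADER, blob)
    codes = _decompress(blob[struct.calcsize(HEADER):])
    if len(codes) != count:
        raise ValueError("corrupt payload: unexpected number of codes")
    return dequantize_q3(codes, lo, scale)


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
restored = decode(encode(weights))
```

The round trip is lossy: each restored value differs from the original by at most half a quantization step. Storing one code per byte wastes five bits before compression; a real codec would more likely pack the 3-bit codes or rely on the compressor to remove that redundancy.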
Files
- compressed_model.pt
- compression_metadata.json
Important
This is not a standard Hugging Face from_pretrained() checkpoint.
Loading requires the custom CompressedLinear module and codec pipeline used during export.
Base model
Qwen/Qwen2-Audio-7B-Instruct