Qwen2-Audio Selected-Layer Codec Compression (q2 + Zstd)
This repository contains a custom compressed artifact derived from Qwen/Qwen2-Audio-7B-Instruct.
Method
Selected-layer codec compression over MLP/feed-forward style linear layers:
- mlp
- feed_forward
- up_proj
- down_proj
- gate_proj
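As a rough illustration of how such a name-based selection could work, the sketch below filters dotted module names against the patterns mlp, feed_forward, up_proj, down_proj, and gate_proj. The helper names (`SELECTED_PATTERNS`, `is_selected`, `select_layers`) and the matching rule (exact match on a dotted name component) are assumptions made for this example, not the export code used for this artifact.

```python
# Hypothetical helper: the patterns come from the list above, but the
# matching rule (exact match on a dotted-name component) is an assumption.
SELECTED_PATTERNS = ("mlp", "feed_forward", "up_proj", "down_proj", "gate_proj")


def is_selected(module_name: str) -> bool:
    """Return True if any dotted component of the name equals a pattern."""
    return any(part in SELECTED_PATTERNS for part in module_name.split("."))


def select_layers(module_names):
    """Keep only the module names that match a selected pattern."""
    return [name for name in module_names if is_selected(name)]


if __name__ == "__main__":
    names = [
        "model.layers.0.self_attn.q_proj",
        "model.layers.0.mlp.up_proj",
        "model.layers.0.mlp.down_proj",
    ]
    print(select_layers(names))
```

With PyTorch, the names would typically come from `model.named_modules()`, with a further check that each matched module is an `nn.Linear` instance.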
Compression setting:
- Quantization: q2
- Compression: Zstd
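To make the two stages concrete, here is a minimal sketch that assumes q2 means uniform 2-bit quantization (four levels) and that the packed codes are then passed through Zstd. The actual codec behind this artifact may differ, for example by using per-group scales or a different bit layout. All function names are hypothetical. The `zstandard` package is third-party, so the sketch falls back to `zlib` purely as a stand-in when it is not installed.

```python
import zlib

try:
    import zstandard  # third-party; optional for this sketch
except ImportError:
    zstandard = None


def quantize_q2(values):
    """Map floats to 2-bit codes (0..3) on a uniform grid over [min, max]."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 3 or 1.0  # fall back to 1.0 for constant input
    codes = [min(3, max(0, round((v - lo) / scale))) for v in values]
    return codes, lo, scale


def dequantize_q2(codes, lo, scale):
    """Reconstruct approximate floats from 2-bit codes."""
    return [lo + c * scale for c in codes]


def pack_codes(codes):
    """Pack four 2-bit codes into each byte, zero-padding the tail."""
    padded = list(codes) + [0] * (-len(codes) % 4)
    out = bytearray()
    for i in range(0, len(padded), 4):
        a, b, c, d = padded[i:i + 4]
        out.append(a | (b << 2) | (c << 4) | (d << 6))
    return bytes(out)


def unpack_codes(packed, count):
    """Inverse of pack_codes; `count` drops the padding codes."""
    codes = []
    for byte in packed:
        codes.extend((byte >> shift) & 3 for shift in (0, 2, 4, 6))
    return codes[:count]


def compress(data):
    """Zstd if available, otherwise zlib as a stand-in."""
    if zstandard is not None:
        return zstandard.ZstdCompressor(level=3).compress(data)
    return zlib.compress(data)


def decompress(blob):
    """Inverse of compress, using the same backend."""
    if zstandard is not None:
        return zstandard.ZstdDecompressor().decompress(blob)
    return zlib.decompress(blob)
```

A round trip would be `codes, lo, scale = quantize_q2(weights)`, then `blob = compress(pack_codes(codes))`, and on load `dequantize_q2(unpack_codes(decompress(blob), len(weights)), lo, scale)`. With a uniform grid, each reconstructed value differs from the original by at most half a quantization step.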
Files
- compressed_model.pt: serialized PyTorch model with custom CompressedLinear modules
- compression_metadata.json: metadata describing the compression setup
Important
This is not a standard Hugging Face from_pretrained() checkpoint.
Loading this artifact requires the custom Python code defining the CompressedLinear module and codec pipeline used during export.
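A loading routine might look like the sketch below. It assumes compressed_model.pt was written with `torch.save(model)` (a whole pickled module rather than a state dict) and that the export code lives in an importable module. The module name `compressed_linear` and the function `load_compressed_model` are placeholders for this example. Because pickle records the module path of each class, CompressedLinear generally has to be importable under the same path it had at export time.

```python
import importlib


def load_compressed_model(checkpoint_path, codec_module="compressed_linear"):
    """Load the pickled model after making the custom classes importable.

    `codec_module` is a hypothetical name; substitute the module that
    actually defines CompressedLinear in the export code.
    """
    try:
        importlib.import_module(codec_module)
    except ImportError as exc:
        raise RuntimeError(
            f"Cannot import {codec_module!r}; the code defining "
            "CompressedLinear must be importable before unpickling."
        ) from exc

    import torch  # imported lazily so the check above runs without PyTorch

    # weights_only=False unpickles full module objects and can execute
    # arbitrary code, so only use it on files from a trusted source.
    return torch.load(checkpoint_path, map_location="cpu", weights_only=False)
```

Checking for the codec module first turns an opaque unpickling failure into a clear error message about the missing custom code.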
Base model
Qwen/Qwen2-Audio-7B-Instruct
Notes
This artifact was produced for compression experiments on English -> Chinese speech translation.