library_name: pytorch
tags:
  - audio
  - speech-translation
  - model-compression
  - custom-compression
license: other

Qwen2-Audio Selected-Layer Codec Compression (q2 + Zstd)

This repository contains a custom-compressed artifact derived from Qwen/Qwen2-Audio-7B-Instruct. Only selected layers are compressed; the rest of the model is left as in the base checkpoint.

Method

Codec compression is applied only to selected linear layers of the MLP/feed-forward type, identified by the following names:

mlp
feed_forward
up_proj
down_proj
gate_proj
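
As a rough illustration of how such a selection could work, the sketch below filters dotted module names against the names listed above. Substring matching is an assumption (the export code may match differently), and the example module names are hypothetical, not read from the actual model.

```python
# Name fragments taken from the list above.
SELECTED_PATTERNS = ("mlp", "feed_forward", "up_proj", "down_proj", "gate_proj")


def is_selected(module_name: str) -> bool:
    """Return True if the dotted module name contains any selected pattern.

    Assumption: plain substring matching; the real export code is not shown
    in this repository and may use a different rule.
    """
    return any(pattern in module_name for pattern in SELECTED_PATTERNS)


def select_layers(module_names):
    """Keep only the names that match one of the selected patterns."""
    return [name for name in module_names if is_selected(name)]


# Hypothetical module names, for illustration only.
example_names = [
    "language_model.model.layers.0.self_attn.q_proj",
    "language_model.model.layers.0.mlp.up_proj",
    "language_model.model.layers.0.mlp.down_proj",
    "audio_tower.layers.3.fc1",
]
selected = select_layers(example_names)
```

With a real PyTorch model, the names would typically come from `model.named_modules()`, combined with an `isinstance(module, torch.nn.Linear)` check so that only linear layers are touched.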

Compression setting:

Quantization: q2
Compression: Zstd
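
The exact meaning of "q2" is not documented here. The sketch below assumes it means 2-bit uniform quantization (four levels per weight) followed by byte-level compression of the packed codes. All function names are hypothetical, and the codec used for export may differ in grid, grouping, and packing order. If the third-party `zstandard` package is not installed, the sketch falls back to `zlib` purely as a stand-in.

```python
import zlib

try:
    import zstandard  # third-party Zstd binding

    def _compress(data: bytes) -> bytes:
        return zstandard.ZstdCompressor(level=3).compress(data)

    def _decompress(data: bytes) -> bytes:
        return zstandard.ZstdDecompressor().decompress(data)
except ImportError:
    # Stand-in only: zlib is NOT Zstd, it just keeps the sketch runnable.
    _compress, _decompress = zlib.compress, zlib.decompress


def quantize_q2(weights):
    """Map floats to 2-bit codes (0..3) on a uniform grid over [min, max]."""
    lo, hi = min(weights), max(weights)
    scale = ((hi - lo) / 3) or 1.0  # avoid a zero step for constant input
    codes = [min(3, max(0, round((w - lo) / scale))) for w in weights]
    return codes, lo, scale


def pack_codes(codes):
    """Pack four 2-bit codes per byte, zero-padding the tail."""
    padded = list(codes) + [0] * (-len(codes) % 4)
    out = bytearray()
    for i in range(0, len(padded), 4):
        a, b, c, d = padded[i:i + 4]
        out.append(a | (b << 2) | (c << 4) | (d << 6))
    return bytes(out)


def unpack_codes(data: bytes, count: int):
    """Inverse of pack_codes; `count` drops the padding codes."""
    codes = []
    for byte in data:
        codes.extend((byte >> shift) & 0b11 for shift in (0, 2, 4, 6))
    return codes[:count]


def dequantize_q2(codes, lo, scale):
    """Reconstruct approximate weights from codes and grid parameters."""
    return [lo + c * scale for c in codes]


weights = [-0.9, -0.4, 0.1, 0.9, 0.5, -0.8]
codes, lo, scale = quantize_q2(weights)
blob = _compress(pack_codes(codes))
restored_codes = unpack_codes(_decompress(blob), len(weights))
approx = dequantize_q2(restored_codes, lo, scale)
```

Quantization is the lossy step: each reconstructed weight can differ from the original by up to half a grid step. The compression stage is lossless and only shrinks the packed codes.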

Files

compressed_model.pt: serialized PyTorch model with custom CompressedLinear modules
compression_metadata.json: metadata describing the compression setup

Important

This is not a standard Hugging Face from_pretrained() checkpoint. Loading this artifact requires the custom Python code defining the CompressedLinear module and codec pipeline used during export.
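
A minimal loading sketch follows. It assumes that `compressed_model.pt` holds a whole pickled model saved with `torch.save(model, ...)`, and that the custom code defining `CompressedLinear` is importable under the same module path it had at export time. The helper name `load_compressed_artifact` is hypothetical, and the schema of `compression_metadata.json` is not documented, so it is returned as a plain dictionary.

```python
import json
from pathlib import Path


def load_compressed_artifact(model_dir):
    """Load the pickled model and its metadata from a local directory.

    Assumption: the custom CompressedLinear code is already importable,
    because unpickling resolves classes by their original module path.
    """
    model_dir = Path(model_dir)
    model_path = model_dir / "compressed_model.pt"
    meta_path = model_dir / "compression_metadata.json"

    if not model_path.is_file():
        raise FileNotFoundError(f"missing artifact: {model_path}")

    # Metadata is optional in this sketch; its keys are not specified.
    metadata = {}
    if meta_path.is_file():
        metadata = json.loads(meta_path.read_text(encoding="utf-8"))

    # Imported lazily so the path checks above work without PyTorch.
    import torch

    # weights_only=False is needed to unpickle full module objects.
    # Unpickling can execute arbitrary code, so only load trusted files.
    model = torch.load(model_path, map_location="cpu", weights_only=False)
    model.eval()
    return model, metadata
```

A call such as `model, metadata = load_compressed_artifact("path/to/downloaded/repo")` would then return the model on CPU in evaluation mode. If the class definitions cannot be imported, `torch.load` fails with an unpickling error.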

Base model

Qwen/Qwen2-Audio-7B-Instruct

Notes

This artifact was produced for compression experiments on English-to-Chinese speech translation.