---
license: mit
tags:
- audio
- audio-super-resolution
- upscaling
- audiosr
- safetensors
- maestraea
pipeline_tag: audio-to-audio
---

# AudioSR Models (Safetensors)

**Audio Super-Resolution: Upscale Any Audio to 48kHz**

[Original Source](https://github.com/haoheliu/versatile_audio_super_resolution) by [Haohe Liu](https://github.com/haoheliu) · MIT License

> Converted from `pytorch_model.bin` to safetensors format for faster loading and safer deserialization. For use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea).

## Available Models

| Variant | Files | Size | Description |
|---------|-------|------|-------------|
| **basic** | `basic/audiosr_basic.safetensors` | 6.2 GB | General audio (music, SFX, speech) |
| **speech** | `speech/audiosr_speech-*.safetensors` (3 shards) | 6.2 GB | Optimized for spoken word |

## What AudioSR Does

AudioSR uses latent diffusion to upscale any audio to 48kHz, restoring high-frequency content that was lost to:

- Low sample rate recording (8kHz, 16kHz, 22kHz → 48kHz)
- Lossy compression (MP3, AAC artifacts)
- Bandwidth-limited audio

### Key Parameters

| Parameter | Range | Default | Description |
|-----------|-------|---------|-------------|
| `ddim_steps` | 10–200 | 50 | More steps = higher quality |
| `guidance_scale` | 1–10 | 3.5 | Prompt adherence |
| `model_name` | basic/speech | basic | Which variant to use |

### VRAM Requirements

- **Minimum**: ~4 GB
- **Recommended**: ~6 GB (for longer audio)

## Usage with Mæstræa

These models are automatically downloaded by the Mæstræa AI Workstation backend.

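The ranges and defaults in the Key Parameters table can be checked before a run so that an out-of-range value fails early rather than partway through inference. The following is a minimal sketch of such a check. `SRParams` and the constants are hypothetical names introduced here for illustration; they are not part of the `audiosr` package or of Mæstræa, and the bounds are simply copied from the table.

```python
from dataclasses import dataclass

# Bounds copied from the Key Parameters table (assumed inclusive).
DDIM_STEPS_RANGE = (10, 200)
GUIDANCE_SCALE_RANGE = (1.0, 10.0)
MODEL_NAMES = ("basic", "speech")


@dataclass(frozen=True)
class SRParams:
    """Hypothetical settings container; not part of the audiosr package."""

    model_name: str = "basic"
    ddim_steps: int = 50
    guidance_scale: float = 3.5

    def __post_init__(self):
        # Reject unknown variants.
        if self.model_name not in MODEL_NAMES:
            raise ValueError(f"model_name must be one of {MODEL_NAMES}")
        # Reject step counts outside the documented range.
        lo, hi = DDIM_STEPS_RANGE
        if not lo <= self.ddim_steps <= hi:
            raise ValueError(f"ddim_steps must be in [{lo}, {hi}]")
        # Reject guidance scales outside the documented range.
        lo, hi = GUIDANCE_SCALE_RANGE
        if not lo <= self.guidance_scale <= hi:
            raise ValueError(f"guidance_scale must be in [{lo}, {hi}]")
```

Constructing `SRParams()` with no arguments yields the table defaults, and the validated fields can then be passed on to whatever inference call is used.
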
### Direct Usage

```python
import audiosr

# Build the model for the chosen variant ("basic" or "speech").
model = audiosr.build_model(model_name="basic")

# Upscale input.wav using the default parameters listed above.
waveform = audiosr.super_resolution(
    model, "input.wav", seed=42, guidance_scale=3.5, ddim_steps=50
)
```

## Original Source

| Variant | Original Repo |
|---------|--------------|
| basic | [haoheliu/audiosr_basic](https://huggingface.co/haoheliu/audiosr_basic) |
| speech | [haoheliu/audiosr_speech](https://huggingface.co/haoheliu/audiosr_speech) |

## License

MIT, same as the original AudioSR release.

## Credits

- **Model**: [AudioSR](https://github.com/haoheliu/versatile_audio_super_resolution) by Haohe Liu et al.
- **Paper**: [Versatile Audio Super Resolution](https://arxiv.org/abs/2309.07314)
- **Conversion & Mirror by**: [AEmotionStudio](https://huggingface.co/AEmotionStudio)