SongGeneration-v2-large

ailuntz committed 3 days ago

Commit 2ef117f (verified)

1 Parent(s): 5e51f1e

Update default README.md
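
The change in this commit treats the unsuffixed repository as the default large checkpoint and keeps the suffixed repositories as explicit variants. A minimal Python sketch of that naming scheme follows, using `huggingface_hub.snapshot_download` in place of the `hf download` command shown in the diff. `resolve_repo` and `download` are hypothetical helper names and are not part of `songgeneration_mlx`. The sketch also assumes that the fp32, 8bit, and 4bit repositories live under `mlx-community`; the diff only confirms this for the default and bf16 repositories.

```python
from pathlib import Path

# Names taken from the variant table in the README diff.
ORG = "mlx-community"  # assumption for fp32/8bit/4bit; confirmed only for default and bf16
DEFAULT_REPO = "SongGeneration-v2-large"
EXPLICIT_VARIANTS = ("fp32", "bf16", "8bit", "4bit")


def resolve_repo(variant=None):
    """Return (repo_id, local_dir) for a large-checkpoint variant.

    variant=None selects the default repository, which the README
    describes as holding the same weights as the bf16 variant.
    """
    if variant is None:
        name = DEFAULT_REPO
    elif variant in EXPLICIT_VARIANTS:
        name = f"{DEFAULT_REPO}-{variant}"
    else:
        raise ValueError(f"unknown variant: {variant!r}")
    return f"{ORG}/{name}", Path("models") / name


def download(variant=None):
    """Fetch the checkpoint; needs huggingface_hub installed and network access."""
    from huggingface_hub import snapshot_download  # imported lazily

    repo_id, local_dir = resolve_repo(variant)
    return snapshot_download(repo_id=repo_id, local_dir=str(local_dir))
```

Calling `download()` with no argument would fetch the default repository into `models/SongGeneration-v2-large`, matching the path passed to `--model` in the updated README.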


Files changed (1)

README.md +9 -6

README.md CHANGED

@@ -14,10 +14,12 @@ tags:
 Part of the SongGeneration MLX conversion set. Collection: https://huggingface.co/collections/mlx-community/songgeneration-v2-mlx-6a1bf9342dd0806419737229
-# SongGeneration-v2-large-bf16
 Apple MLX weights for the autoregressive `audiolm` token generator from Tencent SongGeneration v2-large.
 This is not a full-stack pure MLX audio pipeline yet: token generation runs with MLX, while FLAC decoding currently uses the official PyTorch Flow1dVAE / separate-tokenizer bridge in [`ailuntx/SongGeneration-MLX`](https://github.com/ailuntx/SongGeneration-MLX).
 ## TL;DR
@@ -40,10 +42,10 @@ python -m venv .venv
 .venv/bin/pip install -e .
 .venv/bin/pip install -U huggingface_hub hf_transfer
-HF_HUB_ENABLE_HF_TRANSFER=1 .venv/bin/hf download mlx-community/SongGeneration-v2-large-bf16 --local-dir ./models/SongGeneration-v2-large-bf16
 .venv/bin/python -m songgeneration_mlx.cli \
-  --model ./models/SongGeneration-v2-large-bf16 \
   --lyrics '[verse] hello from mlx [chorus] sing it again' \
   --description 'Pop, female vocal, bright production, [Musicality-medium].' \
   --duration 2 \
@@ -70,7 +72,7 @@ HF_HUB_ENABLE_HF_TRANSFER=1 .venv/bin/hf download tencent/SongGeneration \
 PYTORCH_ENABLE_MPS_FALLBACK=1 SONGGEN_DEVICE=mps \
 .venv-decoder/bin/python scripts/decode_tokens_official.py \
-  --mlx-model ./models/SongGeneration-v2-large-bf16 \
   --tokens ./tokens_2s.npz \
   --output ./output_2s.flac \
   --device mps
@@ -85,7 +87,8 @@ PYTORCH_ENABLE_MPS_FALLBACK=1 SONGGEN_DEVICE=mps \
 | `SongGeneration-v2-medium-8bit` | 2.8G | smaller medium checkpoint |
 | `SongGeneration-v2-medium-4bit` | 1.5G | smallest medium checkpoint |
 | `SongGeneration-v2-large-fp32` | 19G | high-precision large baseline |
-| `SongGeneration-v2-large-bf16` | 9.5G | large bf16 quality baseline |
 | `SongGeneration-v2-large-8bit` | 5.0G | smaller large checkpoint |
 | `SongGeneration-v2-large-4bit` | 2.7G | smallest large checkpoint |
 | `SongGeneration-v2-fast-*` | pending | upstream fast weights were not publicly available when checked on 2026-05-31 |
@@ -93,7 +96,7 @@ PYTORCH_ENABLE_MPS_FALLBACK=1 SONGGEN_DEVICE=mps \
 ## Layout
 ```text
-SongGeneration-v2-large-bf16/
 |-- model-00001-of-000xx.safetensors
 |-- model.safetensors.index.json
 |-- config.json

 Part of the SongGeneration MLX conversion set. Collection: https://huggingface.co/collections/mlx-community/songgeneration-v2-mlx-6a1bf9342dd0806419737229
+# SongGeneration-v2-large
 Apple MLX weights for the autoregressive `audiolm` token generator from Tencent SongGeneration v2-large.
+This default repository is the bf16 variant and is equivalent to [`mlx-community/SongGeneration-v2-large-bf16`](https://huggingface.co/mlx-community/SongGeneration-v2-large-bf16). Use this repo when you want the recommended default checkpoint.
 This is not a full-stack pure MLX audio pipeline yet: token generation runs with MLX, while FLAC decoding currently uses the official PyTorch Flow1dVAE / separate-tokenizer bridge in [`ailuntx/SongGeneration-MLX`](https://github.com/ailuntx/SongGeneration-MLX).
 ## TL;DR
 .venv/bin/pip install -e .
 .venv/bin/pip install -U huggingface_hub hf_transfer
+HF_HUB_ENABLE_HF_TRANSFER=1 .venv/bin/hf download mlx-community/SongGeneration-v2-large --local-dir ./models/SongGeneration-v2-large
 .venv/bin/python -m songgeneration_mlx.cli \
+  --model ./models/SongGeneration-v2-large \
   --lyrics '[verse] hello from mlx [chorus] sing it again' \
   --description 'Pop, female vocal, bright production, [Musicality-medium].' \
   --duration 2 \
 PYTORCH_ENABLE_MPS_FALLBACK=1 SONGGEN_DEVICE=mps \
 .venv-decoder/bin/python scripts/decode_tokens_official.py \
+  --mlx-model ./models/SongGeneration-v2-large \
   --tokens ./tokens_2s.npz \
   --output ./output_2s.flac \
   --device mps
 | `SongGeneration-v2-medium-8bit` | 2.8G | smaller medium checkpoint |
 | `SongGeneration-v2-medium-4bit` | 1.5G | smallest medium checkpoint |
 | `SongGeneration-v2-large-fp32` | 19G | high-precision large baseline |
+| `SongGeneration-v2-large` | 9.5G | default large checkpoint, same weights as bf16 |
+| `SongGeneration-v2-large-bf16` | 9.5G | explicit large bf16 quality baseline |
 | `SongGeneration-v2-large-8bit` | 5.0G | smaller large checkpoint |
 | `SongGeneration-v2-large-4bit` | 2.7G | smallest large checkpoint |
 | `SongGeneration-v2-fast-*` | pending | upstream fast weights were not publicly available when checked on 2026-05-31 |
 ## Layout
 ```text
+SongGeneration-v2-large/
 |-- model-00001-of-000xx.safetensors
 |-- model.safetensors.index.json
 |-- config.json