Update model card: fix script paths, add license section
README.md
CHANGED
@@ -61,16 +61,7 @@ This bundle is self-contained — all weights are packaged in one repository.

  ## How to Get Started

-
- the repo (the Step Audio CLI entry point is not yet in the unified public API).
-
- ```bash
- pip install mlx-speech
- git clone https://github.com/appautomaton/mlx-speech.git
- cd mlx-speech
- ```
-
- Download the bundle with `huggingface-cli`:
+ Download the bundle:

  ```bash
  hf download appautomaton/step-audio-editx-8bit-mlx \
@@ -80,9 +71,7 @@ hf download appautomaton/step-audio-editx-8bit-mlx \
  **Voice cloning:**

  ```bash
- python scripts/
- --model-dir models/stepfun/step_audio_editx/mlx-int8 \
- --prefer-mlx-int8 \
+ python scripts/generate/step_audio_editx.py \
  --prompt-audio reference.wav \
  --prompt-text "Transcript of reference audio." \
  -o cloned.wav \
@@ -92,9 +81,7 @@ python scripts/generate_step_audio_editx.py \
  **Audio editing (change emotion):**

  ```bash
- python scripts/
- --model-dir models/stepfun/step_audio_editx/mlx-int8 \
- --prefer-mlx-int8 \
+ python scripts/generate/step_audio_editx.py \
  --prompt-audio input.wav \
  --prompt-text "Transcript of input audio." \
  -o happy.wav \
@@ -136,3 +123,8 @@ On Apple Silicon with int8 weights and bf16 activations, real-time factor
  - Upstream model: [`stepfun-ai/Step-Audio-EditX`](https://huggingface.co/stepfun-ai/Step-Audio-EditX)
  - Technical report: [arXiv:2511.03601](https://arxiv.org/abs/2511.03601)
  - More examples: [AppAutomaton](https://github.com/appautomaton)
+
+ ## License
+
+ Apache 2.0 — following the upstream license published with
+ [`stepfun-ai/Step-Audio-EditX`](https://huggingface.co/stepfun-ai/Step-Audio-EditX).
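Review note: this commit moves the script reference from `scripts/generate_step_audio_editx.py` to `scripts/generate/step_audio_editx.py`, so a stale checkout would fail on both example commands. A minimal sketch of a pre-flight check follows. It assumes the commands are run from the root of an `mlx-speech` checkout, which the diff does not state explicitly, and `has_generate_script` is a hypothetical helper name, not part of `mlx-speech`.

```shell
# Hypothetical helper, not part of mlx-speech: succeed only if the script path
# used by the updated model card exists under the given checkout root.
# Assumption: the README commands are run from the repository root.
has_generate_script() {
  repo_root="${1:-.}"   # default to the current directory
  [ -f "${repo_root}/scripts/generate/step_audio_editx.py" ]
}
```

For example, `has_generate_script . || echo "script not found; is the checkout up to date?" >&2` could be run before either generation command.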