---
license: mit
---
# Spanish-F5 TTS Inference API (Hugging Face)
This project exposes a Hugging Face Inference Endpoint for Spanish-F5, a Spanish-adapted version of the F5-TTS model. Given a reference audio clip and target text, it synthesizes the target text in the reference speaker's voice.
> ✨ Live inference is powered by Hugging Face Inference Endpoints.
---
## 🔗 Credit
This project is based on [jpgallegoar/Spanish-F5](https://github.com/jpgallegoar/Spanish-F5).
- Spanish-F5 by [@jpgallegoar](https://github.com/jpgallegoar)
- [Model weights on Hugging Face](https://huggingface.co/jpgallegoar/F5-Spanish/)
Adapted by [@eloicito333](https://github.com/eloicito333).
Licensed under the MIT License.
---
## ⚙️ How It Works
### 🔽 Request Parameters
Send a POST request with a JSON body to the Hugging Face Inference Endpoint:
```jsonc
{
  "ref_audio": "<base64-encoded WAV>",      // string, required
  "ref_text": "Hola, ¿cómo estás?",         // string, optional (transcript of ref_audio)
  "gen_text": "Estoy muy bien, gracias.",   // string, required (text to synthesize)
  "remove_silence": true,                   // boolean, optional (default: true)
  "speed": 1.0,                             // number, optional (default: 1.0)
  "cross_fade_duration": 0.15               // number, optional (default: 0.15)
}
```
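As a sketch, the request payload can be built in Python like this. The byte string below is a placeholder for real WAV data, which you would normally read from disk (e.g. `open("ref.wav", "rb").read()`):

```python
import base64

# Placeholder for the raw bytes of a reference WAV file.
wav_bytes = b"RIFF....WAVEfmt "

# base64-encode the audio for the ref_audio field.
ref_audio_b64 = base64.b64encode(wav_bytes).decode("utf-8")

# This dict, serialized as JSON, is the request body.
payload = {
    "ref_audio": ref_audio_b64,               # required
    "ref_text": "Hola, ¿cómo estás?",         # optional transcript of ref_audio
    "gen_text": "Estoy muy bien, gracias.",   # required text to synthesize
    "remove_silence": True,                   # optional (default: true)
    "speed": 1.0,                             # optional (default: 1.0)
    "cross_fade_duration": 0.15,              # optional (default: 0.15)
}
```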
### 🔼 Response Object
The response will be a JSON object:
```jsonc
{
  "success": true,                              // boolean: true if synthesis succeeded
  "audio_base64": "<base64-encoded WAV output>" // string: base64 WAV audio (if success)
}
```
If an error occurs:
```jsonc
{
  "success": false,
  "error": "TypeError: some descriptive message" // string: error description
}
```
Decode the `audio_base64` field to obtain and save the resulting WAV audio.
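As a minimal sketch, decoding a response in Python could look like this (the `result` dict below is a stand-in for a parsed endpoint response, not real output):

```python
import base64

# Stand-in for the parsed JSON response; a real response carries full WAV bytes.
result = {
    "success": True,
    "audio_base64": base64.b64encode(b"fake-wav-bytes").decode("utf-8"),
}

if result.get("success") and result.get("audio_base64"):
    # Reverse the base64 encoding to recover the raw WAV bytes.
    audio_bytes = base64.b64decode(result["audio_base64"])
    with open("output.wav", "wb") as out:
        out.write(audio_bytes)
else:
    print("Error:", result.get("error", "unknown error"))
```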
---
## 🤖 Node.js Client Example (Using Fetch)
```js
import fs from "node:fs";

async function sendAudio() {
  // Read the reference audio and base64-encode it for the JSON body
  const audioBuffer = fs.readFileSync("./example.wav");
  const audioBase64 = audioBuffer.toString("base64");

  const response = await fetch("https://your-hf-endpoint-url", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      ref_audio: audioBase64,
      ref_text: "Hola, ¿cómo estás?",
      gen_text: "Estoy muy bien, gracias.",
      remove_silence: true,
      speed: 1.0,
      cross_fade_duration: 0.15,
    }),
  });

  const result = await response.json();
  if (result.success && result.audio_base64) {
    // Decode the base64 payload back into WAV bytes and save it
    fs.writeFileSync("output.wav", Buffer.from(result.audio_base64, "base64"));
    console.log("Audio saved to output.wav");
  } else {
    console.error("Error:", result);
  }
}

sendAudio();
```
---
## 🔬 Python Client Example (Optional)
```python
import base64

import requests

# Read the reference audio and base64-encode it for the JSON body
with open("ref.wav", "rb") as f:
    audio_base64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post("https://your-hf-endpoint-url", json={
    "ref_audio": audio_base64,
    "ref_text": "Hola, ¿cómo estás?",
    "gen_text": "Estoy muy bien, gracias.",
    "remove_silence": True,
    "speed": 1.0,
    "cross_fade_duration": 0.15,
})

result = response.json()
if response.ok and result.get("audio_base64"):
    # Decode the base64 payload back into WAV bytes and save it
    with open("output.wav", "wb") as out:
        out.write(base64.b64decode(result["audio_base64"]))
    print("Audio saved to output.wav")
else:
    print("Error:", result)
```
---
## 🎓 License
MIT License. See [LICENSE](./LICENSE) for more information.
---
## ✏️ Author
Adapted by [@eloicito333](https://github.com/eloicito333).