---
license: mit
---

# Spanish-F5 TTS Inference API (Hugging Face)

This project exposes a Hugging Face Inference Endpoint for Spanish-F5, a Spanish-adapted version of the F5-TTS model. It takes reference audio and a target sentence, and synthesizes speech in the same voice.

> ✨ Live inference is powered by Hugging Face Inference Endpoints.

---

## 🔗 Credit

This project is based on [jpgallegoar/Spanish-F5](https://github.com/jpgallegoar/Spanish-F5).

- Spanish-F5 by [@jpgallegoar](https://github.com/jpgallegoar)
- [Model weights on Hugging Face](https://huggingface.co/jpgallegoar/F5-Spanish/)

Adapted by [@eloicito333](https://github.com/eloicito333). Licensed under the MIT License.

---

## ⚙️ How It Works

### 🔽 Request Parameters

Send a POST request with a JSON body to the Hugging Face Inference Endpoint:

```json
{
  "ref_audio": "<base64-encoded reference audio>", // string, required
  "ref_text": "Hola, ¿cómo estás?",                // string, optional (transcript of ref_audio)
  "gen_text": "Estoy muy bien, gracias.",          // string, required (text to synthesize)
  "remove_silence": true,                          // boolean, optional (default: true)
  "speed": 1.0,                                    // number, optional (default: 1.0)
  "cross_fade_duration": 0.15                      // number, optional (default: 0.15)
}
```

### 🔼 Response Object

The response is a JSON object:

```json
{
  "success": true,       // boolean: true if synthesis succeeded
  "audio_base64": "..."  // string: base64-encoded WAV audio (if success)
}
```

If an error occurs:

```json
{
  "success": false,
  "error": "TypeError: some descriptive message" // string: error description
}
```

Use the `audio_base64` field to decode and save the resulting audio.
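Decoding the response can be sketched in a few lines of Python. The `result` dict below is a stand-in for an actual endpoint response (the WAV bytes here are fake placeholder data, not real audio):

```python
import base64

# Hypothetical response payload in the shape described above;
# a real call returns actual synthesized WAV bytes.
result = {
    "success": True,
    "audio_base64": base64.b64encode(b"RIFF....WAVE").decode("utf-8"),
}

if result.get("success"):
    # Decode the base64 string back into raw WAV bytes and save them.
    wav_bytes = base64.b64decode(result["audio_base64"])
    with open("output.wav", "wb") as f:
        f.write(wav_bytes)
    print(f"Wrote {len(wav_bytes)} bytes to output.wav")
else:
    print("Synthesis failed:", result.get("error"))
```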
---

## 🤖 Node.js Client Example (Using Fetch)

```js
import fs from "fs";

async function sendAudio() {
  const audioBuffer = fs.readFileSync("./example.wav");
  const audioBase64 = audioBuffer.toString("base64");

  const response = await fetch("https://your-hf-endpoint-url", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      ref_audio: audioBase64,
      ref_text: "Hola, ¿cómo estás?",
      gen_text: "Estoy muy bien, gracias.",
      remove_silence: true,
      speed: 1.0,
      cross_fade_duration: 0.15,
    }),
  });

  const result = await response.json();

  if (result.audio_base64) {
    fs.writeFileSync("output.wav", Buffer.from(result.audio_base64, "base64"));
    console.log("Audio saved to output.wav");
  } else {
    console.error("Error:", result);
  }
}

sendAudio();
```

---

## 🔬 Python Client Example (Optional)

```python
import base64

import requests

with open("ref.wav", "rb") as f:
    audio_base64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post("https://your-hf-endpoint-url", json={
    "ref_audio": audio_base64,
    "ref_text": "Hola, ¿cómo estás?",
    "gen_text": "Estoy muy bien, gracias.",
    "remove_silence": True,
    "speed": 1.0,
    "cross_fade_duration": 0.15,
})

result = response.json()

if response.ok and result.get("audio_base64"):
    with open("output.wav", "wb") as out:
        out.write(base64.b64decode(result["audio_base64"]))
    print("Audio saved to output.wav")
else:
    print("Error:", result)
```

---

## 🎓 License

MIT License. See [LICENSE](./LICENSE) for more information.

---

## ✏️ Author

Adapted by [@eloicito333](https://github.com/eloicito333).