---

license: mit
---

# Spanish-F5 TTS Inference API (Hugging Face)

This project exposes a Hugging Face Inference Endpoint for Spanish-F5, a Spanish-adapted version of the F5-TTS model. It takes reference audio and a target sentence, and synthesizes speech in the same voice.

> ✨ Live inference is powered by Hugging Face Inference Endpoints.

---

## 🔗 Credit

This project is based on [jpgallegoar/Spanish-F5](https://github.com/jpgallegoar/Spanish-F5).

- Spanish-F5 by [@jpgallegoar](https://github.com/jpgallegoar)
- [Model weights on Hugging Face](https://huggingface.co/jpgallegoar/F5-Spanish/)

Adapted by [@eloicito333](https://github.com/eloicito333).

Licensed under the MIT License.

---

## ⚙️ How It Works

### 🔽 Request Parameters

Send a POST request with a JSON body to the Hugging Face Inference Endpoint:

```json
{
  "ref_audio": "<base64-encoded WAV>",          // string, required
  "ref_text": "Hola, ¿cómo estás?",             // string, optional (transcript of ref_audio)
  "gen_text": "Estoy muy bien, gracias.",       // string, required (text to synthesize)
  "remove_silence": true,                       // boolean, optional (default: true)
  "speed": 1.0,                                 // number, optional (default: 1.0)
  "cross_fade_duration": 0.15                   // number, optional (default: 0.15)
}
```
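As a rough sketch of how a client might assemble this body, the helper below fills in the documented defaults and checks the two required fields. The name `build_payload` and the `DEFAULTS` table are hypothetical and are not part of the endpoint. The default values are copied from the parameter comments above.

```python
import base64

# Hypothetical client-side helper (not part of the endpoint).
# Defaults mirror the parameter comments documented above.
DEFAULTS = {"remove_silence": True, "speed": 1.0, "cross_fade_duration": 0.15}


def build_payload(ref_audio_bytes, gen_text, ref_text=None, **options):
    """Build the JSON body for a synthesis request."""
    if not ref_audio_bytes:
        raise ValueError("ref_audio is required")
    if not gen_text:
        raise ValueError("gen_text is required")
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    payload = {
        "ref_audio": base64.b64encode(ref_audio_bytes).decode("ascii"),
        "gen_text": gen_text,
        **DEFAULTS,
        **options,  # caller-supplied values override the defaults
    }
    if ref_text is not None:  # optional transcript of the reference audio
        payload["ref_text"] = ref_text
    return payload
```

Rejecting unknown option names on the client is a design choice made here to catch typos early. Whether the endpoint itself ignores or rejects extra keys is not stated in this README.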

### 🔼 Response Object

The response will be a JSON object:

```json
{
  "success": true,                                // boolean: true if synthesis succeeded
  "audio_base64": "<base64-encoded WAV output>"   // string: base64 WAV audio (if success)
}
```

If an error occurs:

```json
{
  "success": false,
  "error": "TypeError: some descriptive message"  // string: error description
}
```

Use the `audio_base64` field to decode and save the resulting audio.
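A minimal sketch of that decoding step, assuming the response has already been parsed into a dictionary with the fields documented above. The function name `extract_audio` is hypothetical.

```python
import base64


def extract_audio(result):
    """Return decoded WAV bytes from a parsed response, or raise on failure."""
    # Treat a missing or false "success" flag, or missing audio, as an error.
    if not result.get("success") or "audio_base64" not in result:
        raise RuntimeError(result.get("error", "unknown error"))
    return base64.b64decode(result["audio_base64"])
```

The returned bytes can be written directly to a `.wav` file opened in binary mode.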

---

## 🤖 Node.js Client Example (Using Fetch)

```js
import fs from "fs";

// Uses the global fetch available in Node.js 18+.
async function sendAudio() {
  // Read the reference audio and encode it as base64
  const audioBuffer = fs.readFileSync("./example.wav");
  const audioBase64 = audioBuffer.toString("base64");

  const response = await fetch("https://your-hf-endpoint-url", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      ref_audio: audioBase64,
      ref_text: "Hola, ¿cómo estás?",
      gen_text: "Estoy muy bien, gracias.",
      remove_silence: true,
      speed: 1.0,
      cross_fade_duration: 0.15,
    }),
  });

  const result = await response.json();

  if (result.audio_base64) {
    // Decode the base64 payload and write it out as a WAV file
    fs.writeFileSync("output.wav", Buffer.from(result.audio_base64, "base64"));
    console.log("Audio saved to output.wav");
  } else {
    console.error("Error:", result);
  }
}

sendAudio();
```

---

## 🔬 Python Client Example (Optional)

```python
import base64

import requests

# Read the reference audio and encode it as base64
with open("ref.wav", "rb") as f:
    audio_base64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post("https://your-hf-endpoint-url", json={
    "ref_audio": audio_base64,
    "ref_text": "Hola, ¿cómo estás?",
    "gen_text": "Estoy muy bien, gracias.",
    "remove_silence": True,
    "speed": 1.0,
    "cross_fade_duration": 0.15
})

# Parse the body once; fall back to an empty dict if it is not valid JSON
try:
    result = response.json()
except ValueError:
    result = {}

if response.ok and result.get("audio_base64"):
    with open("output.wav", "wb") as out:
        out.write(base64.b64decode(result["audio_base64"]))
    print("Audio saved to output.wav")
else:
    print("Error:", result or response.text)
```

---

## 🎓 License

MIT License. See [LICENSE](./LICENSE) for more information.

---

## ✏️ Author

Adapted by [@eloicito333](https://github.com/eloicito333).