eloicito333 committed on
Commit 3e56a3f · 1 Parent(s): df516aa

add LICENSE file and update README with project details and usage examples

Files changed (2)
  1. LICENSE +21 -0
  2. README.md +136 -0
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 Eloi Buil Cuadrat
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md CHANGED
@@ -1,3 +1,139 @@
  ---
  license: mit
  ---
+ # Spanish-F5 TTS Inference API (Hugging Face)
+
+ This project exposes a Hugging Face Inference Endpoint for Spanish-F5, a Spanish-adapted version of the F5-TTS model. It takes a reference audio clip and a target sentence, and synthesizes the sentence in the reference voice.
+
+ > ✨ Live inference is powered by Hugging Face Inference Endpoints.
+
+ ---
+
+ ## 🔗 Credit
+
+ This project is based on [jpgallegoar/Spanish-F5](https://github.com/jpgallegoar/Spanish-F5).
+
+ - Spanish-F5 by [@jpgallegoar](https://github.com/jpgallegoar)
+ - [Model weights on Hugging Face](https://huggingface.co/jpgallegoar/F5-Spanish/)
+
+ Adapted by [@eloicito333](https://github.com/eloicito333).
+
+ Licensed under the MIT License.
+
+ ---
+
+ ## ⚙️ How It Works
+
+ ### 🔽 Request Parameters
+
+ Send a POST request with a JSON body to the Hugging Face Inference Endpoint:
+
+ ```json
+ {
+   "ref_audio": "<base64-encoded WAV>", // string, required
+   "ref_text": "Hola, ¿cómo estás?", // string, optional (transcript of ref_audio)
+   "gen_text": "Estoy muy bien, gracias.", // string, required (text to synthesize)
+   "remove_silence": true, // boolean, optional (default: true)
+   "speed": 1.0, // number, optional (default: 1.0)
+   "cross_fade_duration": 0.15 // number, optional (default: 0.15)
+ }
+ ```
+
+ ### 🔼 Response Object
+
+ On success, the response is a JSON object:
+
+ ```json
+ {
+   "success": true, // boolean: true if synthesis succeeded
+   "audio_base64": "<base64-encoded WAV output>" // string: base64 WAV audio (if success)
+ }
+ ```
+
+ If an error occurs:
+
+ ```json
+ {
+   "success": false,
+   "error": "TypeError: some descriptive message" // string: error description
+ }
+ ```
+
+ Decode the `audio_base64` field from base64 to recover the WAV audio, then save it to a file.
+
+ ---
+
+ ## 🤖 Node.js Client Example (Using Fetch)
+
+ ```js
+ import fs from "fs";
+
+ async function sendAudio() {
+   // Read the reference audio and base64-encode it for the JSON body
+   const audioBuffer = fs.readFileSync("./example.wav");
+   const audioBase64 = audioBuffer.toString("base64");
+
+   const response = await fetch("https://your-hf-endpoint-url", {
+     method: "POST",
+     headers: { "Content-Type": "application/json" },
+     body: JSON.stringify({
+       ref_audio: audioBase64,
+       ref_text: "Hola, ¿cómo estás?",
+       gen_text: "Estoy muy bien, gracias.",
+       remove_silence: true,
+       speed: 1.0,
+       cross_fade_duration: 0.15,
+     })
+   });
+
+   const result = await response.json();
+
+   if (result.audio_base64) {
+     // Decode the returned audio and write it to disk
+     fs.writeFileSync("output.wav", Buffer.from(result.audio_base64, "base64"));
+     console.log("Audio saved to output.wav");
+   } else {
+     console.error("Error:", result);
+   }
+ }
+
+ sendAudio();
+ ```
+
+ ---
+
+ ## 🔬 Python Client Example (Optional)
+
+ ```python
+ import base64
+
+ import requests
+
+ # Read the reference audio and base64-encode it for the JSON body
+ with open("ref.wav", "rb") as f:
+     audio_base64 = base64.b64encode(f.read()).decode("utf-8")
+
+ response = requests.post("https://your-hf-endpoint-url", json={
+     "ref_audio": audio_base64,
+     "ref_text": "Hola, ¿cómo estás?",
+     "gen_text": "Estoy muy bien, gracias.",
+     "remove_silence": True,
+     "speed": 1.0,
+     "cross_fade_duration": 0.15
+ })
+
+ # Parse the JSON body once and reuse it in both branches
+ result = response.json()
+
+ if response.ok and result.get("audio_base64"):
+     with open("output.wav", "wb") as out:
+         out.write(base64.b64decode(result["audio_base64"]))
+     print("Audio saved to output.wav")
+ else:
+     print("Error:", result)
+ ```
+
+ ---
+
+ ## 🎓 License
+
+ MIT License. See [LICENSE](./LICENSE) for more information.
+
+ ---
+
+ ## ✏️ Author
+
+ Adapted by [@eloicito333](https://github.com/eloicito333).
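
The README's "Request Parameters" and "Response Object" sections describe a simple JSON contract, and the two halves of it can be sketched as small Python helpers. This is only an illustration under stated assumptions: `build_payload` and `save_audio` are hypothetical names that do not appear in the repository, and the defaults are copied from the README's parameter comments rather than from the endpoint's handler code.

```python
import base64
from pathlib import Path


def build_payload(ref_audio_bytes, gen_text, ref_text=None,
                  remove_silence=True, speed=1.0, cross_fade_duration=0.15):
    """Hypothetical helper: assemble the request body the README describes."""
    # ref_audio and gen_text are the two fields the README marks as required.
    if not ref_audio_bytes:
        raise ValueError("ref_audio is required")
    if not gen_text:
        raise ValueError("gen_text is required")
    payload = {
        "ref_audio": base64.b64encode(ref_audio_bytes).decode("ascii"),
        "gen_text": gen_text,
        "remove_silence": remove_silence,
        "speed": speed,
        "cross_fade_duration": cross_fade_duration,
    }
    # ref_text is optional, so it is only sent when the caller supplies it.
    if ref_text is not None:
        payload["ref_text"] = ref_text
    return payload


def save_audio(result, out_path="output.wav"):
    """Hypothetical helper: write the WAV from an already-parsed response dict."""
    # A failed synthesis carries "success": false and an "error" string.
    if not result.get("success") or "audio_base64" not in result:
        raise RuntimeError(result.get("error", "unknown error"))
    audio_bytes = base64.b64decode(result["audio_base64"])
    Path(out_path).write_bytes(audio_bytes)
    return len(audio_bytes)  # number of bytes written
```

With helpers like these, a client would pass `build_payload(...)` as the `json=` argument of `requests.post` and hand the parsed response to `save_audio`, which raises instead of silently writing nothing when the endpoint reports an error.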