README: document English-only base, why fine-tune is required
README.md
CHANGED
@@ -16,12 +16,15 @@ short_description: Peitho — German voice assistant, 1.5B params (private)
A German speech-to-speech voice assistant, named after Peitho — Greek goddess of
 persuasion and eloquence.

-
-
-
-so this variant ships base + persona-only system prompt to validate.

-

 ## Hardware / cost notes

@@ -36,20 +39,17 @@ few dollars per month at most.

 ## How it works

-`app.py` loads stock `LFM2.5-Audio-1.5B` via `liquid_audio.demo.model`
-
-

 Other knobs:

-- German "Du bist Peitho
 - Sampled decoding (`temp=0.3, topk=10`) for natural answers
 - Cloudflare TURN servers via the Space's HF token (set as `HF_TOKEN` secret)
 - VAD-based push-to-talk via fastrtc

-
To revert to v6 (or any other fine-tune), `git revert` this commit — the
-
previous version of `app.py` overlaid v6 weights from `jempf/peitho-1.5b-v6`.
-
 ## Setup checklist

 After pushing these files to the Space:
@@ -61,5 +61,5 @@ After pushing these files to the Space:
 3. **Settings → Sleep time**: 300 (= 5 minutes).
 4. **Settings → Visibility**: keep private.

-The Space will then build,
-private URL.
@@ -16,12 +16,15 @@
A German speech-to-speech voice assistant, named after Peitho — Greek goddess of
 persuasion and eloquence.

+Private demo of [`jempf/peitho-1.5b-v6`](https://huggingface.co/jempf/peitho-1.5b-v6),
+a 1.5B-parameter audio LM fine-tuned for German chat behavior. Runs as a
+WebRTC push-to-talk chat.

+**Why the fine-tune is non-optional:** the stock base model is English-only
+for speech-to-speech (see model card). An A/B test with the unmodified base
+produced garbled mixed-language text and robotic German audio. The fine-tune
+is what teaches the model German audio output, so v7 plans should improve
+*on top of* v6, not replace it with the base model.

 ## Hardware / cost notes

@@ -36,20 +39,17 @@

 ## How it works

+`app.py` loads stock `LFM2.5-Audio-1.5B` via `liquid_audio.demo.model`, then
+overlays v6 weights from `jempf/peitho-1.5b-v6` using
+`accelerate.load_checkpoint_in_model`.
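
As a rough illustration of the overlay step described in the lines above, the sketch below downloads the fine-tune repo and loads its weights into an already-constructed base model. It is not taken from `app.py`: the helper names `pick_checkpoint` and `overlay_finetune`, the use of `huggingface_hub.snapshot_download`, and the assumed checkpoint layout (a single `model.safetensors`, otherwise a sharded folder) are all assumptions. Building the base model through `liquid_audio.demo.model` is left out because its API is not shown in this README.

```python
from pathlib import Path
from typing import Optional

FINETUNE_REPO = "jempf/peitho-1.5b-v6"  # fine-tune repo named in the README


def pick_checkpoint(local_dir: Path) -> Path:
    """Choose the path to hand to accelerate (hypothetical helper).

    Assumption: a single-file checkpoint is named model.safetensors;
    otherwise the folder itself is returned so a sharded index can be found.
    """
    single = local_dir / "model.safetensors"
    return single if single.is_file() else local_dir


def overlay_finetune(model, repo_id: str = FINETUNE_REPO,
                     token: Optional[str] = None) -> Path:
    """Load fine-tuned weights into an already-built base model, in place."""
    # Imported lazily so this module can be imported without the heavy packages.
    from accelerate import load_checkpoint_in_model
    from huggingface_hub import snapshot_download

    # If the model repo is private, a token with read access is required.
    local_dir = Path(snapshot_download(repo_id=repo_id, token=token))
    checkpoint = pick_checkpoint(local_dir)

    # Copies the checkpoint tensors into the existing model object.
    load_checkpoint_in_model(model, str(checkpoint))
    return checkpoint
```

In a Space, `token` would typically come from the `HF_TOKEN` secret, for example `overlay_finetune(model, token=os.environ.get("HF_TOKEN"))`.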

 Other knobs:

+- German "Du bist Peitho. Antworte in einem Satz." system prompt
 - Sampled decoding (`temp=0.3, topk=10`) for natural answers
 - Cloudflare TURN servers via the Space's HF token (set as `HF_TOKEN` secret)
 - VAD-based push-to-talk via fastrtc
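
To make the sampled-decoding knob in the list above concrete, here is a minimal pure-Python sketch of temperature plus top-k sampling over one step's logits. It is a generic illustration, not the `liquid_audio` implementation; the function name `sample_top_k` and its argument names are hypothetical, and the defaults simply mirror the `temp=0.3, topk=10` values quoted in the README.

```python
import math
import random
from typing import Optional, Sequence


def sample_top_k(logits: Sequence[float], temperature: float = 0.3,
                 top_k: int = 10, rng: Optional[random.Random] = None) -> int:
    """Sample one index from the top_k largest logits after temperature scaling."""
    if not logits:
        raise ValueError("logits must not be empty")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    rng = rng or random.Random()

    # Keep only the indices of the top_k largest logits.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    kept = ranked[:top_k]

    # A temperature below 1 sharpens the distribution toward the best candidates.
    scaled = [logits[i] / temperature for i in kept]
    peak = max(scaled)  # subtracted for numerical stability
    weights = [math.exp(s - peak) for s in scaled]

    # Draw one index in proportion to the softmax weights.
    return rng.choices(kept, weights=weights, k=1)[0]
```

With `top_k=1` this reduces to greedy decoding; a low temperature such as 0.3 keeps most of the probability mass on the best few candidates while still allowing some variation between answers.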

 ## Setup checklist

 After pushing these files to the Space:
@@ -61,5 +61,5 @@
 3. **Settings → Sleep time**: 300 (= 5 minutes).
 4. **Settings → Visibility**: keep private.

+The Space will then build, download v6 weights from the model repo, and start
+serving on its private URL.