praxis-briefing


jempf committed on May 18

Commit 8360af0 (verified) · 1 parent: dd40361

README: document English-only base, why fine-tune is required

Files changed (1)

README.md +14 -14


@@ -16,12 +16,15 @@ short_description: Peitho — German voice assistant, 1.5B params (private)
 A German speech-to-speech voice assistant, named after Peitho — Greek goddess of
 persuasion and eloquence.
-**Current variant: stock base model (no v6 overlay).** A/B test to find out
-whether our v6 fine-tune is actually helping. Earlier rounds suggested the
-base model often gives better German answers than v6 to identical prompts,
-so this variant ships base + persona-only system prompt to validate.
-Runs as a WebRTC push-to-talk chat.
 ## Hardware / cost notes
@@ -36,20 +39,17 @@ few dollars per month at most.
 ## How it works
-`app.py` loads stock `LFM2.5-Audio-1.5B` via `liquid_audio.demo.model` and
-hands it to `chat.py`. No checkpoint overlay, no extra model instance — the
-simplest possible setup.
 Other knobs:
-- German "Du bist Peitho" persona system prompt
 - Sampled decoding (`temp=0.3, topk=10`) for natural answers
 - Cloudflare TURN servers via the Space's HF token (set as `HF_TOKEN` secret)
 - VAD-based push-to-talk via fastrtc
-To revert to v6 (or any other fine-tune), `git revert` this commit — the
-previous version of `app.py` overlaid v6 weights from `jempf/peitho-1.5b-v6`.
 ## Setup checklist
 After pushing these files to the Space:
@@ -61,5 +61,5 @@ After pushing these files to the Space:
 3. **Settings → Sleep time**: 300 (= 5 minutes).
 4. **Settings → Visibility**: keep private.
-The Space will then build, load the base model, and start serving on its
-private URL.

 A German speech-to-speech voice assistant, named after Peitho — Greek goddess of
 persuasion and eloquence.
+Private demo of [`jempf/peitho-1.5b-v6`](https://huggingface.co/jempf/peitho-1.5b-v6),
+a 1.5B-parameter audio LM fine-tuned for German chat behavior. Runs as a
+WebRTC push-to-talk chat.
+**Why the fine-tune is non-optional:** the stock base model is English-only
+for speech-to-speech (see model card). An A/B test with the unmodified base
+produced garbled mixed-language text and robotic German audio. The fine-tune
+is what teaches the model German audio output, so v7 plans should improve
+*on top of* v6, not replace it with the base model.
 ## Hardware / cost notes
 ## How it works
+`app.py` loads stock `LFM2.5-Audio-1.5B` via `liquid_audio.demo.model`, then
+overlays v6 weights from `jempf/peitho-1.5b-v6` using
+`accelerate.load_checkpoint_in_model`.
 Other knobs:
+- German "Du bist Peitho. Antworte in einem Satz." system prompt
 - Sampled decoding (`temp=0.3, topk=10`) for natural answers
 - Cloudflare TURN servers via the Space's HF token (set as `HF_TOKEN` secret)
 - VAD-based push-to-talk via fastrtc
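
The "Sampled decoding (`temp=0.3, topk=10`)" knob above can be illustrated with a generic temperature plus top-k sampler. This is a minimal sketch of the technique in plain Python, not the code `liquid_audio` actually runs; the function name `sample_top_k` and its parameters are hypothetical, chosen only to mirror the README's `temp` and `topk` values.

```python
import math
import random
from typing import Optional, Sequence


def sample_top_k(
    logits: Sequence[float],
    temperature: float = 0.3,
    top_k: int = 10,
    rng: Optional[random.Random] = None,
) -> int:
    """Return one token index drawn from the top_k highest logits.

    Hypothetical helper for illustration; defaults mirror the README
    values temp=0.3 and topk=10.
    """
    if not logits:
        raise ValueError("logits must not be empty")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()

    # Keep only the indices of the k largest logits.
    k = max(1, min(top_k, len(logits)))
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    # Dividing by a temperature below 1 sharpens the distribution.
    scaled = [logits[i] / temperature for i in top]

    # Softmax weights, shifted by the maximum for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    return rng.choices(top, weights=weights, k=1)[0]
```

With a low temperature such as 0.3 most of the probability mass stays on the best few candidates, while the top-k cutoff means tokens outside the ten highest-scoring ones are never drawn. Setting `top_k=1` reduces the sampler to greedy decoding.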
 ## Setup checklist
 After pushing these files to the Space:
 3. **Settings → Sleep time**: 300 (= 5 minutes).
 4. **Settings → Visibility**: keep private.
+The Space will then build, download v6 weights from the model repo, and start
+serving on its private URL.
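
The overlay step described in the new "How it works" text (load the stock base model, then apply v6 weights with `accelerate.load_checkpoint_in_model`) might look roughly like the sketch below. It is not the author's `app.py`. The helper name `overlay_finetune` and its `download` and `load` parameters are hypothetical, and the base model is passed in as an argument because the diff does not show how `liquid_audio.demo.model` exposes it. The sketch also assumes the model repo stores its checkpoint in a layout that `load_checkpoint_in_model` accepts when given a folder (a single `model.safetensors` or `pytorch_model.bin`, or a sharded index file).

```python
from typing import Any, Callable, Optional


def overlay_finetune(
    model: Any,
    repo_id: str,
    download: Optional[Callable[[str], str]] = None,
    load: Optional[Callable[[Any, str], Any]] = None,
) -> Any:
    """Overlay fine-tuned weights from a Hub repo onto a loaded model.

    Hypothetical helper. `download` and `load` default to the real
    library functions and can be swapped for fakes in tests.
    """
    if download is None:
        # Imported lazily so the module imports without the Hub client.
        from huggingface_hub import snapshot_download

        download = snapshot_download
    if load is None:
        from accelerate import load_checkpoint_in_model

        load = load_checkpoint_in_model

    # Fetch (or reuse from the local cache) the fine-tune's files.
    # Assumption: for a private repo, huggingface_hub picks up the
    # HF_TOKEN environment variable on its own.
    checkpoint_dir = download(repo_id)

    # Copy the checkpoint's tensors into the already-built base model.
    load(model, checkpoint_dir)
    return model
```

Under these assumptions, `app.py` would call something like `overlay_finetune(base_model, "jempf/peitho-1.5b-v6")` once at startup, where `base_model` stands for whatever object the stock loader returns.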