jempf committed on
Commit 8360af0 · verified · 1 Parent(s): dd40361

README: document English-only base, why fine-tune is required

Files changed (1)
  1. README.md +14 -14
README.md CHANGED
@@ -16,12 +16,15 @@ short_description: Peitho — German voice assistant, 1.5B params (private)
  A German speech-to-speech voice assistant, named after Peitho — Greek goddess of
  persuasion and eloquence.

- **Current variant: stock base model (no v6 overlay).** A/B test to find out
- whether our v6 fine-tune is actually helping. Earlier rounds suggested the
- base model often gives better German answers than v6 to identical prompts,
- so this variant ships base + persona-only system prompt to validate.

- Runs as a WebRTC push-to-talk chat.

  ## Hardware / cost notes

@@ -36,20 +39,17 @@ few dollars per month at most.

  ## How it works

- `app.py` loads stock `LFM2.5-Audio-1.5B` via `liquid_audio.demo.model` and
- hands it to `chat.py`. No checkpoint overlay, no extra model instance — the
- simplest possible setup.

  Other knobs:

- - German "Du bist Peitho" persona system prompt
  - Sampled decoding (`temp=0.3, topk=10`) for natural answers
  - Cloudflare TURN servers via the Space's HF token (set as `HF_TOKEN` secret)
  - VAD-based push-to-talk via fastrtc

- To revert to v6 (or any other fine-tune), `git revert` this commit — the
- previous version of `app.py` overlaid v6 weights from `jempf/peitho-1.5b-v6`.
-
  ## Setup checklist

  After pushing these files to the Space:
@@ -61,5 +61,5 @@ After pushing these files to the Space:
  3. **Settings → Sleep time**: 300 (= 5 minutes).
  4. **Settings → Visibility**: keep private.

- The Space will then build, load the base model, and start serving on its
- private URL.
 
  A German speech-to-speech voice assistant, named after Peitho — Greek goddess of
  persuasion and eloquence.

+ Private demo of [`jempf/peitho-1.5b-v6`](https://huggingface.co/jempf/peitho-1.5b-v6),
+ a 1.5B-parameter audio LM fine-tuned for German chat behavior. Runs as a
+ WebRTC push-to-talk chat.

+ **Why the fine-tune is non-optional:** the stock base model is English-only
+ for speech-to-speech (see model card). An A/B test with the unmodified base
+ produced garbled mixed-language text and robotic German audio. The fine-tune
+ is what teaches the model German audio output, so v7 plans should improve
+ *on top of* v6, not replace it with the base model.

  ## Hardware / cost notes

 

  ## How it works

+ `app.py` loads stock `LFM2.5-Audio-1.5B` via `liquid_audio.demo.model`, then
+ overlays v6 weights from `jempf/peitho-1.5b-v6` using
+ `accelerate.load_checkpoint_in_model`.

  Other knobs:

+ - German "Du bist Peitho. Antworte in einem Satz." system prompt
  - Sampled decoding (`temp=0.3, topk=10`) for natural answers
  - Cloudflare TURN servers via the Space's HF token (set as `HF_TOKEN` secret)
  - VAD-based push-to-talk via fastrtc

  ## Setup checklist

  After pushing these files to the Space:
 
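Note on the "Sampled decoding (`temp=0.3, topk=10`)" knob listed under "Other knobs": the README does not show how the decoder applies these values, so the following is only a minimal pure-Python sketch of generic temperature plus top-k sampling. The function name `sample_top_k` and its parameters are hypothetical and are not taken from `chat.py` or `liquid_audio`.

```python
import math
import random


def sample_top_k(logits, temperature=0.3, top_k=10, rng=None):
    """Pick one index from `logits` using temperature scaling and top-k filtering.

    Illustrative only: a generic sampler, not the project's actual decoder.
    """
    if not logits:
        raise ValueError("logits must not be empty")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()

    # Keep only the indices of the top_k largest logits.
    k = min(top_k, len(logits))
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    # Divide by the temperature; values below 1 sharpen the distribution.
    scaled = [logits[i] / temperature for i in top]

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    # random.choices accepts unnormalized weights.
    return rng.choices(top, weights=weights, k=1)[0]
```

With a low temperature such as 0.3 most of the probability mass sits on the few highest-scoring candidates, while the top-k cut-off guarantees that nothing outside the ten best candidates can be drawn.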
  3. **Settings → Sleep time**: 300 (= 5 minutes).
  4. **Settings → Visibility**: keep private.

+ The Space will then build, download v6 weights from the model repo, and start
+ serving on its private URL.
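
Note on the weight overlay this commit restores: conceptually, overlaying a fine-tune on a base model means replacing same-named parameters and leaving the rest alone. The sketch below illustrates that idea with plain dictionaries standing in for state dicts. It is not how `accelerate.load_checkpoint_in_model` is implemented, and the helper `overlay_state` and the parameter names are hypothetical.

```python
def overlay_state(base, overlay):
    """Return a copy of `base` with same-named entries replaced from `overlay`.

    Also reports which base entries stayed untouched and which overlay
    entries had no counterpart in the base and were skipped.
    """
    merged = dict(base)  # shallow copy so the caller's base mapping is not mutated
    unexpected = []
    for name, value in overlay.items():
        if name in merged:
            merged[name] = value
        else:
            unexpected.append(name)
    untouched = [name for name in base if name not in overlay]
    return merged, untouched, unexpected


# Hypothetical parameter names, chosen only for illustration.
base = {"encoder.weight": [0.0, 0.0], "decoder.weight": [1.0, 1.0]}
v6 = {"decoder.weight": [0.5, 0.7], "extra.bias": [0.1]}
merged, untouched, unexpected = overlay_state(base, v6)
```

Reporting the untouched and unexpected names is a cheap sanity check: a fine-tune that was meant to change the audio output path but leaves most parameters untouched, or that carries many unmatched names, probably does not line up with the base checkpoint.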