anonymous2222 committed on
Commit
d149804
·
verified ·
1 Parent(s): bc18516

Update README.md

Files changed (1)
  1. README.md +71 -3
README.md CHANGED
@@ -1,3 +1,71 @@
- ---
- license: cc-by-4.0
- ---
+ ---
+ license: cc-by-4.0
+ ---
+
+ ---
+ language:
+ - en
+ license: apache-2.0
+ tags:
+ - speech
+ - text-to-speech
+ - dialogue
+ - emotion
+ - empathy
+ - glm4voice
+ - lora
+ base_model: THUDM/glm-4-voice-9b
+ datasets:
+ - anonymous2222/Sympatheia-18k
+ ---
+
+ # Sympatheia — LoRA Checkpoint
+
+ This is the LoRA adapter checkpoint for **Sympatheia**, an emotionally adaptive speech-to-speech dialogue model submitted to NeurIPS 2026 (anonymous review).
+
+ [[Paper]](https://anonymous.4open.science/r/sympatheia-1181/sympatheia_neurips_2026.pdf) | [[Demo]](https://anonymous.4open.science/w/sympatheia-1181/) | [[Dataset]](https://huggingface.co/datasets/anonymous2222/Sympatheia-18k) | [[Code]](https://anonymous.4open.science/r/sympatheia-1181)
+
+ ---
+
+ ## Model description
+
+ Sympatheia fine-tunes [GLM-4-Voice-9B](https://huggingface.co/THUDM/glm-4-voice-9b) with LoRA to generate spoken responses conditioned on a continuous **valence–arousal (VA)** affect signal. The signal is injected into the system prompt as `User emotion (valence=v, arousal=a)`. The model is trained on [Sympatheia-18k](https://huggingface.co/datasets/anonymous2222/Sympatheia-18k), a synthetic corpus of 18k emotion-conditioned spoken dialogue pairs spanning 12 emotion anchors (happy, sad, angry, excited, frustrated, anxious, relaxed, surprised, disgusted, tired, content, neutral).
+
+ When an external VA estimate is available (from face, EEG, physiological signals, or text), it is supplied in the system prompt and guides the affective tone of the response. When no estimate is supplied, the model falls back to affect inferred from the speech input itself.
+
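The conditioning format described above can be sketched in a few lines of Python. This is a minimal sketch, not the project's actual prompt builder: the function name `build_system_prompt`, the base instruction text, and the two-decimal formatting are assumptions made for illustration. Only the `User emotion (valence=v, arousal=a)` pattern comes from this README.

```python
from typing import Optional, Tuple

# Hypothetical base instruction; the real system prompt lives in the project code.
BASE_PROMPT = "You are a helpful spoken dialogue assistant."


def build_system_prompt(va: Optional[Tuple[float, float]] = None) -> str:
    """Return a system prompt, appending the VA line only when an estimate exists."""
    if va is None:
        # No external estimate: the model is expected to infer affect from the speech input.
        return BASE_PROMPT
    valence, arousal = va
    # Two-decimal formatting is an assumption, not a documented requirement.
    return f"{BASE_PROMPT} User emotion (valence={valence:.2f}, arousal={arousal:.2f})"


if __name__ == "__main__":
    print(build_system_prompt((0.8, 0.6)))  # external estimate available
    print(build_system_prompt())            # fallback: no VA line
```

Keeping the VA line optional mirrors the two modes described above: an explicit estimate when a sensing module provides one, and the plain prompt otherwise.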
+ ## Intended use
+
+ - Research on emotionally adaptive voice assistants.
+ - Evaluation of continuous affect conditioning for speech-to-speech dialogue.
+ - Integration experiments with external emotion sensing modules.
+
+ Not intended for: covert emotion sensing, protected-attribute inference, clinical diagnosis, or any deployment without explicit user consent and opt-in affect sensing.
+
+ ## How to use
+
+ This checkpoint is a LoRA adapter for GLM-4-Voice-9B. You also need:
+ - The GLM-4-Voice-9B base model (THUDM/glm-4-voice-9b)
+ - The GLM-4-Voice decoder weights (flow.pt, hift.pt from THUDM/glm-4-voice-decoder)
+
+ See the project code at https://anonymous.4open.science/r/sympatheia-1181 for full inference and evaluation scripts.
+
+ # Download this checkpoint
+ huggingface-cli download anonymous2222/Sympatheia --local-dir /path/to/checkpoint
+
+ # Run inference (from the project src/ directory)
+ python inference_sympatheia.py --checkpoint /path/to/checkpoint
+
+ # Interactive Gradio demo
+ python gradio_demo.py --checkpoint /path/to/checkpoint --port 7860
+
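For scripted runs, the commands listed above can be assembled in Python. This is a minimal sketch: the helper names and the `src_dir` parameter are hypothetical, the commands and flags are copied from this README, and nothing is executed unless `run=True` is passed.

```python
import subprocess
from typing import List, Optional

REPO_ID = "anonymous2222/Sympatheia"  # repo id as given in this README


def download_cmd(local_dir: str) -> List[str]:
    """Command that downloads the LoRA checkpoint into local_dir."""
    return ["huggingface-cli", "download", REPO_ID, "--local-dir", local_dir]


def inference_cmd(checkpoint: str) -> List[str]:
    """Command that runs the project's inference script on the checkpoint."""
    return ["python", "inference_sympatheia.py", "--checkpoint", checkpoint]


def demo_cmd(checkpoint: str, port: int = 7860) -> List[str]:
    """Command that launches the interactive Gradio demo."""
    return ["python", "gradio_demo.py", "--checkpoint", checkpoint, "--port", str(port)]


def download_and_infer(checkpoint: str, src_dir: Optional[str] = None,
                       run: bool = False) -> List[List[str]]:
    """Build the download and inference commands; execute them only if run=True."""
    cmds = [download_cmd(checkpoint), inference_cmd(checkpoint)]
    if run:
        subprocess.run(cmds[0], check=True)
        # The README says the scripts are run from the project src/ directory.
        subprocess.run(cmds[1], check=True, cwd=src_dir)
    return cmds
```

Passing argument lists rather than a single shell string avoids quoting problems when the checkpoint path contains spaces.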
+ ## Training data
+
+ Sympatheia-18k (https://huggingface.co/datasets/anonymous2222/Sympatheia-18k): 18k synthetic emotion-conditioned spoken dialogue pairs (Emotional split: ~12k; Neutral split: ~6k). Text was generated with Qwen3-32B and speech with Qwen3-TTS.
+
+ ## Training procedure
+
+ LoRA fine-tuning of GLM-4-Voice-9B with DeepSpeed ZeRO Stage 3 and BF16 precision. See src/config.yaml in the project code for full hyperparameter details.
+
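As a rough illustration of the setup named above, a DeepSpeed configuration that enables ZeRO Stage 3 with BF16 might look like the following. Only the ZeRO stage and the precision come from this README. The micro-batch size is a placeholder assumption, and the actual settings are those in src/config.yaml.

```python
import json

# Sketch of a DeepSpeed config; only the ZeRO stage and bf16 flag come from the README.
ds_config = {
    "bf16": {"enabled": True},            # BF16 mixed precision
    "zero_optimization": {"stage": 3},    # ZeRO Stage 3: shard params, grads, optimizer state
    "train_micro_batch_size_per_gpu": 1,  # placeholder value, not from the README
}

if __name__ == "__main__":
    # DeepSpeed typically reads this as a JSON file passed to the launcher or trainer.
    print(json.dumps(ds_config, indent=2))
```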
+ ## License
+
+ Apache 2.0. The GLM-4-Voice-9B base model is subject to the GLM-4-Voice License (https://huggingface.co/THUDM/glm-4-voice-9b).