Update README.md
README.md
CHANGED
@@ -1,3 +1,71 @@
----
-license: cc-by-4.0
----
+---
+license: cc-by-4.0
+---
+
+---
+language:
+- en
+license: apache-2.0
+tags:
+- speech
+- text-to-speech
+- dialogue
+- emotion
+- empathy
+- glm4voice
+- lora
+base_model: THUDM/glm-4-voice-9b
+datasets:
+- anonymous2222/Sympatheia-18k
+---
+
+# Sympatheia — LoRA Checkpoint
+
+This is the LoRA adapter checkpoint for **Sympatheia**, an emotionally adaptive speech-to-speech dialogue model submitted to NeurIPS 2026 (anonymous review).
+
+[[Paper]](https://anonymous.4open.science/r/sympatheia-1181/sympatheia_neurips_2026.pdf) | [[Demo]](https://anonymous.4open.science/w/sympatheia-1181/) | [[Dataset]](https://huggingface.co/datasets/anonymous2222/Sympatheia-18k) | [[Code]](https://anonymous.4open.science/r/sympatheia-1181)
+
+---
+
+## Model description
+
+Sympatheia fine-tunes [GLM-4-Voice-9B](https://huggingface.co/THUDM/glm-4-voice-9b) with LoRA to generate spoken responses conditioned on a continuous **valence–arousal (VA)** affect signal injected into the system prompt as `User emotion (valence=v, arousal=a)`. It is trained on [Sympatheia-18k](https://huggingface.co/datasets/anonymous2222/Sympatheia-18k), a synthetic corpus of 18k emotion-conditioned spoken dialogue pairs spanning 12 emotion anchors (happy, sad, angry, excited, frustrated, anxious, relaxed, surprised, disgusted, tired, content, neutral).
+
+When an external VA estimate is available (from face, EEG, physiological signals, or text), it is supplied in the system prompt and guides the response affectively. When absent, the model falls back to affect inferred from the speech input itself.
+
+## Intended use
+
+- Research on emotionally adaptive voice assistants.
+- Evaluation of continuous affect conditioning for speech-to-speech dialogue.
+- Integration experiments with external emotion sensing modules.
+
+Not intended for: covert emotion sensing, protected-attribute inference, clinical diagnosis, or any deployment without explicit user consent and opt-in affect sensing.
+
+## How to use
+
+This checkpoint is a LoRA adapter for GLM-4-Voice-9B. You also need:
+- The GLM-4-Voice-9B base model (THUDM/glm-4-voice-9b)
+- The GLM-4-Voice decoder weights (flow.pt, hift.pt from THUDM/glm-4-voice-decoder)
+
+See the project code at https://anonymous.4open.science/r/sympatheia-1181 for full inference and evaluation scripts.
+
+# Download this checkpoint
+huggingface-cli download anonymous2222/Sympatheia --local-dir /path/to/checkpoint
+
+# Run inference (from the project src/ directory)
+python inference_sympatheia.py --checkpoint /path/to/checkpoint
+
+# Interactive Gradio demo
+python gradio_demo.py --checkpoint /path/to/checkpoint --port 7860
+
+## Training data
+
+Sympatheia-18k (https://huggingface.co/datasets/anonymous2222/Sympatheia-18k) — 18k synthetic emotion-conditioned spoken dialogue pairs (Emotional split: ~12k; Neutral split: ~6k). Generated with Qwen3-32B (text) and Qwen3-TTS (speech).
+
+## Training procedure
+
+LoRA fine-tuning of GLM-4-Voice-9B with DeepSpeed ZeRO Stage 3, BF16 precision. See src/config.yaml in the project code for full hyperparameter details.
+
+## License
+
+Apache 2.0. The GLM-4-Voice-9B base model is subject to the GLM-4-Voice License (https://huggingface.co/THUDM/glm-4-voice-9b).
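
The "Model description" section added by this change says the VA signal is injected into the system prompt as `User emotion (valence=v, arousal=a)`, and that the model falls back to speech-inferred affect when no external estimate is supplied. A minimal sketch of how a caller might build such a prompt is shown below. The function name `build_system_prompt`, the two-decimal number formatting, and the choice to append the tag on its own line are all assumptions made for illustration; the project's inference scripts may do this differently.

```python
from typing import Optional


def build_system_prompt(base_prompt: str,
                        valence: Optional[float] = None,
                        arousal: Optional[float] = None) -> str:
    """Append the VA tag to a system prompt when an external estimate exists.

    Hypothetical helper: only the tag template comes from the README.
    """
    # No external estimate: leave the prompt untouched so the model relies on
    # affect inferred from the speech input itself.
    if valence is None or arousal is None:
        return base_prompt
    # Two-decimal formatting is an assumption; the README gives only the template.
    tag = f"User emotion (valence={valence:.2f}, arousal={arousal:.2f})"
    # Placing the tag on its own line after the base prompt is also an assumption.
    return f"{base_prompt}\n{tag}" if base_prompt else tag
```

Calling `build_system_prompt("You are a voice assistant.", 0.8, 0.3)` would append the tag line, while calling it without VA values returns the base prompt unchanged.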