|
|
--- |
|
|
library_name: transformers |
|
|
language: |
|
|
- en |
|
|
base_model: |
|
|
- Qwen/Qwen2-Audio-7B-Instruct |
|
|
pipeline_tag: audio-text-to-text |
|
|
tags: |
|
|
- lora |
|
|
license: cc-by-nc-4.0 |
|
|
--- |
|
|
|
|
|
# StresSLM |
|
|
|
|
|
**StresSLM** is an audio-text-to-text model fine-tuned with LoRA adapters on top of the [`Qwen/Qwen2-Audio-7B-Instruct`](https://huggingface.co/Qwen/Qwen2-Audio-7B-Instruct) base model. It is designed to tackle **Sentence Stress Detection (SSD)** and **Sentence Stress Reasoning (SSR)** tasks on the StressTest benchmark. |
|
|
StresSLM predicts **stress patterns** and **reasoning** based on spoken audio. |
|
|
For more information, see our paper and code: |
|
|
|
|
|
๐ป [Code](https://github.com/slp-rl/StressTest) | ๐ค [StressTest Dataset](https://huggingface.co/datasets/slprl/StressTest) | ๐ค [Stress-17k Dataset](https://huggingface.co/datasets/slprl/Stress-17K-raw) |
|
|
|
|
|
๐ [StressTest Paper](https://arxiv.org/abs/2505.22765) | ๐ [Project Page](https://pages.cs.huji.ac.il/adiyoss-lab/stresstest/) |
|
|
|
|
|
--- |
|
|
|
|
|
## Usage |
|
|
|
|
|
This model can be loaded using the HuggingFace Transformers library: |
|
|
|
|
|
```python |
|
|
from transformers import AutoProcessor, Qwen2AudioForConditionalGeneration |
|
|
from peft import PeftModel, PeftConfig |
|
|
|
|
|
# Load processor |
|
|
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-Audio-7B-Instruct") |
|
|
|
|
|
# Load LoRA config and base model |
|
|
peft_config = PeftConfig.from_pretrained("slprl/StresSLM") |
|
|
base_model = Qwen2AudioForConditionalGeneration.from_pretrained(peft_config.base_model_name_or_path) |
|
|
|
|
|
# Load LoRA adapter |
|
|
model = PeftModel.from_pretrained(base_model, "slprl/StresSLM") |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## Tasks |
|
|
|
|
|
* **Sentence Stress Detection (SSD)**: Identify stressed words in an utterance. |
|
|
* **Sentence Stress Reasoning (SSR)**: Reason about the speakerโs intention using stress patterns. |
|
|
|
|
|
For evaluation scripts and benchmarks, refer to the [StressTest GitHub repository](https://github.com/slp-rl/StressTest). |
|
|
|
|
|
--- |
|
|
|
|
|
## ๐ Citation |
|
|
|
|
|
If you use this model, please cite: |
|
|
|
|
|
```bibtex |
|
|
@misc{yosha2025stresstest, |
|
|
title={StressTest: Can YOUR Speech LM Handle the Stress?}, |
|
|
author={Iddo Yosha and Gallil Maimon and Yossi Adi}, |
|
|
year={2025}, |
|
|
eprint={2505.22765}, |
|
|
archivePrefix={arXiv}, |
|
|
primaryClass={cs.CL}, |
|
|
url={https://arxiv.org/abs/2505.22765}, |
|
|
} |
|
|
``` |