JiachengPang committed on
Commit b63adb3 · verified · 1 Parent(s): fc5ae18

License: link to in-repo file

Files changed (1)
  1. README.md +20 -3
README.md CHANGED
@@ -1,7 +1,7 @@
 ---
 license: other
 license_name: usc-research
-license_link: https://github.com/ihp-lab/VoxParadox/blob/main/LICENSE
+license_link: LICENSE
 language:
 - en
 library_name: transformers
@@ -19,6 +19,14 @@ pipeline_tag: audio-text-to-text
 
 # Qwen2-Audio + PCLM + DPO
 
+[![ICML 2026](https://img.shields.io/badge/ICML-2026-1d4ed8.svg)](https://icml.cc/Conferences/2026)
+[![Paper](https://img.shields.io/badge/Paper-arXiv-AD1C18.svg)](https://arxiv.org/abs/2605.27772)
+[![Project Page](https://img.shields.io/badge/Project-Page-0EA5E9.svg)](https://voxparadox.github.io/)
+[![Code](https://img.shields.io/badge/GitHub-ihp--lab%2FVoxParadox-181717.svg?logo=github)](https://github.com/ihp-lab/VoxParadox)
+[![Dataset](https://img.shields.io/badge/🤗%20Dataset-IHP--Lab%2FVoxParadox-FFD21E.svg)](https://huggingface.co/datasets/IHP-Lab/VoxParadox)
+[![AF3 + PCLM + DPO](https://img.shields.io/badge/🤗%20Sibling%20model-AF3+PCLM+DPO-FFD21E.svg)](https://huggingface.co/IHP-Lab/AF3_PCLM_DPO)
+[![License](https://img.shields.io/badge/License-USC%20Research-228B22.svg)](LICENSE)
+
 PCLM- and DPO-finetuned [Qwen2-Audio-7B-Instruct](https://huggingface.co/Qwen/Qwen2-Audio-7B-Instruct) from
 *Do Audio LLMs Listen or Read? Analyzing and Mitigating Paralinguistic Failures with VoxParadox*
 (ICML 2026).
@@ -59,6 +67,16 @@ python eval.py --predictions runs/eval/qwen2audio_pclm_dpo/predictions.jsonl
 The loader auto-detects `use_pclm=True` from `config.json` and activates PCLM with
 `expose_layers=[5, 15, 25, 30]` over the audio encoder.
 
+## Project resources
+
+| Resource | Link |
+|---|---|
+| Paper (arXiv) | <https://arxiv.org/abs/2605.27772> |
+| Project page | <https://voxparadox.github.io/> |
+| Code | <https://github.com/ihp-lab/VoxParadox> |
+| Benchmark | <https://huggingface.co/datasets/IHP-Lab/VoxParadox> |
+| Sibling model (AF3) | <https://huggingface.co/IHP-Lab/AF3_PCLM_DPO> |
+
 ## Citation
 
 ```bibtex
@@ -72,8 +90,7 @@ The loader auto-detects `use_pclm=True` from `config.json` and activates PCLM wi
 
 ## License
 
-USC Research License (research / non-profit only). See the
-[license file](https://github.com/ihp-lab/VoxParadox/blob/main/LICENSE).
+USC Research License (research / non-profit only). See [`LICENSE`](LICENSE).
 
 The base model (`Qwen/Qwen2-Audio-7B-Instruct`) carries its own Tongyi Qianwen license terms,
 which continue to apply to the inherited weights.
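
The unchanged README text in this diff says the loader auto-detects `use_pclm=True` from `config.json` and activates PCLM with `expose_layers=[5, 15, 25, 30]`. A minimal sketch of that kind of detection follows. It is not the repository's loader: the function name `read_pclm_settings` is hypothetical, and treating `expose_layers` as an optional `config.json` key with the README's values as a fallback is an assumption.

```python
import json
from pathlib import Path

# Layer indices quoted in the README; used here only as a fallback (assumption).
DEFAULT_EXPOSE_LAYERS = [5, 15, 25, 30]


def read_pclm_settings(model_dir):
    """Hypothetical helper: return (use_pclm, expose_layers) from <model_dir>/config.json."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))

    # A missing flag is treated as "PCLM disabled".
    use_pclm = bool(config.get("use_pclm", False))
    if not use_pclm:
        return False, []

    # `expose_layers` as a config key is an assumption; fall back to the README's list.
    expose_layers = list(config.get("expose_layers", DEFAULT_EXPOSE_LAYERS))
    return True, expose_layers
```

Calling `read_pclm_settings` on a checkpoint directory returns the flag and the layer list, which a loader could then use to decide whether to wrap the audio encoder.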