JiachengPang committed on
Commit 37b7e1f · verified · 1 Parent(s): 25141b5

License file

Files changed (1)
  1. README.md +20 -3
README.md CHANGED
@@ -1,7 +1,7 @@
1
  ---
2
  license: other
3
  license_name: usc-research
4
- license_link: https://github.com/ihp-lab/VoxParadox/blob/main/LICENSE
5
  language:
6
  - en
7
  base_model: nvidia/audio-flamingo-3
@@ -18,6 +18,14 @@ pipeline_tag: audio-text-to-text
18
 
19
  # Audio Flamingo 3 + PCLM + DPO
20
 
21
  PCLM- and DPO-finetuned [Audio Flamingo 3](https://huggingface.co/nvidia/audio-flamingo-3) from
22
  *Do Audio LLMs Listen or Read? Analyzing and Mitigating Paralinguistic Failures with VoxParadox*
23
  (ICML 2026).
@@ -73,6 +81,16 @@ python eval.py --predictions runs/eval/af3_pclm_dpo/predictions.jsonl
73
  PCLM activation is read from this checkpoint's `config.json`
74
  (`expose_layers=[5, 15, 25, 30]`, `use_sound_pclm=true`).
75
 
76
  ## Citation
77
 
78
  ```bibtex
@@ -86,8 +104,7 @@ PCLM activation is read from this checkpoint's `config.json`
86
 
87
  ## License
88
 
89
- USC Research License (research / non-profit only). See the
90
- [license file](https://github.com/ihp-lab/VoxParadox/blob/main/LICENSE).
91
 
92
  The base model (`nvidia/audio-flamingo-3`) carries the NVIDIA non-commercial license
93
  terms, which continue to apply to the inherited weights.
 
1
  ---
2
  license: other
3
  license_name: usc-research
4
+ license_link: LICENSE
5
  language:
6
  - en
7
  base_model: nvidia/audio-flamingo-3
 
18
 
19
  # Audio Flamingo 3 + PCLM + DPO
20
 
21
+ [![ICML 2026](https://img.shields.io/badge/ICML-2026-1d4ed8.svg)](https://icml.cc/Conferences/2026)
22
+ [![Paper](https://img.shields.io/badge/Paper-arXiv-AD1C18.svg)](https://arxiv.org/abs/2605.27772)
23
+ [![Project Page](https://img.shields.io/badge/Project-Page-0EA5E9.svg)](https://voxparadox.github.io/)
24
+ [![Code](https://img.shields.io/badge/GitHub-ihp--lab%2FVoxParadox-181717.svg?logo=github)](https://github.com/ihp-lab/VoxParadox)
25
+ [![Dataset](https://img.shields.io/badge/🤗%20Dataset-IHP--Lab%2FVoxParadox-FFD21E.svg)](https://huggingface.co/datasets/IHP-Lab/VoxParadox)
26
+ [![Qwen2-Audio + PCLM + DPO](https://img.shields.io/badge/🤗%20Sibling%20model-Qwen2--Audio+PCLM+DPO-FFD21E.svg)](https://huggingface.co/IHP-Lab/Qwen2-Audio_PCLM_DPO)
27
+ [![License](https://img.shields.io/badge/License-USC%20Research-228B22.svg)](LICENSE)
28
+
29
  PCLM- and DPO-finetuned [Audio Flamingo 3](https://huggingface.co/nvidia/audio-flamingo-3) from
30
  *Do Audio LLMs Listen or Read? Analyzing and Mitigating Paralinguistic Failures with VoxParadox*
31
  (ICML 2026).
 
81
  PCLM activation is read from this checkpoint's `config.json`
82
  (`expose_layers=[5, 15, 25, 30]`, `use_sound_pclm=true`).
83
 
84
+ ## Project resources
85
+
86
+ | Resource | Link |
87
+ |---|---|
88
+ | Paper (arXiv) | <https://arxiv.org/abs/2605.27772> |
89
+ | Project page | <https://voxparadox.github.io/> |
90
+ | Code | <https://github.com/ihp-lab/VoxParadox> |
91
+ | Benchmark | <https://huggingface.co/datasets/IHP-Lab/VoxParadox> |
92
+ | Sibling model (Qwen2-Audio) | <https://huggingface.co/IHP-Lab/Qwen2-Audio_PCLM_DPO> |
93
+
94
  ## Citation
95
 
96
  ```bibtex
 
104
 
105
  ## License
106
 
107
+ USC Research License (research / non-profit only). See [`LICENSE`](LICENSE).
108
 
109
  The base model (`nvidia/audio-flamingo-3`) carries the NVIDIA non-commercial license
110
  terms, which continue to apply to the inherited weights.
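
The README context in this diff notes that PCLM activation is read from the checkpoint's `config.json` (`expose_layers=[5, 15, 25, 30]`, `use_sound_pclm=true`). As a minimal sketch of how one might inspect those two settings in a locally downloaded checkpoint, the helper below parses the file with the standard library. The function name `read_pclm_settings` is hypothetical, and the assumption that both keys sit at the top level of the JSON object is not confirmed by the source; the actual checkpoint may nest them differently.

```python
import json
from pathlib import Path


def read_pclm_settings(config_path):
    """Return (use_sound_pclm, expose_layers) from a checkpoint config.json.

    Hypothetical helper: assumes both keys are top-level entries in the
    JSON object. Missing keys fall back to False and an empty list.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    enabled = bool(config.get("use_sound_pclm", False))
    layers = list(config.get("expose_layers", []))
    return enabled, layers
```

Calling `read_pclm_settings("path/to/checkpoint/config.json")` on a local copy would show whether the flag is set and which layers are listed; the path here is a placeholder, not a location named in the repository.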