Commit 40da49a by shuaijiang (verified) · 1 Parent(s): 1b94300

Update README.md

Files changed (1)
  1. README.md +2 -3
README.md CHANGED
@@ -7,6 +7,8 @@ base_model:
 ---
 
 # Ke-Omni-R: Achieving Advanced Audio Reasoning with a Concise 50-Words Think Process
+If you wish to train or perform inference with the model, please visit the GitHub repository: [https://github.com/shuaijiang/Ke-Omni-R/](https://github.com/shuaijiang/Ke-Omni-R/).
+If you find this model helpful, please like this model and star our GitHub.
 
 Ke-Omni-R is an advanced audio reasoning model built upon [Qwen2.5-Omni-7B](https://github.com/QwenLM/Qwen2.5-Omni). With only 10k post-training samples, Ke-Omni-R has achieved state-of-the-art performance on the MMAU *Test-mini* and *Test* benchmarks. Key insights from its development include:
 
@@ -15,9 +17,6 @@ Ke-Omni-R is an advanced audio reasoning model built upon [Qwen2.5-Omni-7B](http
 - **KL Divergence**: Slight improvements were observed during GRPO training by leveraging KL divergence.
 - **Domain Ratio vs. Data Volume**: Domain diversity outweighs data volume. We utilized only 10k samples, with 5k randomly selected from AVQA and another 5k from MusicBench.
 
-If you wish to train or perform inference with the model, please visit the GitHub repository: [https://github.com/shuaijiang/Ke-Omni-R/](https://github.com/shuaijiang/Ke-Omni-R/).
-If you find this model helpful, please like this model and star our GitHub.
-
 ## Performance: Accuracies (%) on MMAU Test-mini and Test benchmark
 | Model | Method | Sound (Test-mini) | Sound (Test) | Music (Test-mini) | Music (Test) | Speech (Test-mini) | Speech (Test) | Average (Test-mini) | Average (Test) |
 |---------------------------------------|-----------------------|-----------|-------|-----------|-------|-----------|------|------------|-------|