kiiic committed on
Commit ea7c621 (verified) · 1 Parent(s): 2f1fcdb

Upload folder using huggingface_hub

Files changed (3)
  1. .gitattributes +1 -0
  2. README.md +1 -1
  3. assets/arc.png +3 -0
.gitattributes CHANGED
@@ -38,3 +38,4 @@ assets/moss-audio-2.png filter=lfs diff=lfs merge=lfs -text
  assets/moss-audio-image.png filter=lfs diff=lfs merge=lfs -text
  assets/moss-audio-logo.png filter=lfs diff=lfs merge=lfs -text
  assets/speech_caption_radar.png filter=lfs diff=lfs merge=lfs -text
+ assets/arc.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -85,7 +85,7 @@ Understanding audio requires more than simply transcribing words — it demands
  ## Model Architecture
  
  <p align="center">
- <img src="./assets/moss-audio-architecture.svg" width="95%" />
+ <img src="./assets/arc.png" width="95%" />
  </p>
  
  MOSS-Audio follows a modular design comprising three components: an audio encoder, a modality adapter, and a large language model. Raw audio is first encoded by **MOSS-Audio-Encoder** into continuous temporal representations at **12.5 Hz**, which are then projected into the language model's embedding space through the adapter and finally consumed by the LLM for auto-regressive text generation.
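
To make the 12.5 Hz figure in the README paragraph above concrete, here is a minimal arithmetic sketch of how many encoder frames a clip of a given duration would yield. The function names are hypothetical, and rounding up for partial frames is an assumption; the README does not say how the encoder handles clip boundaries or padding.

```python
import math

# Encoder output rate stated in the README paragraph.
FRAME_RATE_HZ = 12.5


def frame_period_ms(frame_rate_hz: float = FRAME_RATE_HZ) -> float:
    """Duration of audio covered by one encoder frame, in milliseconds."""
    return 1000.0 / frame_rate_hz


def num_encoder_frames(duration_s: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Approximate frame count for a clip.

    Rounding up is an assumption made for illustration only; the real
    encoder may truncate or pad differently.
    """
    return math.ceil(duration_s * frame_rate_hz)


if __name__ == "__main__":
    for seconds in (0.5, 10.0, 60.0):
        print(seconds, "s ->", num_encoder_frames(seconds), "frames")
    print("one frame spans", frame_period_ms(), "ms")
```

Under these assumptions each frame spans 80 ms, so a 10-second clip reaches the adapter and LLM as roughly 125 continuous representations.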
assets/arc.png ADDED

Git LFS Details

  • SHA256: 54ae0abb8514ccc886823d5564a7584e73a417ae5e208f9679895d5b1a198a0e
  • Pointer size: 131 Bytes
  • Size of remote file: 978 kB
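
The pointer size listed above can be sanity-checked against the Git LFS pointer format, which stores a `version` line, an `oid sha256:` line, and a `size` line in place of the binary. The sketch below rebuilds such a pointer from the SHA256 shown in this commit. The exact byte count of `assets/arc.png` is not given on this page (only "978 kB"), so `ASSUMED_SIZE_BYTES` is a hypothetical placeholder, and `lfs_pointer` is an illustrative helper, not part of any library.

```python
def lfs_pointer(oid_sha256: str, size_bytes: int) -> str:
    """Build the text of a Git LFS pointer file (spec v1 layout)."""
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid_sha256}\n"
        f"size {size_bytes}\n"
    )


# SHA256 copied from the "Git LFS Details" section of this commit.
OID = "54ae0abb8514ccc886823d5564a7584e73a417ae5e208f9679895d5b1a198a0e"

# Hypothetical value: the page reports only "978 kB", not the exact size.
ASSUMED_SIZE_BYTES = 978_000

pointer = lfs_pointer(OID, ASSUMED_SIZE_BYTES)
pointer_len = len(pointer.encode("utf-8"))

if __name__ == "__main__":
    print(pointer, end="")
    print("pointer length in bytes:", pointer_len)
```

Any six-digit byte count produces a 131-byte pointer under this layout, which is consistent with the reported pointer size and with a remote file just under 1 MB.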