darianb committed on
Commit cda5bd0 · verified · 1 Parent(s): 15730ea

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -38,7 +38,7 @@ Key improvements include a custom, LFM based audio detokenizer, `llama.cpp` comp
 
  LFM2.5-Audio is an end-to-end multimodal speech and text language model, and as such does not require separate ASR and TTS components.
  Designed with low latency and real time conversation in mind, at only 1.5 billion parameters LFM2.5-Audio enables seamless conversational interaction, achieving capabilities on par with much larger models.
- Our model consists of a pretrained LFM2.5 model as its multimodal backbone, along with a FastConformer based audio encoder to handle continuous audio inputs, and an RQ-transformer generating discrete tokens coupled with a lighweight audio detokenizer for audio output.
+ Our model consists of a pretrained LFM2.5 model as its multimodal backbone, along with a FastConformer based audio encoder to handle continuous audio inputs, and an RQ-transformer generating discrete tokens coupled with a lightweight audio detokenizer for audio output.
 
  LFM2.5-Audio supports two distinct generation routines, each suitable for a set of tasks.
  Interleaved generation enables real-time speech-to-speech conversational chatbot capabilities, where audio generation latency is key.