Commit 3c5ce67 by chenjoya (verified) · 1 parent: 108f9a1

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -206,7 +206,7 @@ for t in range(31):
 
 ## Limitations
 
-- This model is starting from Qwen2-VL-7B-Base, so it may have limitations mentioned in https://huggingface.co/Qwen/Qwen2-VL-7B.
+- This model is finetuned on LiveCC-7B-Base, which is starting from Qwen2-VL-7B-Base, so it may have limitations mentioned in https://huggingface.co/Qwen/Qwen2-VL-7B.
 - This model is trained only with streaming frame-words paradigm, thus it may be only capable for real-time video commentary.
 - The training ASR data is from YouTube CC, which has well-known low quality, so its formatting is not good (e.g. cannot output punctuation).
 