Update README.md
README.md
CHANGED
|
@@ -207,9 +207,9 @@ for t in range(31):
|
|
| 207 |
## Limitations
|
| 208 |
|
| 209 |
- This model is fine-tuned from LiveCC-7B-Base, which in turn starts from Qwen2-VL-7B-Base, so it may share the limitations described at https://huggingface.co/Qwen/Qwen2-VL-7B.
|
| 210 |
-
-
|
| 211 |
-
-
|
| 212 |
-
|
| 213 |
These limitations serve as ongoing directions for model optimization and improvement, and we are committed to continually enhancing the model's performance and scope of application.
|
| 214 |
|
| 215 |
## Citation
|
|
|
|
| 207 |
## Limitations
|
| 208 |
|
| 209 |
- This model is fine-tuned from LiveCC-7B-Base, which in turn starts from Qwen2-VL-7B-Base, so it may share the limitations described at https://huggingface.co/Qwen/Qwen2-VL-7B.
|
| 210 |
+
- During real-time video commentary, the output may collapse, e.g., into a repeating pattern. If this happens, try adjusting `repetition_penalty`, `streaming_eos_base_threshold`, and `streaming_eos_threshold_step`.
|
| 211 |
+
- This model has a context window of only 32768 tokens. Using more visual tokens per frame (e.g. 768 * 28 * 28) gives the best performance, but shortens how long the model can keep commenting before the context window is full.
|
| 212 |
+
|
| 213 |
These limitations serve as ongoing directions for model optimization and improvement, and we are committed to continually enhancing the model's performance and scope of application.
|
| 214 |
|
| 215 |
## Citation
|
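The context-window limitation added in this change can be made concrete with some rough arithmetic. The sketch below is not part of the LiveCC code. It assumes that `768 * 28 * 28` is a per-frame pixel budget in which each 28 x 28 block becomes roughly one visual token (as in the Qwen2-VL processor), so a frame costs about 768 tokens. The helper names, the 2 frames-per-second rate, and the reserved text budget are hypothetical values chosen for illustration, and special tokens are ignored.

```python
CONTEXT_WINDOW = 32768  # context window stated in the README


def max_frames(tokens_per_frame: int,
               reserved_text_tokens: int = 0,
               context_window: int = CONTEXT_WINDOW) -> int:
    """Upper bound on how many frames fit in the context window.

    reserved_text_tokens is a hypothetical budget set aside for the prompt
    and the generated commentary; the real value depends on the usage.
    """
    if tokens_per_frame <= 0:
        raise ValueError("tokens_per_frame must be positive")
    available = max(context_window - reserved_text_tokens, 0)
    return available // tokens_per_frame


def max_seconds(tokens_per_frame: int,
                fps: float,
                reserved_text_tokens: int = 0) -> float:
    """Approximate working duration in seconds at an assumed frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return max_frames(tokens_per_frame, reserved_text_tokens) / fps


if __name__ == "__main__":
    # Assumed cost: about 768 visual tokens per frame, sampled at 2 frames per second.
    print(max_frames(768))                             # frames with no text budget
    print(max_frames(768, reserved_text_tokens=2048))  # frames with 2048 tokens reserved
    print(max_seconds(768, fps=2.0))                   # seconds of video at 2 fps
```

Under these assumptions only a few dozen frames fit before the window is exhausted, which is why a larger per-frame budget trades working duration for quality.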