CCCCyx committed on
Commit 1f81311 · verified · 1 Parent(s): df57422

Update README.md

Files changed (1): README.md +5 -7
README.md CHANGED
@@ -290,16 +290,14 @@ texts = [item["text"] for item in result["results"]]
 
 ## 🚧 Limitations and Future Work
 
-MOSS-VL-Base-0408 is a pretrained base checkpoint intended primarily as a foundation model, and several release items are still being finalized:
+MOSS-VL-Base-0408 is a pretrained base checkpoint, and we are actively improving several core capabilities for future iterations:
 
-- realtime usage is not documented here
-- benchmark, metric, and training details are still blank
-- some sections are intentionally placeholders until release information is finalized
-- batch calls currently require shared `generate_kwargs` and shared `media_kwargs` within one call
-- batch streaming and batch cancel / stop protocol are not part of `offline_batch_generate(...)`
+- 📄 **Stronger OCR, Especially for Long Documents** — We plan to further improve text recognition, document parsing, and long-document understanding, with a particular focus on maintaining accuracy and consistency over lengthy structured inputs.
+- 🎬 **Expanded Long-Video Understanding** — We aim to extend the model's capabilities in long-form video comprehension, including stronger temporal reasoning, better event tracking across extended durations, and more robust long-context video understanding.
+- 🌍 **Richer World Knowledge** — We will continue to enhance the model's general world knowledge so it can provide better-grounded multimodal understanding and stronger performance on knowledge-intensive visual-language tasks.
 
 > [!NOTE]
-> We expect future releases to expand public evaluation coverage and provide stronger downstream aligned variants built on top of this base checkpoint.
+> We expect future releases to continue strengthening the base model itself while also enabling stronger downstream aligned variants built on top of it.
 
 ## 📜 Citation
 ```bibtex
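The bullet removed by this commit states that one call to `offline_batch_generate(...)` carries a single `generate_kwargs` and a single `media_kwargs` shared by every input in the batch. A minimal sketch of what that constraint implies for callers follows; the function below is a stub standing in for the real API, and its exact signature and return shape are assumptions, not the documented interface:

```python
# Hypothetical stub illustrating the shared-kwargs constraint from the removed
# bullet: one batch call applies ONE generate_kwargs and ONE media_kwargs to
# every input. Per-item settings would require separate calls.

def offline_batch_generate(prompts, generate_kwargs=None, media_kwargs=None):
    """Stub for illustration only; the real signature is an assumption."""
    generate_kwargs = generate_kwargs or {}
    media_kwargs = media_kwargs or {}
    # Every prompt is decoded with the same shared settings.
    temp = generate_kwargs.get("temperature")
    return [{"text": f"<output for {p!r} @ temp={temp}>"} for p in prompts]

# One batch, one set of shared settings:
results = offline_batch_generate(
    ["Describe image 1", "Describe image 2"],
    generate_kwargs={"temperature": 0.7, "max_new_tokens": 256},
    media_kwargs={"max_pixels": 1024 * 1024},
)

# Mixing per-item settings inside one call is not supported, so callers who
# need different sampling per input must issue separate calls:
low_temp = offline_batch_generate(["Describe image 1"],
                                  generate_kwargs={"temperature": 0.2})
high_temp = offline_batch_generate(["Describe image 2"],
                                   generate_kwargs={"temperature": 0.9})
```

Batch streaming and a cancel/stop protocol were likewise noted as out of scope for `offline_batch_generate(...)` in the removed text.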