Welcome to OpenGVLab! We are a research group from Shanghai AI Lab focused on Vision-Centric AI research.

- [InternImage](https://github.com/OpenGVLab/InternImage): a large-scale vision foundation model with deformable convolutions.
- [InternVideo](https://github.com/OpenGVLab/InternVideo): large-scale video foundation models for multimodal understanding.
- [VideoChat](https://github.com/OpenGVLab/Ask-Anything): an end-to-end chat assistant for video comprehension.
- [All-Seeing-V1](https://github.com/OpenGVLab/all-seeing): towards panoptic visual recognition and understanding of the open world.
- [All-Seeing-V2](https://github.com/OpenGVLab/all-seeing): towards general relation comprehension of the open world.

# Datasets

- [ShareGPT4o](https://sharegpt4o.github.io/): a groundbreaking large-scale resource that we plan to open-source, comprising 200K meticulously annotated images, 10K videos with highly descriptive captions, and 10K audio files with detailed descriptions.
- [InternVid](https://github.com/OpenGVLab/InternVideo/tree/main/Data/InternVid): a large-scale video-text dataset for multimodal understanding and generation.

# Benchmarks

- [MVBench](https://github.com/OpenGVLab/Ask-Anything/tree/main/video_chat2): a comprehensive benchmark for multimodal video understanding.