InternVL-U

non-profit

https://github.com/OpenGVLab/InternVL-U

Recent Activity

Rayment updated a dataset about 1 month ago

InternVL-U/ScaleEdit-12M

cuierfei authored a paper about 1 month ago

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Rayment authored a paper about 1 month ago

MetaCaptioner: Towards Generalist Visual Captioning with Open-source Suites

updated a dataset about 1 month ago

InternVL-U/ScaleEdit-12M

Viewer • Updated Apr 7 • 12.4M • 11.5k • 17

authored a paper about 1 month ago

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Paper • 2603.25040 • Published Mar 26 • 132

authored 3 papers about 1 month ago

MetaCaptioner: Towards Generalist Visual Captioning with Open-source Suites

Paper • 2510.12126 • Published Oct 14, 2025 • 2

ScaleEdit-12M: Scaling Open-Source Image Editing Data Generation via Multi-Agent Framework

Paper • 2603.20644 • Published Mar 21 • 5

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Paper • 2603.25040 • Published Mar 26 • 132

published a dataset about 2 months ago

InternVL-U/ScaleEdit-12M

Viewer • Updated Apr 7 • 12.4M • 11.5k • 17

updated a model about 2 months ago

InternVL-U/InternVL-U

Any-to-Any • Updated Mar 15 • 95 • 56

in InternVL-U/InternVL-U about 2 months ago

Add pipeline tag, paper link, and sample usage

#1 opened about 2 months ago by

authored 2 papers about 2 months ago

GRADE: Benchmarking Discipline-Informed Reasoning in Image Editing

Paper • 2603.12264 • Published Mar 12 • 15

Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation

Paper • 2603.12247 • Published Mar 12 • 24

authored a paper about 2 months ago

InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

Paper • 2603.09877 • Published Mar 10 • 48

authored a paper 7 months ago

Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning

Paper • 2510.11027 • Published Oct 13, 2025 • 23

authored 5 papers 7 months ago

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Paper • 2406.08418 • Published Jun 12, 2024 • 33

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25, 2025 • 220

GenExam: A Multidisciplinary Text-to-Image Exam

Paper • 2509.14232 • Published Sep 17, 2025 • 21

Sequential Diffusion Language Models

Paper • 2509.24007 • Published Sep 28, 2025 • 47

NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints

Paper • 2510.08565 • Published Oct 9, 2025 • 21