LMMs-Lab-Encoder

community

EvolvingLMMs-Lab

Recent Activity

xiangan updated a model 2 days ago

lmms-lab-encoder/onevision-encoder-large-lang-tf57

xiangan updated a model 2 days ago

lmms-lab-encoder/onevision-encoder-large-tf57

xiangan updated a collection 3 days ago

onevision-encoder

updated 2 models 2 days ago

lmms-lab-encoder/onevision-encoder-large-lang-tf57

Updated 2 days ago • 32

lmms-lab-encoder/onevision-encoder-large-tf57

0.3B • Updated 2 days ago • 33

updated a collection 3 days ago

onevision-encoder

4 items • Updated 3 days ago • 6

published 2 models 3 days ago

lmms-lab-encoder/onevision-encoder-large-lang-tf57

Updated 2 days ago • 32

lmms-lab-encoder/onevision-encoder-large-tf57

0.3B • Updated 2 days ago • 33

authored a paper 5 days ago

Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling

Paper • 2604.28185 • Published 10 days ago • 87

updated a dataset 2 months ago

lmms-lab-encoder/Molmo2-VideoPointEval

Updated Mar 4 • 9

published a dataset 2 months ago

lmms-lab-encoder/Molmo2-VideoPointEval

Updated Mar 4 • 9

submitted a paper to Daily Papers 2 months ago

UniG2U-Bench: Do Unified Models Advance Multimodal Understanding?

Paper • 2603.03241 • Published Mar 3 • 87

updated a dataset 3 months ago

lmms-lab-encoder/60s_tem_grounding_ov2_codec_100k

Updated Feb 20 • 480

published 2 datasets 3 months ago

lmms-lab-encoder/60s_tem_grounding_ov2_codec_100k

Updated Feb 20 • 480

lmms-lab-encoder/60s_20260215_154644_ov2_codec_1w

Updated Feb 20 • 1

authored a paper 3 months ago

OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Paper • 2602.08683 • Published Feb 9 • 52

submitted a paper to Daily Papers 3 months ago

OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Paper • 2602.08683 • Published Feb 9 • 52

updated a dataset 3 months ago

lmms-lab-encoder/wd_temporal_grounding_frames_max_64_max_448x448_pixels_with_fps

Updated Feb 14 • 374

published a dataset 3 months ago

lmms-lab-encoder/wd_temporal_grounding_frames_max_64_max_448x448_pixels_with_fps

Updated Feb 14 • 374

authored 2 papers 3 months ago

ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder

Paper • 2510.18795 • Published Oct 21, 2025 • 11

DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset

Paper • 2601.10305 • Published Jan 15 • 36