AgMMU

non-profit

https://agmmu.github.io/

AI & ML interests

Multimodal Model Evaluation

Recent Activity

yunzeman authored a paper about 1 month ago

Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding

yunzeman authored a paper about 1 month ago

RandAR: Decoder-only Autoregressive Visual Generation in Random Orders

yunzeman authored a paper about 1 month ago

PaintScene4D: Consistent 4D Scene Generation from Text Prompts

authored 10 papers about 1 month ago

Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding

Paper • 2409.03757 • Published Sep 5, 2024 • 3

RandAR: Decoder-only Autoregressive Visual Generation in Random Orders

Paper • 2412.01827 • Published Dec 2, 2024

PaintScene4D: Consistent 4D Scene Generation from Text Prompts

Paper • 2412.04471 • Published Dec 5, 2024

AgMMU: A Comprehensive Agricultural Multimodal Understanding and Reasoning Benchmark

Paper • 2504.10568 • Published Apr 14, 2025

Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought

Paper • 2505.23766 • Published May 29, 2025

PPTArena: A Benchmark for Agentic PowerPoint Editing

Paper • 2512.03042 • Published Dec 2, 2025 • 1

LocateAnything3D: Vision-Language 3D Detection with Chain-of-Sight

Paper • 2511.20648 • Published Nov 25, 2025 • 1

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning

Paper • 2601.09708 • Published Jan 14, 2026 • 56

OSGym: Scalable Distributed Data Engine for Generalizable Computer Agents

Paper • 2511.11672 • Published Nov 11, 2025

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding

Paper • 2605.27365 • Published May 26, 2026 • 145

ipi8 updated a dataset 12 months ago

AgMMU/AgMMU_v1

Viewer • Updated Jul 29, 2025 • 50.2k • 501 • 2

updated a dataset about 1 year ago

AgMMU/AgMMU_v1

Viewer • Updated Jul 29, 2025 • 50.2k • 501 • 2

authored 2 papers about 1 year ago

MR. Video: "MapReduce" is the Principle for Long Video Understanding

Paper • 2504.16082 • Published Apr 22, 2025 • 5

Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception

Paper • 2504.11457 • Published Apr 15, 2025 • 1

in AgMMU/AgMMU_v1 over 1 year ago

Create README.md

#3 opened over 1 year ago by ipi8

published a dataset over 1 year ago

AgMMU/AgMMU_v1

Viewer • Updated Jul 29, 2025 • 50.2k • 501 • 2

updated a Space over 1 year ago

README

published a Space over 1 year ago

README

authored a paper over 1 year ago

Frozen Transformers in Language Models Are Effective Visual Encoder Layers

Paper • 2310.12973 • Published Oct 19, 2023 • 1