ArtHOI: Articulated Human-Object Interaction Synthesis by 4D Reconstruction from Video Priors • Paper 2603.04338 • Published 8 days ago • 21 upvotes
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? • Paper 2603.03241 • Published 9 days ago • 81 upvotes
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence • Paper 2602.08683 • Published Feb 9 • 50 upvotes
Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition • Paper 2602.08439 • Published Feb 9 • 28 upvotes
Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives • Paper 2601.20833 • Published Jan 28 • 182 upvotes
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding • Paper 2512.19693 • Published Dec 22, 2025 • 66 upvotes
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling • Paper 2511.20785 • Published Nov 25, 2025 • 187 upvotes
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe • Paper 2511.16334 • Published Nov 20, 2025 • 93 upvotes
Scaling Spatial Intelligence with Multimodal Foundation Models • Paper 2511.13719 • Published Nov 17, 2025 • 47 upvotes
Simulating the Visual World with Artificial Intelligence: A Roadmap • Paper 2511.08585 • Published Nov 11, 2025 • 30 upvotes
PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image • Paper 2511.13648 • Published Nov 17, 2025 • 52 upvotes
The Quest for Generalizable Motion Generation: Data, Model, and Evaluation • Paper 2510.26794 • Published Oct 30, 2025 • 27 upvotes
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training • Paper 2509.23661 • Published Sep 28, 2025 • 48 upvotes
LLaVA-OneVision-1.5 • Collection • https://github.com/EvolvingLMMs-Lab/LLaVA-OneVision-1.5 • 9 items • Updated Oct 21, 2025 • 19 upvotes
Visual Representation Alignment for Multimodal Large Language Models • Paper 2509.07979 • Published Sep 9, 2025 • 84 upvotes
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning • Paper 2509.07980 • Published Sep 9, 2025 • 105 upvotes