DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset Paper • 2601.10305 • Published 11 days ago • 36
Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding Paper • 2601.10611 • Published 11 days ago • 26
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation Paper • 2601.09688 • Published 12 days ago • 124
Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking Paper • 2601.04720 • Published 18 days ago • 49
Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation Paper • 2512.04678 • Published Dec 4, 2025 • 41
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling Paper • 2511.20785 • Published Nov 25, 2025 • 184
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published Dec 1, 2025 • 101
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published Nov 20, 2025 • 93
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published Nov 12, 2025 • 208
Cambrian-S: Towards Spatial Supersensing in Video Paper • 2511.04670 • Published Nov 6, 2025 • 38
The Smol Training Playbook 📚: The secrets to building world-class LLMs Space • Featured • 2.92k
UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning Paper • 2510.13515 • Published Oct 15, 2025 • 12
UniME-V2 Collection The collections of UniME-V2's data and Model Weights • 6 items • Updated Nov 10, 2025 • 1
TianchengGu/UniME-V2-reranker-Qwen25VL-7B Image-Text-to-Text • 8B • Updated Oct 16, 2025 • 1.4k • 2