Xiaoxing Hu

wsdwJohn1231

https://xiaoxing2001.github.io

xiaoxing2001

AI & ML interests

None yet

Recent Activity

upvoted a paper 25 days ago

SG-OPD: Sign-Gated On-Policy Distillation via Sign-Consistency Gating and Phased Teacher Sampling

upvoted a paper 25 days ago

Visual Para-Thinker++: A Single-Policy Multi-Agent Framework for Visual Reasoning

upvoted a paper about 1 month ago

Moment-Video: Diagnosing Temporal Fidelity of Video MLLMs on Momentary Visual Events

Organizations

upvoted 2 papers 25 days ago

SG-OPD: Sign-Gated On-Policy Distillation via Sign-Consistency Gating and Phased Teacher Sampling

Paper • 2606.09304 • Published 26 days ago • 6

Visual Para-Thinker++: A Single-Policy Multi-Agent Framework for Visual Reasoning

Paper • 2606.09290 • Published 26 days ago • 7

upvoted a paper about 1 month ago

Moment-Video: Diagnosing Temporal Fidelity of Video MLLMs on Momentary Visual Events

Paper • 2606.02522 • Published Jun 1 • 13

upvoted a paper about 2 months ago

Anisotropic Modality Align

Paper • 2605.07825 • Published May 8 • 27

upvoted 2 papers 3 months ago

UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards

Paper • 2604.14967 • Published Apr 16 • 15

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

Paper • 2604.05015 • Published Apr 6 • 236

upvoted a paper 5 months ago

Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models

Paper • 2602.07026 • Published Feb 2 • 140

upvoted a paper 6 months ago

DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset

Paper • 2601.10305 • Published Jan 15 • 37

upvoted a paper 8 months ago

ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder

Paper • 2510.18795 • Published Oct 21, 2025 • 11

upvoted a paper 9 months ago

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

Paper • 2510.08540 • Published Oct 9, 2025 • 110

upvoted a paper 10 months ago

Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval

Paper • 2509.09118 • Published Sep 11, 2025 • 8

upvoted 2 papers 11 months ago

ForCenNet: Foreground-Centric Network for Document Image Rectification

Paper • 2507.19804 • Published Jul 26, 2025 • 12

Region-based Cluster Discrimination for Visual Representation Learning

Paper • 2507.20025 • Published Jul 26, 2025 • 20

upvoted 3 papers about 1 year ago

Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs

Paper • 2504.17432 • Published Apr 24, 2025 • 41

Decoupled Global-Local Alignment for Improving Compositional Understanding

Paper • 2504.16801 • Published Apr 23, 2025 • 14

Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing

Paper • 2504.02826 • Published Apr 3, 2025 • 68

upvoted a paper over 1 year ago

RealSyn: An Effective and Scalable Multimodal Interleaved Document Transformation Paradigm

Paper • 2502.12513 • Published Feb 18, 2025 • 16
