Zhiheng Liu

Johanan0528

Johanan528

AI & ML interests

None yet

Recent Activity

Organizations

upvoted a paper 7 days ago

Reinforcing Dual-Path Reasoning in Spatial Vision Language Models

Paper • 2606.17539 • Published 9 days ago • 15

upvoted a paper 27 days ago

Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation

Paper • 2507.10524 • Published Jul 14, 2025 • 74

upvoted a collection about 1 month ago

Qwen3

Collection

84 items • Updated Dec 31, 2025 • 1.82k

upvoted a paper about 1 month ago

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer

Paper • 2605.15178 • Published May 14 • 91

upvoted a paper about 2 months ago

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Paper • 2604.24763 • Published Apr 27 • 71

upvoted a paper 3 months ago

VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward

Paper • 2603.26599 • Published Mar 27 • 67

upvoted a paper 4 months ago

Utonia: Toward One Encoder for All Point Clouds

Paper • 2603.03283 • Published Mar 3 • 186

updated a dataset 4 months ago

Johanan0528/nemotrov2_zhiheng

Viewer • Updated Feb 18 • 104 • 19

published a dataset 4 months ago

Johanan0528/nemotrov2_zhiheng

Viewer • Updated Feb 18 • 104 • 19

upvoted a paper 6 months ago

HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming

Paper • 2512.21338 • Published Dec 24, 2025 • 23

upvoted 3 papers 7 months ago

OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory

Paper • 2512.07802 • Published Dec 8, 2025 • 46

Scaling Zero-Shot Reference-to-Video Generation

Paper • 2512.06905 • Published Dec 7, 2025 • 29

TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models

Paper • 2512.02014 • Published Dec 1, 2025 • 78

upvoted 2 papers 8 months ago

From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model

Paper • 2510.19871 • Published Oct 22, 2025 • 30

INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats

Paper • 2510.25602 • Published Oct 29, 2025 • 81

liked 2 models over 1 year ago

omni-research/Tarsier2-Recap-7b

8B • Updated Aug 11, 2025 • 53.6k • 38

Qwen/Qwen2.5-VL-72B-Instruct

Image-Text-to-Text • 73B • Updated Jun 6, 2025 • 518k • 629

upvoted 2 papers over 1 year ago

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

Paper • 2502.10248 • Published Feb 14, 2025 • 57

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 115

liked a dataset over 1 year ago

yzwang/X2I-text-to-image

Updated Dec 14, 2024 • 409 • 8
