
hexianyi

pkuhexianyi

AI & ML interests

None yet

Recent Activity

upvoted a paper 16 days ago

Latent Spatial Memory for Video World Models

authored a paper 28 days ago

ImgEdit: A Unified Image Editing Dataset and Benchmark

authored a paper 28 days ago

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation


Organizations

None yet

upvoted a paper 16 days ago

Latent Spatial Memory for Video World Models

Paper • 2606.09828 • Published 18 days ago • 69

authored 4 papers 28 days ago

ImgEdit: A Unified Image Editing Dataset and Benchmark

Paper • 2505.20275 • Published May 26, 2025 • 20

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Paper • 2506.03147 • Published Jun 3, 2025 • 59

FlashI2V: Fourier-Guided Latent Shifting Prevents Conditional Image Leakage in Image-to-Video Generation

Paper • 2509.25187 • Published Sep 29, 2025 • 3

OSP-Next: Efficient High-Quality Video Generation with Sparse Sequence Parallelism, HiF8 Quantization, and Reinforcement Learning

Paper • 2605.28691 • Published 30 days ago • 24

upvoted 2 papers 29 days ago

FlashI2V: Fourier-Guided Latent Shifting Prevents Conditional Image Leakage in Image-to-Video Generation

Paper • 2509.25187 • Published Sep 29, 2025 • 3

OSP-Next: Efficient High-Quality Video Generation with Sparse Sequence Parallelism, HiF8 Quantization, and Reinforcement Learning

Paper • 2605.28691 • Published 30 days ago • 24

upvoted a paper about 2 months ago

HumanNet: Scaling Human-centric Video Learning to One Million Hours

Paper • 2605.06747 • Published May 7 • 55

upvoted an article 3 months ago

Article

Welcome Gemma 4: Frontier multimodal intelligence on device

merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 909

upvoted a paper 3 months ago

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 352

upvoted a paper 4 months ago

Helios: Real Real-Time Long Video Generation Model

Paper • 2603.04379 • Published Mar 4 • 189

upvoted a collection 4 months ago

Qwen3.5

Collection

21 items • Updated Mar 9 • 1.69k

upvoted 2 papers 5 months ago

iFSQ: Improving FSQ for Image Generation with 1 Line of Code

Paper • 2601.17124 • Published Jan 23 • 34

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25, 2025 • 224

upvoted a paper 6 months ago

UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement

Paper • 2512.21185 • Published Dec 24, 2025 • 32

upvoted a collection 8 months ago

Emu3.5

Collection

Native Multimodal Models are World Learners 🌍 • 4 items • Updated Feb 4 • 77

New activity in BestWishYsh/OpenS2V-5M 10 months ago

Fix background image paths

#5 opened 10 months ago by pkuhexianyi

liked a dataset 11 months ago

BestWishYsh/OpenS2V-5M

Updated Jan 6 • 25.2k • 24

upvoted a paper about 1 year ago

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Paper • 2506.03147 • Published Jun 3, 2025 • 59

authored a paper about 1 year ago

OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation

Paper • 2505.20292 • Published May 26, 2025 • 52
