Zanlin Ni PRO

nzl-thu

AI & ML interests

None yet

Recent Activity

upvoted 2 papers about 1 month ago

Linearizing Vision Transformer with Test-Time Training

Paper • 2605.02772 • Published May 28 • 20

Steering Visual Generation in Unified Multimodal Models with Understanding Supervision

Paper • 2605.05781 • Published May 7 • 5

upvoted 2 papers about 2 months ago

Linear-Time Global Visual Modeling without Explicit Attention

Paper • 2605.01711 • Published May 3 • 7

InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation

Paper • 2605.14333 • Published May 14 • 35

upvoted a paper 2 months ago

Refinement via Regeneration: Enlarging Modification Space Boosts Image Refinement in Unified Multimodal Models

Paper • 2604.25636 • Published Apr 28 • 24

upvoted a paper 4 months ago

HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

Paper • 2603.17024 • Published Mar 17 • 110

upvoted a paper 5 months ago

The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

Paper • 2601.15165 • Published Jan 21 • 75

upvoted a collection 8 months ago

MemoryVLA

Checkpoints, data and logs of MemoryVLA & MemoryVLA+. https://github.com/shihao1895/MemoryVLA • 19 items • Updated Mar 2 • 9

upvoted a paper about 1 year ago

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18, 2025 • 141

upvoted 2 papers over 1 year ago

CODA: Repurposing Continuous VAEs for Discrete Tokenization

Paper • 2503.17760 • Published Mar 22, 2025 • 4

Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

Paper • 2410.08261 • Published Oct 10, 2024 • 52