Saining Xie

sainx · sainingxie

Recent Activity

authored a paper 20 days ago

PaintBench: Deterministic Evaluation of Precise Visual Editing

Paper • 2606.00188 • Published 27 days ago • 3

upvoted a paper 21 days ago

Benchmarking Visual State Tracking in Multimodal Video Understanding

Paper • 2606.03920 • Published 23 days ago • 50

upvoted 2 papers 4 months ago

Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published Mar 3 • 107

Solaris: Building a Multiplayer Video World Model in Minecraft

Paper • 2602.22208 • Published Feb 25 • 31

authored a paper 5 months ago

Self-Refining Video Sampling

Paper • 2601.18577 • Published Jan 26 • 25

upvoted a paper 5 months ago

Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders

Paper • 2601.16208 • Published Jan 22 • 55

authored a paper 5 months ago

Transition Matching Distillation for Fast Video Generation

Paper • 2601.09881 • Published Jan 14 • 34

upvoted a paper 7 months ago

Flow Map Distillation Without Data

Paper • 2511.19428 • Published Nov 24, 2025 • 6

authored 12 papers 7 months ago

Deeply-Supervised Nets

Paper • 1409.5185 • Published Sep 18, 2014

Holistically-Nested Edge Detection

Paper • 1504.06375 • Published Apr 24, 2015 • 1

Aggregated Residual Transformations for Deep Neural Networks

Paper • 1611.05431 • Published Nov 16, 2016 • 2

Demystifying CLIP Data

Paper • 2309.16671 • Published Sep 28, 2023 • 20

Sample-Efficient Neural Architecture Search by Learning Action Space

Paper • 1906.06832 • Published Jun 17, 2019

Momentum Contrast for Unsupervised Visual Representation Learning

Paper • 1911.05722 • Published Nov 13, 2019 • 2

Going Denser with Open-Vocabulary Part Segmentation

Paper • 2305.11173 • Published May 18, 2023 • 2

Image Sculpting: Precise Object Editing with 3D Geometry Control

Paper • 2401.01702 • Published Jan 2, 2024 • 20

Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs

Paper • 2401.06209 • Published Jan 11, 2024

Masked Autoencoders Are Scalable Vision Learners

Paper • 2111.06377 • Published Nov 11, 2021 • 6

V-IRL: Grounding Virtual Intelligence in Real Life

Paper • 2402.03310 • Published Feb 5, 2024 • 16

Masked Feature Prediction for Self-Supervised Visual Pre-Training

Paper • 2112.09133 • Published Dec 16, 2021