💼 Hiring

Xuanke Shi

shixuanke

https://xuankeshi.github.io/

AI & ML interests

Computer Vision, Computer Graphics, Machine Learning, Deep Learning

Recent Activity

updated a dataset 8 days ago

sensenova/ConsistCompose3M

updated a model 8 days ago

sensenova/ConsistCompose-BAGEL-7B-MoT

commented on a paper about 1 month ago

Vision Bridge Transformer at Scale

updated a dataset 8 days ago

sensenova/ConsistCompose3M

Viewer • Updated 8 days ago • 11.3M • 152 • 8

updated a model 8 days ago

sensenova/ConsistCompose-BAGEL-7B-MoT

15B • Updated 8 days ago • 28 • 7

commented on a paper about 1 month ago

Vision Bridge Transformer at Scale

Paper • 2511.23199 • Published Nov 28, 2025 • 47

authored a paper about 1 month ago

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published May 12 • 194

commented on a paper about 1 month ago

OcclusionFormer: Arranging Z-Order for Layout-Grounded Image Generation

Paper • 2605.21343 • Published May 20 • 8

upvoted a paper about 1 month ago

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published May 12 • 194

upvoted a paper about 2 months ago

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Paper • 2604.24763 • Published Apr 27 • 71

upvoted a collection 2 months ago

SenseNova-U1

Collection

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-Unify Architecture • 10 items • Updated 14 days ago • 74

published a model 2 months ago

sensenova/SenseNova-U1-8B-MoT

Any-to-Any • 18B • Updated May 15 • 42.2k • 287

liked a dataset 2 months ago

shixuanke/ConsistCompose3M

Viewer • Updated Feb 25 • 11.3M • 361 • 3

commented on a paper 2 months ago

MMCORE: MultiModal COnnection with Representation Aligned Latent Embeddings

Paper • 2604.19902 • Published Apr 21 • 3

liked a Space 2 months ago

RefineAnything

🖼

Refine selected image area with a text prompt

upvoted a paper 3 months ago

Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published Mar 3 • 107

upvoted an article 3 months ago

NEO-unify: Building Native Multimodal Unified Models End to End

Article • sensenova • Mar 5 • 167

liked a dataset 4 months ago

sensenova/ConsistCompose3M

Viewer • Updated 8 days ago • 11.3M • 152 • 8

authored 2 papers 4 months ago

Has GPT-5 Achieved Spatial Intelligence? An Empirical Study

Paper • 2508.13142 • Published Aug 18, 2025 • 35

ConsistCompose: Unified Multimodal Layout Control for Image Composition

Paper • 2511.18333 • Published Nov 23, 2025 • 5

liked a model 4 months ago

sensenova/ConsistCompose-BAGEL-7B-MoT

15B • Updated 8 days ago • 28 • 7

published a model 4 months ago

sensenova/ConsistCompose-BAGEL-7B-MoT

15B • Updated 8 days ago • 28 • 7

updated a collection 4 months ago

MLLM_tools

Collection

1 item • Updated Feb 25
