Wensong Song

WensongSong

AI & ML interests

None yet

Recent Activity

upvoted a paper about 2 months ago

UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models

liked a Space 2 months ago

123123aa123/UniGeo

upvoted a paper 2 months ago

RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details

Organizations

upvoted a paper about 2 months ago

UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models

Paper • 2604.17565 • Published Apr 19 • 10

liked a Space 2 months ago

UniGeo

📈

Generate camera‑controlled edits from a single image

upvoted a paper 2 months ago

RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details

Paper • 2604.06870 • Published Apr 8 • 44

liked a Space 2 months ago

RefineAnything

🖼

Refine selected image area with a text prompt

upvoted a paper 4 months ago

HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images

Paper • 2603.02210 • Published Mar 2 • 30

liked a model 6 months ago

Qwen/Qwen-Image-Edit-2511

Image-to-Image • Updated Dec 23, 2025 • 184k • 1.07k

liked a Space 8 months ago

Sora 2

📉

485

Generate videos from text or images

upvoted 6 papers 10 months ago

Visual Representation Alignment for Multimodal Large Language Models

Paper • 2509.07979 • Published Sep 9, 2025 • 84

Reconstruction Alignment Improves Unified Multimodal Models

Paper • 2509.07295 • Published Sep 8, 2025 • 40

F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions

Paper • 2509.06951 • Published Sep 8, 2025 • 33

Staying in the Sweet Spot: Responsive Reasoning Evolution via Capability-Adaptive Hint Scaffolding

Paper • 2509.06923 • Published Sep 8, 2025 • 22

Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search

Paper • 2509.07969 • Published Sep 9, 2025 • 60

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9, 2025 • 105

liked a model 12 months ago

black-forest-labs/FLUX.1-Kontext-dev

Image-to-Image • Updated Jan 1 • 138k • 2.68k

liked a dataset about 1 year ago

WensongSong/AnyInsertion_V1

Viewer • Updated May 9, 2025 • 137k • 2.98k • 3

New activity in WensongSong/AnyInsertion about 1 year ago

Released text-prompt dataset

#2 opened about 1 year ago by

xing666

updated 2 datasets about 1 year ago

WensongSong/AnyInsertion_V1

Viewer • Updated May 9, 2025 • 137k • 2.98k • 3

WensongSong/AnyInsertion

Viewer • Updated May 9, 2025 • 59k • 640 • 9

published a dataset about 1 year ago

WensongSong/AnyInsertion_V1

Viewer • Updated May 9, 2025 • 137k • 2.98k • 3

updated a Space about 1 year ago

Insert Anything

🌍

Insert images into backgrounds using masks or text labels
