knightnemo

https://knightnemo.github.io

AI & ML interests

World Models, World Action Models, VLA Models, Test-time Adaptation & Self-Improvement, Dexterous Manipulation.

Recent Activity

updated a dataset 17 days ago

knightnemo/furniturebench_lamp_delta

published a dataset 17 days ago

knightnemo/furniturebench_lamp_delta

Organizations

upvoted a paper 18 days ago

On the Geometry of On-Policy Distillation

Paper • 2606.07082 • Published 22 days ago • 75

upvoted a collection 20 days ago

Cosmos3

Collection

Omnimodal World Models for Physical AI • 16 items • Updated 1 day ago • 132

upvoted a paper 26 days ago

SANA-Streaming: Real-time Streaming Video Editing with Hybrid Diffusion Transformer

Paper • 2605.30409 • Published about 1 month ago • 41

upvoted a paper 27 days ago

minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models

Paper • 2605.30263 • Published about 1 month ago • 59

upvoted 3 papers about 1 month ago

upvoted a collection about 1 month ago

Cambrian-P Models

Collection

5 items • Updated May 21 • 1

upvoted a paper about 1 month ago

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer

Paper • 2605.15178 • Published May 14 • 91

upvoted 3 papers about 2 months ago

MLS-Bench: A Holistic and Rigorous Assessment of AI Systems on Building Better AI

Paper • 2605.08678 • Published May 9 • 9

STARFlow2: Bridging Language Models and Normalizing Flows for Unified Multimodal Generation

Paper • 2605.08029 • Published May 8 • 12

MolmoAct2: Action Reasoning Models for Real-world Deployment

Paper • 2605.02881 • Published May 4 • 355

upvoted a collection about 2 months ago

Nano-World-Model

Collection

🌍 A minimalist repository for training video world models based on diffusion-forcing. • 20 items • Updated May 17 • 7

upvoted 2 papers about 2 months ago

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Paper • 2604.24763 • Published Apr 27 • 71

Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising

Paper • 2604.26694 • Published Apr 29 • 6

upvoted a paper 2 months ago

ELT: Elastic Looped Transformers for Visual Generation

Paper • 2604.09168 • Published Apr 10 • 24

upvoted 2 papers 3 months ago

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 158

Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training

Paper • 2603.12255 • Published Mar 12 • 91

upvoted 2 papers 5 months ago

LoL: Longer than Longer, Scaling Video Generation to Hour

Paper • 2601.16914 • Published Jan 23 • 23

Advancing Open-source World Models

Paper • 2601.20540 • Published Jan 28 • 135
