Sally

CArriy

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

MultiBind: A Benchmark for Attribute Misbinding in Multi-Subject Generation

upvoted a paper 1 day ago

ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model

upvoted a paper 1 day ago

Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing

Organizations

None yet

upvoted 4 papers 1 day ago

MultiBind: A Benchmark for Attribute Misbinding in Multi-Subject Generation

Paper • 2603.21937 • Published 3 days ago • 6

ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model

Paper • 2603.22281 • Published 3 days ago • 12

Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing

Paper • 2603.12254 • Published 14 days ago • 20

MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding

Paper • 2603.22458 • Published 3 days ago • 117

upvoted a paper about 1 month ago

UI-Venus-1.5 Technical Report

Paper • 2602.09082 • Published Feb 9 • 156

upvoted 3 papers 2 months ago

Controlled Self-Evolution for Algorithmic Code Optimization

Paper • 2601.07348 • Published Jan 12 • 116

KnowMe-Bench: Benchmarking Person Understanding for Lifelong Digital Companions

Paper • 2601.04745 • Published Jan 8 • 59

MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences

Paper • 2601.06789 • Published Jan 11 • 80

upvoted 12 papers 4 months ago

Kwai Keye-VL 1.5 Technical Report

Paper • 2509.01563 • Published Sep 1, 2025 • 38

FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark

Paper • 2509.09680 • Published Sep 11, 2025 • 44

LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training

Paper • 2509.23661 • Published Sep 28, 2025 • 49

EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs

Paper • 2509.09174 • Published Sep 11, 2025 • 62

Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning

Paper • 2510.19338 • Published Oct 22, 2025 • 117

The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published Oct 30, 2025 • 119

Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Paper • 2510.08673 • Published Oct 9, 2025 • 127
