P6ztvyi6wg

p6ztvyi6wg

AI & ML interests

None yet

Recent Activity

liked a model 5 days ago

EdwardoSunny/ttt-discover-v3-qwen3-8b-ac2-s2-step30

Organizations

None yet

upvoted a paper about 14 hours ago

Principled Analysis of Deep Reinforcement Learning Evaluation and Design Paradigms

Paper • 2607.07769 • Published 9 days ago • 5

upvoted a paper 10 days ago

The Mirage of Optimizing Training Policies: Monotonic Inference Policies as the Real Objective for LLM Reinforcement Learning

Paper • 2606.29526 • Published 19 days ago • 164

upvoted a paper 22 days ago

MemSlides: A Hierarchical Memory Driven Agent Framework for Personalized Slide Generation with Multi-turn Local Revision

Paper • 2606.17162 • Published Jun 15 • 177

upvoted a paper 25 days ago

Looped World Models

Paper • 2606.18208 • Published about 1 month ago • 480

upvoted a paper about 1 month ago

LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV

Paper • 2605.26244 • Published May 25 • 38

upvoted 3 papers about 2 months ago

FashionLens: Toward Versatile Fashion Image Retrieval via Task-Adaptive Learning

Paper • 2605.22552 • Published May 21 • 2

DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards

Paper • 2605.21467 • Published May 20 • 207

VideoSeeker: Incentivizing Instance-level Video Understanding via Native Agentic Tool Invocation

Paper • 2605.16079 • Published May 15 • 29

upvoted 2 papers 2 months ago

CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence

Paper • 2605.12882 • Published May 13 • 274

StableI2I: Spotting Unintended Changes in Image-to-Image Transition

Paper • 2605.04453 • Published May 6 • 11

upvoted 8 papers 3 months ago

Recursive Multi-Agent Systems

Paper • 2604.25917 • Published Apr 28 • 288

ClawNet: Human-Symbiotic Agent Network for Cross-User Autonomous Cooperation

Paper • 2604.19211 • Published Apr 21 • 11

Semantic Richness or Geometric Reasoning? The Fragility of VLM's Visual Invariance

Paper • 2604.01848 • Published Apr 3 • 5

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published Apr 9 • 248

ClawBench: Can AI Agents Complete Everyday Online Tasks?

Paper • 2604.08523 • Published Apr 9 • 265

Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published Apr 2 • 509

When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models

Paper • 2604.08546 • Published Apr 9 • 116

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

Paper • 2604.02721 • Published Apr 3 • 639

upvoted 2 papers 4 months ago

LongTail Driving Scenarios with Reasoning Traces: The KITScenes LongTail Dataset

Paper • 2603.23607 • Published Mar 24 • 21

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 353