Read 2026 - an andrei-saceleanu Collection

andrei-saceleanu's Collections

updated May 14

mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 330
Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process

Paper • 2512.23988 • Published Dec 30, 2025 • 19
SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time

Paper • 2512.25075 • Published Dec 31, 2025 • 16
Guiding a Diffusion Transformer with the Internal Dynamics of Itself

Paper • 2512.24176 • Published Dec 30, 2025 • 8
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

Paper • 2512.24165 • Published Dec 30, 2025 • 52
AdaGaR: Adaptive Gabor Representation for Dynamic Scene Reconstruction

Paper • 2601.00796 • Published Jan 2 • 32
Taming Preference Mode Collapse via Directional Decoupling Alignment in Diffusion Reinforcement Learning

Paper • 2512.24146 • Published Dec 30, 2025 • 14
Deep Delta Learning

Paper • 2601.00417 • Published Jan 1 • 34
LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published Jan 6 • 183
SOP: A Scalable Online Post-Training System for Vision-Language-Action Models

Paper • 2601.03044 • Published Jan 6 • 28
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 234
The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models

Paper • 2601.03425 • Published Jan 6 • 17
RelayLLM: Efficient Reasoning via Collaborative Decoding

Paper • 2601.05167 • Published Jan 8 • 31
AgentOCR: Reimagining Agent History via Optical Self-Compression

Paper • 2601.04786 • Published Jan 8 • 31
Over-Searching in Search-Augmented Large Language Models

Paper • 2601.05503 • Published Jan 9 • 7
Lost in the Noise: How Reasoning Models Fail with Contextual Distractors

Paper • 2601.07226 • Published Jan 12 • 33
Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models

Paper • 2601.07351 • Published Jan 12 • 26
Dr. Zero: Self-Evolving Search Agents without Training Data

Paper • 2601.07055 • Published Jan 11 • 22
User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale

Paper • 2601.08225 • Published Jan 13 • 53
The Confidence Dichotomy: Analyzing and Mitigating Miscalibration in Tool-Use Agents

Paper • 2601.07264 • Published Jan 12 • 24
SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices

Paper • 2601.08303 • Published Jan 13 • 21
Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning

Paper • 2601.09708 • Published Jan 14 • 56
Flow Equivariant World Models: Memory for Partially Observed Dynamic Environments

Paper • 2601.01075 • Published Jan 3 • 6
The AI Hippocampus: How Far are We From Human Memory?

Paper • 2601.09113 • Published Jan 14 • 6
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

Paper • 2601.08763 • Published Jan 13 • 150
Alterbute: Editing Intrinsic Attributes of Objects in Images

Paper • 2601.10714 • Published Jan 15 • 31
Transition Matching Distillation for Fast Video Generation

Paper • 2601.09881 • Published Jan 14 • 34
Unlocking Implicit Experience: Synthesizing Tool-Use Trajectories from Text

Paper • 2601.10355 • Published Jan 15 • 39
Language of Thought Shapes Output Diversity in Large Language Models

Paper • 2601.11227 • Published Jan 16 • 10
More Images, More Problems? A Controlled Analysis of VLM Failure Modes

Paper • 2601.07812 • Published Jan 12 • 6
Toward Efficient Agents: Memory, Tool learning, and Planning

Paper • 2601.14192 • Published Jan 20 • 57
Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published Jan 18 • 207
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders

Paper • 2601.16208 • Published Jan 22 • 55
PROGRESSLM: Towards Progress Reasoning in Vision-Language Models

Paper • 2601.15224 • Published Jan 21 • 12
360Anything: Geometry-Free Lifting of Images and Videos to 360°

Paper • 2601.16192 • Published Jan 22 • 9
Agentic Uncertainty Quantification

Paper • 2601.15703 • Published Jan 22 • 9
MeepleLM: A Virtual Playtester Simulating Diverse Subjective Experiences

Paper • 2601.07251 • Published Jan 12 • 11
C-RADIOv4 (Tech Report)

Paper • 2601.17237 • Published Jan 24 • 11
Agentic Very Long Video Understanding

Paper • 2601.18157 • Published Jan 26 • 22
Shaping capabilities with token-level data filtering

Paper • 2601.21571 • Published Jan 29 • 29
KromHC: Manifold-Constrained Hyper-Connections with Kronecker-Product Residual Matrices

Paper • 2601.21579 • Published Jan 29 • 6
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text

Paper • 2601.22975 • Published Jan 30 • 113
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning

Paper • 2601.21468 • Published Jan 29 • 25
LMK > CLS: Landmark Pooling for Dense Embeddings

Paper • 2601.21525 • Published Jan 29 • 5
Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis

Paper • 2602.03139 • Published Feb 3 • 45
Rethinking the Trust Region in LLM Reinforcement Learning

Paper • 2602.04879 • Published Feb 4 • 38
Protein Autoregressive Modeling via Multiscale Structure Generation

Paper • 2602.04883 • Published Feb 4 • 3
MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration

Paper • 2602.01734 • Published Feb 2 • 33
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger

Paper • 2602.08222 • Published Feb 9 • 290
Towards Agentic Intelligence for Materials Science

Paper • 2602.00169 • Published Jan 29 • 48
Reliable and Responsible Foundation Models: A Comprehensive Survey

Paper • 2602.08145 • Published Feb 4 • 8
Col-Bandit: Zero-Shot Query-Time Pruning for Late-Interaction Retrieval

Paper • 2602.02827 • Published Feb 2 • 3
Stable Velocity: A Variance Perspective on Flow Matching

Paper • 2602.05435 • Published Feb 5 • 3
The Pensieve Paradigm: Stateful Language Models Mastering Their Own Context

Paper • 2602.12108 • Published Feb 12 • 13
Free(): Learning to Forget in Malloc-Only Reasoning Models

Paper • 2602.08030 • Published Feb 8 • 6
When the Prompt Becomes Visual: Vision-Centric Jailbreak Attacks for Large Image Editing Models

Paper • 2602.10179 • Published Feb 10 • 6
GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics

Paper • 2602.12617 • Published Feb 13 • 20
Experiential Reinforcement Learning

Paper • 2602.13949 • Published Feb 15 • 76
SPILLage: Agentic Oversharing on the Web

Paper • 2602.13516 • Published Feb 13
Exposing the Systematic Vulnerability of Open-Weight Models to Prefill Attacks

Paper • 2602.14689 • Published Feb 16 • 1
On Surprising Effectiveness of Masking Updates in Adaptive Optimizers

Paper • 2602.15322 • Published Feb 17 • 11
Visual Persuasion: What Influences Decisions of Vision-Language Models?

Paper • 2602.15278 • Published Feb 17 • 3
The Vision Wormhole: Latent-Space Communication in Heterogeneous Multi-Agent Systems

Paper • 2602.15382 • Published Feb 17 • 5
Causal-JEPA: Learning World Models through Object-Level Latent Interventions

Paper • 2602.11389 • Published Feb 11 • 11
Empty Shelves or Lost Keys? Recall Is the Bottleneck for Parametric Factuality

Paper • 2602.14080 • Published Feb 15 • 23
Multi-agent cooperation through in-context co-player inference

Paper • 2602.16301 • Published Feb 18 • 24
Reinforced Fast Weights with Next-Sequence Prediction

Paper • 2602.16704 • Published Feb 18 • 14
World Action Models are Zero-shot Policies

Paper • 2602.15922 • Published Feb 17 • 19
Unified Latents (UL): How to train your latents

Paper • 2602.17270 • Published Feb 19 • 62
DDiT: Dynamic Patch Scheduling for Efficient Diffusion Transformers

Paper • 2602.16968 • Published Feb 19 • 13
NeST: Neuron Selective Tuning for LLM Safety

Paper • 2602.16835 • Published Feb 18 • 1
CrispEdit: Low-Curvature Projections for Scalable Non-Destructive LLM Editing

Paper • 2602.15823 • Published Feb 17 • 3
Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Paper • 2602.08354 • Published Feb 9 • 266
Spanning the Visual Analogy Space with a Weight Basis of LoRAs

Paper • 2602.15727 • Published Feb 17 • 13
VLANeXt: Recipes for Building Strong VLA Models

Paper • 2602.18532 • Published Feb 20 • 52
On Data Engineering for Scaling LLM Terminal Capabilities

Paper • 2602.21193 • Published Feb 24 • 103
Test-Time Training with KV Binding Is Secretly Linear Attention

Paper • 2602.21204 • Published Feb 24 • 32
One-step Language Modeling via Continuous Denoising

Paper • 2602.16813 • Published Feb 18 • 4
Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking

Paper • 2602.21196 • Published Feb 24 • 7
The Diffusion Duality, Chapter II: Ψ-Samplers and Efficient Curriculum

Paper • 2602.21185 • Published Feb 24 • 4
DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference

Paper • 2602.21548 • Published Feb 25 • 55
Image Generation with a Sphere Encoder

Paper • 2602.15030 • Published Feb 16 • 20
From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors

Paper • 2602.21778 • Published Feb 25 • 15
SeaCache: Spectral-Evolution-Aware Cache for Accelerating Diffusion Models

Paper • 2602.18993 • Published Feb 22 • 4
Dropping Anchor and Spherical Harmonics for Sparse-view Gaussian Splatting

Paper • 2602.20933 • Published Feb 24 • 4
VGG-T^3: Offline Feed-Forward 3D Reconstruction at Scale

Paper • 2602.23361 • Published Feb 26 • 17
Causal Motion Diffusion Models for Autoregressive Motion Generation

Paper • 2602.22594 • Published Feb 26 • 7
AgentDropoutV2: Optimizing Information Flow in Multi-Agent Systems via Test-Time Rectify-or-Reject Pruning

Paper • 2602.23258 • Published Feb 26 • 28
Mode Seeking meets Mean Seeking for Fast Long Video Generation

Paper • 2602.24289 • Published Feb 27 • 41
LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding

Paper • 2602.23881 • Published Feb 27 • 18
How to Take a Memorable Picture? Empowering Users with Actionable Feedback

Paper • 2602.21877 • Published Feb 25 • 18
Spectral Condition for μP under Width-Depth Scaling

Paper • 2603.00541 • Published Feb 28 • 15
Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published Mar 3 • 107
Spilled Energy in Large Language Models

Paper • 2602.18671 • Published Feb 21 • 12
Helios: Real Real-Time Long Video Generation Model

Paper • 2603.04379 • Published Mar 4 • 190
RealWonder: Real-Time Physical Action-Conditioned Video Generation

Paper • 2603.05449 • Published Mar 5 • 12
KARL: Knowledge Agents via Reinforcement Learning

Paper • 2603.05218 • Published Mar 5 • 7
Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling

Paper • 2603.04553 • Published Mar 4 • 3
Progressive Residual Warmup for Language Model Pretraining

Paper • 2603.05369 • Published Mar 5 • 36
Dynamic Model Routing and Cascading for Efficient LLM Inference: A Survey

Paper • 2603.04445 • Published Apr 21 • 5
MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data

Paper • 2603.09206 • Published Mar 10 • 54
Lost in Backpropagation: The LM Head is a Gradient Bottleneck

Paper • 2603.10145 • Published Mar 10 • 13
Just-in-Time: Training-Free Spatial Acceleration for Diffusion Transformers

Paper • 2603.10744 • Published Mar 11 • 7
Hindsight Credit Assignment for Long-Horizon LLM Agents

Paper • 2603.08754 • Published Mar 7 • 5
HomeSafe-Bench: Evaluating Vision-Language Models on Unsafe Action Detection for Embodied Agents in Household Scenarios

Paper • 2603.11975 • Published Mar 12 • 12
Grounding World Simulation Models in a Real-World Metropolis

Paper • 2603.15583 • Published Mar 16 • 155
Attention Residuals

Paper • 2603.15031 • Published Mar 16 • 189
WiT: Waypoint Diffusion Transformers via Trajectory Conflict Navigation

Paper • 2603.15132 • Published Mar 16 • 35
LoST: Level of Semantics Tokenization for 3D Shapes

Paper • 2603.17995 • Published Mar 18 • 32
V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning

Paper • 2603.14482 • Published Mar 15 • 36
ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents

Paper • 2603.18815 • Published Mar 19 • 14
COT-FM: Cluster-wise Optimal Transport Flow Matching

Paper • 2603.13395 • Published Mar 11 • 2
Matryoshka Gaussian Splatting

Paper • 2603.19234 • Published Mar 19 • 12
Less Gaussians, Texture More: 4K Feed-Forward Textured Splatting

Paper • 2603.25745 • Published Mar 26 • 16
S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation

Paper • 2603.25702 • Published Mar 26 • 8
Reaching Beyond the Mode: RL for Distributional Reasoning in Language Models

Paper • 2603.24844 • Published Mar 25 • 10
SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning

Paper • 2603.23483 • Published Mar 24 • 62
From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents

Paper • 2603.22386 • Published Mar 23 • 57
Agentic AI and the next intelligence explosion

Paper • 2603.20639 • Published Mar 21 • 11
Hyperagents

Paper • 2603.19461 • Published Mar 19 • 51
Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD

Paper • 2603.20155 • Published Mar 20 • 10
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Paper • 2604.08377 • Published Apr 9 • 295
DMax: Aggressive Parallel Decoding for dLLMs

Paper • 2604.08302 • Published Apr 9 • 54
Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering

Paper • 2604.08224 • Published Apr 9 • 53
Graph of Skills: Dependency-Aware Structural Retrieval for Massive Agent Skills

Paper • 2604.05333 • Published Apr 7 • 23
The Master Key Hypothesis: Unlocking Cross-Model Capability Transfer via Linear Subspace Alignment

Paper • 2604.06377 • Published Apr 7 • 7
Combee: Scaling Prompt Learning for Self-Improving Language Model Agents

Paper • 2604.04247 • Published Apr 5 • 31
Neural Computers

Paper • 2604.06425 • Published Apr 7 • 32
TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders

Paper • 2604.07340 • Published Apr 8 • 18
Learning to Hint for Reinforcement Learning

Paper • 2604.00698 • Published Apr 1 • 9
Learning to Retrieve from Agent Trajectories

Paper • 2604.04949 • Published Mar 30 • 72
Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published Apr 2 • 509
Memory Intelligence Agent

Paper • 2604.04503 • Published Apr 6 • 58
FileGram: Grounding Agent Personalization in File-System Behavioral Traces

Paper • 2604.04901 • Published Apr 6 • 40
Scaling Teams or Scaling Time? Memory Enabled Lifelong Learning in LLM Multi-Agent Systems

Paper • 2604.03295 • Published Mar 27 • 10
HDP: A Lightweight Cryptographic Protocol for Human Delegation Provenance in Agentic AI Systems

Paper • 2604.04522 • Published Apr 6 • 10
Token Warping Helps MLLMs Look from Nearby Viewpoints

Paper • 2604.02870 • Published Apr 3 • 34
Do World Action Models Generalize Better than VLAs? A Robustness Study

Paper • 2603.22078 • Published Apr 1 • 7
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Paper • 2604.02029 • Published Apr 2 • 152
Steerable Visual Representations

Paper • 2604.02327 • Published Apr 2 • 56
Therefore I am. I Think

Paper • 2604.01202 • Published Apr 2 • 33
FlowSlider: Training-Free Continuous Image Editing via Fidelity-Steering Decomposition

Paper • 2604.02088 • Published Apr 2 • 6
Signals: Trajectory Sampling and Triage for Agentic Interactions

Paper • 2604.00356 • Published Apr 1 • 9
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

Paper • 2603.24414 • Published Mar 25 • 183
Consistency Amplifies: How Behavioral Variance Shapes Agent Accuracy

Paper • 2603.25764 • Published Mar 26 • 5
Multi-User Large Language Model Agents

Paper • 2604.08567 • Published Mar 19 • 27
Semantic Richness or Geometric Reasoning? The Fragility of VLM's Visual Invariance

Paper • 2604.01848 • Published Apr 3 • 5
EquiformerV3: Scaling Efficient, Expressive, and General SE(3)-Equivariant Graph Attention Transformers

Paper • 2604.09130 • Published Apr 10 • 4
ELT: Elastic Looped Transformers for Visual Generation

Paper • 2604.09168 • Published Apr 10 • 24
RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details

Paper • 2604.06870 • Published Apr 8 • 44
Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music

Paper • 2604.10905 • Published Apr 13 • 29
From Reasoning to Agentic: Credit Assignment in Reinforcement Learning for Large Language Models

Paper • 2604.09459 • Published Apr 13 • 14
Continuous Adversarial Flow Models

Paper • 2604.11521 • Published Apr 13 • 12
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Paper • 2604.12374 • Published Apr 14 • 37
Many-Tier Instruction Hierarchy in LLM Agents

Paper • 2604.09443 • Published Apr 10 • 16
Accelerating Speculative Decoding with Block Diffusion Draft Trees

Paper • 2604.12989 • Published Apr 14 • 8
LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling

Paper • 2604.11748 • Published Apr 15 • 14
Elucidating the SNR-t Bias of Diffusion Probabilistic Models

Paper • 2604.16044 • Published Apr 17 • 73
Maximal Brain Damage Without Data or Optimization: Disrupting Neural Networks via Sign-Bit Flips

Paper • 2502.07408 • Published Apr 16 • 59
Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language

Paper • 2604.19667 • Published Apr 21 • 23
CityRAG: Stepping Into a City via Spatially-Grounded Video Generation

Paper • 2604.19741 • Published Apr 21 • 17
Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges

Paper • 2604.13602 • Published Apr 15 • 32
Image Generators are Generalist Vision Learners

Paper • 2604.20329 • Published Apr 22 • 22
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Paper • 2604.24763 • Published Apr 27 • 71
Recursive Multi-Agent Systems

Paper • 2604.25917 • Published Apr 28 • 287
Efficient Training on Multiple Consumer GPUs with RoundPipe

Paper • 2604.27085 • Published Apr 29 • 47
Representation Fréchet Loss for Visual Generation

Paper • 2604.28190 • Published Apr 30 • 32
Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence

Paper • 2604.24954 • Published Apr 27 • 26
Synthetic Computers at Scale for Long-Horizon Productivity Simulation

Paper • 2604.28181 • Published Apr 30 • 20
Learning from Noisy Preferences: A Semi-Supervised Learning Approach to Direct Preference Optimization

Paper • 2604.24952 • Published Apr 27 • 6
ViPO: Visual Preference Optimization at Scale

Paper • 2604.24953 • Published Apr 29 • 3
From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills

Paper • 2604.24026 • Published Apr 27 • 22
Continuous Latent Diffusion Language Model

Paper • 2605.06548 • Published May 7 • 85
Leveraging Verifier-Based Reinforcement Learning in Image Editing

Paper • 2604.27505 • Published Apr 30 • 59
