P7n2c1dvlqk6

p7n2c1dvlqk6

AI & ML interests

None yet

Recent Activity

liked a model about 3 hours ago

Rifky/netra-smoke3

liked a dataset 1 day ago

Congliu/Chinese-DeepSeek-R1-Distill-data-110k

upvoted a paper 4 days ago

Mix-MoE: Improving Multilingual Machine Translation of Large Language Models through Mixed MoEs

Organizations

None yet

upvoted a paper 4 days ago

Mix-MoE: Improving Multilingual Machine Translation of Large Language Models through Mixed MoEs

Paper • 2605.24681 • Published 12 days ago • 5

upvoted a paper 5 days ago

Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players

Paper • 2605.28816 • Published 8 days ago • 419

upvoted a paper 7 days ago

DVAO: Dynamic Variance-adaptive Advantage Optimization for Multi-reward Reinforcement Learning

Paper • 2605.25604 • Published 10 days ago • 134

upvoted a paper 12 days ago

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

Paper • 2605.11609 • Published 23 days ago • 195

upvoted a paper 13 days ago

DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards

Paper • 2605.21467 • Published 15 days ago • 204

upvoted a paper 14 days ago

OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond

Paper • 2605.19660 • Published 16 days ago • 40

upvoted a paper 24 days ago

Who Prices Cognitive Labor in the Age of Agents? Compute-Anchored Wages

Paper • 2605.05558 • Published 27 days ago • 3

upvoted a paper about 1 month ago

Leveraging Verifier-Based Reinforcement Learning in Image Editing

Paper • 2604.27505 • Published Apr 30 • 57

upvoted 3 papers about 2 months ago

Experience Transfer for Multimodal LLM Agents in Minecraft Game

Paper • 2604.05533 • Published Apr 7 • 16

Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published Apr 2 • 504

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

Paper • 2604.02721 • Published Apr 3 • 630

upvoted 4 papers 2 months ago

KAT-Coder-V2 Technical Report

Paper • 2603.27703 • Published Mar 29 • 12

Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development

Paper • 2603.27460 • Published Mar 29 • 70

CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence

Paper • 2603.28032 • Published Mar 30 • 343

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 352

upvoted 5 papers 3 months ago

Efficient Reasoning with Balanced Thinking

Paper • 2603.12372 • Published Mar 12 • 150

InCoder-32B: Code Foundation Model for Industrial Scenarios

Paper • 2603.16790 • Published Mar 17 • 311

HSImul3R: Physics-in-the-Loop Reconstruction of Simulation-Ready Human-Scene Interactions

Paper • 2603.15612 • Published Mar 16 • 153

Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

Paper • 2603.04597 • Published Mar 4 • 211

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published Feb 11 • 221
