Qihan Ren

jasonrqh

https://nebularaid2000.github.io/

AI & ML interests

XAI, LLM reasoning & safety, Coding agent

Recent Activity

upvoted a paper 5 days ago

Scaling the Horizon, Not the Parameters: Reaching Trillion-Parameter Performance with a 35B Agent

Paper • 2606.30616 • Published 7 days ago • 86

upvoted a paper 9 days ago

The Verification Horizon: No Silver Bullet for Coding Agent Rewards

Paper • 2606.26300 • Published 12 days ago • 47

upvoted a paper 16 days ago

FreeStyle: Free Control of Style-Content Dual-Reference Generation from Community LoRA Mining

Paper • 2606.20506 • Published 18 days ago • 28

upvoted 2 papers 23 days ago

MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling

Paper • 2606.13473 • Published 25 days ago • 92

MiniMax Sparse Attention

Paper • 2606.13392 • Published 25 days ago • 148

upvoted 3 papers about 1 month ago

COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation

Paper • 2605.31264 • Published May 29 • 123

The MiniMax-M2 Series: Mini Activations Unleashing Max Real-World Intelligence

Paper • 2605.26494 • Published May 26 • 41

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

Paper • 2605.29801 • Published May 28 • 144

upvoted 3 papers about 2 months ago

MMSkills: Towards Multimodal Skills for General Visual Agents

Paper • 2605.13527 • Published May 14 • 122

OpenSeeker-v2: Pushing the Limits of Search Agents with Informative and High-Difficulty Trajectories

Paper • 2605.04036 • Published May 5 • 72

ARIS: Autonomous Research via Adversarial Multi-Agent Collaboration

Paper • 2605.03042 • Published May 4 • 141

upvoted 3 papers 3 months ago

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published Apr 15 • 168

Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization

Paper • 2604.09574 • Published Feb 24 • 30

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 113

upvoted an article 3 months ago

Red Teaming with RL: Exploiting Tinker API for Harmful RL on 235B Model

Article • georgefen • Jan 1 • 19

upvoted 5 papers 3 months ago

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

Paper • 2604.06628 • Published Apr 8 • 329

Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

Paper • 2603.24472 • Published Mar 25 • 57

On the Role of Reasoning Patterns in the Generalization Discrepancy of Long Chain-of-Thought Supervised Fine-Tuning

Paper • 2604.01702 • Published Apr 4 • 3

SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds

Paper • 2604.08544 • Published Apr 9 • 16

ATBench: A Diverse and Realistic Trajectory Benchmark for Long-Horizon Agent Safety

Paper • 2604.02022 • Published Apr 2 • 15
