Feedback_Conditional_Policy Collection • Collection for the paper "Language Models Can Learn from Verbal Feedback Without Scalar Rewards" (https://arxiv.org/pdf/2509.22638) • 7 items
Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics Paper • 2512.12602 • Published Dec 2025
GoRL: An Algorithm-Agnostic Framework for Online Reinforcement Learning with Generative Policies Paper • 2512.02581 • Published Dec 2, 2025
Long_CoT_Degradation_SFT Collection • Checkpoints for Long CoT Degradation • 61 items • Updated Nov 12, 2025
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published Oct 27, 2025
QueST: Incentivizing LLMs to Generate Difficult Problems Paper • 2510.17715 • Published Oct 20, 2025
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13, 2025
Imperceptible Jailbreaking against Large Language Models Paper • 2510.05025 • Published Oct 6, 2025
TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios Paper • 2505.12891 • Published May 19, 2025
MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use Paper • 2509.24002 • Published Sep 28, 2025
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always! Paper • 2509.26495 • Published Sep 30, 2025
Language Models Can Learn from Verbal Feedback Without Scalar Rewards Paper • 2509.22638 • Published Sep 26, 2025
Through the Valley: Path to Effective Long CoT Training for Small Language Models Paper • 2506.07712 • Published Jun 9, 2025
Grounded Persuasive Language Generation for Automated Marketing Paper • 2502.16810 • Published Feb 24, 2025