💼 Hiring

Zhangchen Xu PRO

zhangchenxu

22 47 192

https://zhangchenxu.com/

AI & ML interests

LLM Data, Alignment, Post-Training, Safety

Recent Activity

authored a paper 15 days ago

SOSBENCH: Benchmarking Safety Alignment on Scientific Knowledge

authored a paper 15 days ago

Building a Foundational Guardrail for General Agentic Systems via Synthetic Data

authored a paper 15 days ago

PersonaMem-v2: Towards Personalized Intelligence via Learning Implicit User Personas and Agentic Memory



upvoted a paper 28 days ago

AutoLab: Can Frontier Models Solve Long-Horizon Auto Research and Engineering Tasks?

Paper • 2606.05080 • Published 29 days ago • 30

upvoted a paper about 2 months ago

Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty?

Paper • 2605.12684 • Published May 12 • 11

upvoted a paper 3 months ago

Emergent Social Intelligence Risks in Generative Multi-Agent Systems

Paper • 2603.27771 • Published Mar 29 • 52

upvoted an article 4 months ago

Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty?

Article • zhangchenxu • Feb 25 • 14

upvoted a paper 5 months ago

ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents

Paper • 2601.12294 • Published Jan 18 • 19

upvoted a paper 8 months ago

Efficient Long-context Language Model Training by Core Attention Disaggregation

Paper • 2510.18121 • Published Oct 20, 2025 • 124

upvoted a paper 9 months ago

Building a Foundational Guardrail for General Agentic Systems via Synthetic Data

Paper • 2510.09781 • Published Oct 10, 2025 • 27

upvoted an article 9 months ago

BigCodeArena: Judging code generations end to end with code executions

Article • bigcode • Oct 7, 2025 • 21

upvoted 3 papers 9 months ago

upvoted an article 12 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 780

upvoted a paper about 1 year ago

Magistral

Paper • 2506.10910 • Published Jun 12, 2025 • 69

upvoted a collection about 1 year ago

TinyV

Collection • 8 items • Updated Jun 22, 2025 • 1

upvoted 3 papers about 1 year ago

VisualSphinx: Large-Scale Synthetic Vision Logic Puzzles for RL

Paper • 2505.23977 • Published May 29, 2025 • 10

Personalized Safety in LLMs: A Benchmark and A Planning-Based Agent Approach

Paper • 2505.18882 • Published May 24, 2025 • 15

TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning

Paper • 2505.14625 • Published May 20, 2025 • 13

upvoted an article over 1 year ago

Open R1: Update #3

Article • open-r1 • Mar 11, 2025 • 298

upvoted a paper over 1 year ago

KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding

Paper • 2503.02951 • Published Mar 4, 2025 • 34

upvoted a collection over 1 year ago

KodCode-V1

Collection • 5 items • Updated Mar 2 • 5

KodCode-V1 is the largest fully-synthetic open-source dataset providing verifiable solutions and tests for coding tasks.
