reward-scaling

non-profit

Recent Activity

hlzhang109 authored a paper about 1 month ago

Towards Principled Disentanglement for Domain Generalization

hlzhang109 authored a paper about 1 month ago

Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation

hlzhang109 authored a paper about 1 month ago

The Impact of Symbolic Representations on In-context Learning for Few-shot Reasoning

authored 9 papers about 1 month ago

Towards Principled Disentanglement for Domain Generalization

Paper • 2111.13839 • Published Nov 27, 2021

Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation

Paper • 2202.01336 • Published Feb 2, 2022

The Impact of Symbolic Representations on In-context Learning for Few-shot Reasoning

Paper • 2212.08686 • Published Dec 16, 2022

How Does Critical Batch Size Scale in Pre-training?

Paper • 2410.21676 • Published Oct 29, 2024

Discovering Hierarchical Latent Capabilities of Language Models via Causal Representation Learning

Paper • 2506.10378 • Published Jun 12, 2025 • 2

EvoLM: In Search of Lost Language Model Training Dynamics

Paper • 2506.16029 • Published Jun 19, 2025

AlgoTune: Can Language Models Speed Up General-Purpose Numerical Programs?

Paper • 2507.15887 • Published Jul 19, 2025

Weight Decay Improves Language Model Plasticity

Paper • 2602.11137 • Published Feb 11 • 2

Scaling Reward Modeling without Human Supervision

Paper • 2603.02225 • Published Feb 11

authored a paper about 2 months ago

Prescriptive Scaling Reveals the Evolution of Language Model Capabilities

Paper • 2602.15327 • Published Feb 17 • 3

submitted a paper to Daily Papers about 2 months ago

Prescriptive Scaling Reveals the Evolution of Language Model Capabilities

Paper • 2602.15327 • Published Feb 17 • 3

submitted a paper to Daily Papers 2 months ago

Weight Decay Improves Language Model Plasticity

Paper • 2602.11137 • Published Feb 11 • 2

authored 4 papers 2 months ago

HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics

Paper • 2410.09988 • Published Oct 13, 2024

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24, 2025 • 77

User-Assistant Bias in LLMs

Paper • 2508.15815 • Published Aug 16, 2025

Diffusion-Inspired Masked Fine-Tuning for Knowledge Injection in Autoregressive LLMs

Paper • 2510.09885 • Published Oct 10, 2025

updated a dataset 2 months ago

reward-scaling/infiwebmath_1536chunk_part1_part2_11M

Viewer • Updated Feb 10 • 7k • 12

published a dataset 2 months ago

reward-scaling/infiwebmath_1536chunk_part1_part2_11M

Viewer • Updated Feb 10 • 7k • 12

updated a dataset 2 months ago

reward-scaling/finemath_1536chunk_part1_part2_11M

Viewer • Updated Feb 10 • 7k • 9