Xiao Liu

lx865712528

https://xiaoliunlc.github.io/

AI & ML interests

NLP, LLM and reasoning

Recent Activity

liked a dataset 29 days ago

nvidia/Nemotron-CC-v2.1

upvoted a paper 3 months ago

Mnemis: Dual-Route Retrieval on Hierarchical Graphs for Long-Term LLM Memory

authored a paper 3 months ago

Improving Data and Reward Design for Scientific Reasoning in Large Language Models

upvoted 5 papers 3 months ago

Mnemis: Dual-Route Retrieval on Hierarchical Graphs for Long-Term LLM Memory

Paper • 2602.15313 • Published Feb 17 • 3

Improving Data and Reward Design for Scientific Reasoning in Large Language Models

Paper • 2602.08321 • Published Feb 9 • 44

The Era of Agentic Organization: Learning to Organize with Language Models

Paper • 2510.26658 • Published Oct 30, 2025 • 29

MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration

Paper • 2602.01734 • Published Feb 2 • 32

Chain Of Thought Compression: A Theoretical Analysis

Paper • 2601.21576 • Published Jan 29 • 20

upvoted a paper 6 months ago

DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle

Paper • 2512.04324 • Published Dec 3, 2025 • 159

upvoted 4 papers 7 months ago

Beyond Length: Quantifying Long-Range Information for Long-Context LLM Pretraining Data

Paper • 2510.25804 • Published Oct 29, 2025 • 1

Knocking-Heads Attention

Paper • 2510.23052 • Published Oct 27, 2025 • 30

Learning from the Best, Differently: A Diversity-Driven Rethinking on Data Selection

Paper • 2510.18909 • Published Oct 21, 2025 • 5

Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training

Paper • 2510.08008 • Published Oct 9, 2025 • 6

upvoted a paper 8 months ago

Behind RoPE: How Does Causal Mask Encode Positional Information?

Paper • 2509.21042 • Published Sep 25, 2025 • 9

upvoted a paper 10 months ago

Data Mixing Agent: Learning to Re-weight Domains for Continual Pre-training

Paper • 2507.15640 • Published Jul 21, 2025 • 5

upvoted a paper 12 months ago

TL;DR: Too Long, Do Re-weighting for Efficient LLM Reasoning Compression

Paper • 2506.02678 • Published Jun 3, 2025 • 5

upvoted 3 papers about 1 year ago

GATE: Graph-based Adaptive Tool Evolution Across Diverse Tasks

Paper • 2502.14848 • Published Feb 20, 2025 • 1

Overcoming Vocabulary Mismatch: Vocabulary-agnostic Teacher Guided Language Modeling

Paper • 2503.19123 • Published Mar 24, 2025 • 2

Process-based Self-Rewarding Language Models

Paper • 2503.03746 • Published Mar 5, 2025 • 39

upvoted 3 papers over 1 year ago

Optimizing Large Language Model Training Using FP4 Quantization

Paper • 2501.17116 • Published Jan 28, 2025 • 36

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Paper • 2501.13629 • Published Jan 23, 2025 • 48

EpiCoder: Encompassing Diversity and Complexity in Code Generation

Paper • 2501.04694 • Published Jan 8, 2025 • 18