Xuejia Chen

Gresham429

https://gresham429.github.io/

AI & ML interests

LLM

Recent Activity

upvoted a paper about 2 months ago

Auditing Agent Harness Safety

Paper • 2605.14271 • Published May 14 • 55

updated 2 datasets 12 months ago

TreeAILab/NumericBench

Viewer • Updated Aug 1, 2025 • 43.3k • 228 • 1

TreeAILab/Multi-turn_Long-context_Benchmark_for_LLMs

Viewer • Updated Aug 1, 2025 • 7.25k • 140

New activity in TreeAILab/Multi-turn_Long-context_Benchmark_for_LLMs 12 months ago

Improve dataset card: Add library_name, license, benchmark tag, GitHub link, and sample usage

#3 opened 12 months ago by

Improve dataset card: Add paper link, update name, expand configs, and enhance description

#1 opened 12 months ago by

published a dataset 12 months ago

TreeAILab/Multi-turn_Long-context_Benchmark_for_LLMs

Viewer • Updated Aug 1, 2025 • 7.25k • 140

liked a dataset over 1 year ago

TreeAILab/NumericBench

Viewer • Updated Aug 1, 2025 • 43.3k • 228 • 1

updated 2 datasets over 1 year ago

TreeAILab/Multi-turn_Long-context_Benchmark_for_LLMs

Viewer • Updated Aug 1, 2025 • 7.25k • 140

TreeAILab/Multi-turn_Long-context_Benchmark_for_LLMs

Viewer • Updated Aug 1, 2025 • 7.25k • 140