🏗️ Building on HF

Nathan Habib PRO

SaylorTwift

huggingface

AI & ML interests

Evals

Recent Activity

updated a bucket about 6 hours ago

SaylorTwift/harbor-jobs

published a bucket about 6 hours ago

SaylorTwift/harbor-jobs

new activity about 10 hours ago

skt/A.X-K1: Add evaluation results

New activity in skt/A.X-K1 about 10 hours ago

Add evaluation results

#8 opened 4 days ago by

New activity in deepreinforce-ai/Ornith-1.0-397B 3 days ago

Add evaluation results

#5 opened 3 days ago by

New activity in deepreinforce-ai/Ornith-1.0-9B 3 days ago

Add evaluation results (including Claw-eval)

#3 opened 3 days ago by

New activity in deepreinforce-ai/Ornith-1.0-35B 3 days ago

Add evaluation results (including Claw-eval)

#8 opened 3 days ago by

New activity in deepreinforce-ai/Ornith-1.0-397B 3 days ago

Add evaluation results (including Claw-eval)

#6 opened 3 days ago by

New activity in deepreinforce-ai/Ornith-1.0-9B 3 days ago

Add evaluation results

#2 opened 3 days ago by

New activity in deepreinforce-ai/Ornith-1.0-35B 3 days ago

Add evaluation results

#7 opened 3 days ago by

New activity in LiquidAI/LFM2.5-230M 3 days ago

Add evaluation results for GPQA and MMLU-Pro

#6 opened 3 days ago by

Add evaluation results for GPQA and MMLU-Pro

#7 opened 3 days ago by

Add evaluation results for GPQA and MMLU-Pro

#5 opened 3 days ago by

Update evaluation results — remove comments

#4 opened 3 days ago by

Add evaluation results

#3 opened 3 days ago by

New activity in WeiboAI/VibeThinker-3B 6 days ago

Add evaluation results from model card benchmark tables

#21 opened 6 days ago by

New activity in Qwen/Qwen3.6-27B 6 days ago

Update Terminal Bench 2.0 evaluation notes with methodology details

#45 opened 6 days ago by

New activity in allenai/tmax-4b 6 days ago

Add Terminal Bench 2.0 evaluation result

#1 opened 6 days ago by

New activity in allenai/tmax-9b 6 days ago

Add Terminal Bench 2.0 evaluation result

#1 opened 6 days ago by

New activity in allenai/tmax-2b 6 days ago

Add Terminal Bench 2.0 evaluation result

#1 opened 6 days ago by

New activity in MiniMaxAI/MiniMax-M3 6 days ago

Fix YC-Bench score: use full value 2100000 instead of 2.1

#18 opened 6 days ago by

Add evaluation results from model card benchmark graph

#17 opened 6 days ago by

New activity in zai-org/GLM-5.2 6 days ago

glm-5.2-RedlineBench

#19 opened 7 days ago by