👋 Open to Work

Zhimin Zhao PRO

zhiminy

https://zhimin-z.github.io

AI & ML interests

SE4AI, AI4SE, LLMOps, LLM4Code

Recent Activity

liked a model 5 days ago

SanDiegoDude/Cosmos3-Nano-nf4

liked a Space 12 days ago

nvidia/Cosmos3-Action-Viewer

upvoted a collection 12 days ago

Cosmos3

Omnimodal World Models for Physical AI • 16 items • Updated about 9 hours ago • 131

upvoted 2 papers 26 days ago

On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards

Paper • 2407.04065 • Published Jul 4, 2024 • 10

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

Paper • 2605.29801 • Published 29 days ago • 144

upvoted a paper about 1 month ago

Towards Evaluation Engineering: An Empirical Study of ML Evaluation Harnesses in the Wild

Paper • 2605.24213 • Published May 22 • 14

upvoted a changelog 2 months ago

Introducing Kernels

Hugging Face Changelog • Apr 15 • 201

upvoted 2 articles about 1 year ago

Open LLM Leaderboard: DROP deep dive

Article • clefourrier, cabreraalex, stellaathena, SaylorTwift, thomwolf +3 • Dec 1, 2023 • 11

What's going on with the Open LLM Leaderboard?

Article • clefourrier, SaylorTwift, slippylolo, thomwolf +2 • Jun 23, 2023 • 51

upvoted a paper about 1 year ago

ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations

Paper • 2504.00824 • Published Apr 1, 2025 • 43

upvoted an article over 1 year ago

Let's talk about LLM evaluation

Article • clefourrier • May 23, 2024 • 212

upvoted a collection over 1 year ago

Leaderboards and benchmarks ✨

Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 88 items • Updated Mar 2 • 120

upvoted a paper over 2 years ago

GAIA: a benchmark for General AI Assistants

Paper • 2311.12983 • Published Nov 21, 2023 • 249

upvoted 2 collections over 2 years ago

Open LLM Leaderboard best models ❤️‍🔥

A daily uploaded list of models with best evaluations on the LLM leaderboard: • 50 items • Updated Mar 13 • 694

The Big Benchmarks Collection

Gathering benchmark spaces on the hub (beyond the Open LLM Leaderboard) • 13 items • Updated Nov 18, 2024 • 267