👋 Open to Work

Zhimin Zhao PRO

zhiminy

https://zhimin-z.github.io

AI & ML interests

SE4AI, AI4SE, LLMOps, LLM4Code

Recent Activity

liked a model 4 days ago

SanDiegoDude/Cosmos3-Nano-nf4

Text-to-Image • Updated 22 days ago • 313 • 2

liked a Space 10 days ago

nvidia/Cosmos3-Action-Viewer

🤖

Explore interactive visualizations with Viser

upvoted a collection 10 days ago

Cosmos3

Collection

Omnimodal World Models for Physical AI • 15 items • Updated 13 days ago • 131

liked a model 22 days ago

nvidia/Cosmos3-Super

65B • Updated 2 days ago • 78.3k • 184

upvoted 2 papers 25 days ago

On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards

Paper • 2407.04065 • Published Jul 4, 2024 • 10

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

Paper • 2605.29801 • Published 28 days ago • 144

authored 2 papers 27 days ago

On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards

Paper • 2407.04065 • Published Jul 4, 2024 • 10

Towards Evaluation Engineering: An Empirical Study of ML Evaluation Harnesses in the Wild

Paper • 2605.24213 • Published May 22 • 14

updated 3 datasets 28 days ago

updated a Space 28 days ago

SWE-Agent-Arena

⚔

Agent arena for software engineering tasks

updated a dataset 28 days ago

SWE-Arena/cli_data

Viewer • Updated 28 days ago • 7 • 53 • 1

liked a dataset 29 days ago

zhiminy/EvalEng

Viewer • Updated 29 days ago • 19.6k • 163 • 5

updated a dataset 29 days ago

zhiminy/EvalEng

Viewer • Updated 29 days ago • 19.6k • 163 • 5

published a dataset 29 days ago

zhiminy/EvalEng

Viewer • Updated 29 days ago • 19.6k • 163 • 5

upvoted a paper 29 days ago

Towards Evaluation Engineering: An Empirical Study of ML Evaluation Harnesses in the Wild

Paper • 2605.24213 • Published May 22 • 14

submitted a paper to Daily Papers 29 days ago

Towards Evaluation Engineering: An Empirical Study of ML Evaluation Harnesses in the Wild

Paper • 2605.24213 • Published May 22 • 14

upvoted a changelog 2 months ago

Hugging Face Changelog

Introducing Kernels

Apr 15 • 200

updated a Space 3 months ago

SWE-Chatbot-Arena

🎯

Chatbot arena for software engineering tasks
