Green skin

Green-skin

AI & ML interests

None yet

Recent Activity

liked a dataset 4 months ago

SWE-Arena/vote_data

Organizations

None yet

upvoted a paper about 16 hours ago

Towards Evaluation Engineering: An Empirical Study of ML Evaluation Harnesses in the Wild

Paper • 2605.24213 • Published 6 days ago • 6

liked a dataset about 16 hours ago

zhiminy/EvalEng

Viewer • Updated about 16 hours ago • 19.6k • 5

liked 4 datasets 4 months ago

liked 3 Spaces 4 months ago

README

🔬

SWE-Community

🌐

Track GitHub community statistics for SWE assistants

SWE-Release

📢

Track GitHub releases statistics for SWE assistants

liked 5 Spaces 7 months ago

Awesome Foundation Model Leaderboard Search

💻

The search tool of Awesome Foundation Model Leaderboard List

Awesome Production Machine Learning Search

🔥

The search tool of Awesome Production Machine Learning

SWE-PR

⚙

Track GitHub PR, review & commit stats for SWE agents

SWE-Issue

❓

Track GitHub issue statistics for SWE assistants

SWE-Chatbot-Arena

🎯

Chatbot arena for software engineering tasks