princeton-nlp's Collections

SWE-bench

updated Mar 8, 2025

SWE-bench is a benchmark for evaluating language models and AI systems on their ability to resolve real-world GitHub issues.

  • princeton-nlp/SWE-bench

    Updated Mar 3, 2025 • 21.5k rows • 40.5k downloads • 140 likes

  • princeton-nlp/SWE-bench_Lite

    Updated Mar 3, 2025 • 323 rows • 101k downloads • 60 likes

  • princeton-nlp/SWE-bench_Multimodal

    Updated Jan 13, 2025 • 612 rows • 4.24k downloads • 21 likes

  • princeton-nlp/SWE-bench_Verified

    Updated Feb 18, 2025 • 500 rows • 800k downloads • 343 likes
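
The four datasets listed in this collection can be pulled from the Hub by repo id. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed. The short variant names, the helper functions `repo_id` and `load_variant`, and the default split name `"test"` are illustrative assumptions and are not taken from the collection page; check each dataset card for the splits it actually provides.

```python
# Repo ids copied from the collection listing; the short keys are hypothetical.
SWE_BENCH_REPOS = {
    "full": "princeton-nlp/SWE-bench",
    "lite": "princeton-nlp/SWE-bench_Lite",
    "multimodal": "princeton-nlp/SWE-bench_Multimodal",
    "verified": "princeton-nlp/SWE-bench_Verified",
}


def repo_id(variant: str) -> str:
    """Map a short variant name (hypothetical) to its Hub repo id."""
    key = variant.strip().lower()
    if key not in SWE_BENCH_REPOS:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(SWE_BENCH_REPOS)}"
        )
    return SWE_BENCH_REPOS[key]


def load_variant(variant: str, split: str = "test"):
    """Download one variant from the Hub.

    Needs the `datasets` package and network access. The default split
    name "test" is an assumption; consult the dataset card.
    """
    # Imported lazily so the name mapping above works without `datasets`.
    from datasets import load_dataset

    return load_dataset(repo_id(variant), split=split)
```

Calling `load_variant("lite")` would fetch `princeton-nlp/SWE-bench_Lite`, the smallest text-only set in the list, which may be a convenient starting point before moving to the larger variants.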