---
sdk: static
pinned: false
---

<div align="center">

# Software Engineering Arena

</div>

[Software Engineering Arena](https://huggingface.co/SWE-Arena) is an open-source initiative to transparently evaluate and track AI coding agents across real-world software engineering workflows. We provide interactive platforms, tracking systems, and novel metrics to advance the field of AI-assisted software development.

**Welcome collaboration from research labs, independent contributors, and the broader SE community!**

## 🏟️ [SWE-Model-Arena](https://github.com/Software-Engineering-Arena/SWE-Model-Arena): Interactive Multi-Round Model Evaluation

An interactive platform for evaluating foundation models through **pairwise comparisons** in multi-round conversational workflows. Unlike static benchmarks, SWE-Model-Arena enables:

- **Multi-round dialogues** reflecting real-world SE interactions
- **Repository-aware context** via RepoChat for authentic evaluations
- **Novel metrics** including model consistency score and conversation efficiency index
- **Transparent, open-source leaderboard** with advanced ranking algorithms
- **Code execution** across multiple languages in sandboxed environments

Perfect for researchers and engineers seeking nuanced, context-aware assessments of AI models on software engineering tasks.
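
The leaderboard is built from pairwise votes. As intuition for how such votes can become a ranking, here is a minimal Elo-style sketch. This is an illustration only, not the arena's actual ranking algorithm, and the model names in the example are hypothetical.

```python
from collections import defaultdict

def elo_update(r_a, r_b, score_a, k=32):
    """Update two Elo ratings from one pairwise comparison.
    score_a is 1.0 if A wins, 0.0 if B wins, 0.5 for a tie."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta

def rank_models(battles, initial=1000.0):
    """battles: iterable of (model_a, model_b, score_a) votes.
    Returns (model, rating) pairs, highest rating first."""
    ratings = defaultdict(lambda: initial)
    for a, b, score_a in battles:
        ratings[a], ratings[b] = elo_update(ratings[a], ratings[b], score_a)
    return sorted(ratings.items(), key=lambda kv: -kv[1])
```

Production arenas often prefer order-independent fits such as Bradley-Terry over all votes, since Elo ratings depend on the order in which votes arrive.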

[Try the arena on Hugging Face Spaces](https://huggingface.co/spaces/SWE-Arena/Software-Engineering-Arena)

## 📊 GitHub-Based Agent Tracking Suite

Evaluate AI coding agents through their actual GitHub activity with our comprehensive tracking systems:

### [SWE-Commit](https://github.com/Software-Engineering-Arena/SWE-Commit)

Track and analyze AI coding agents by their **GitHub commits**, measuring code quality, consistency, and contribution patterns.

[SWE-Commit leaderboard on Hugging Face Spaces](https://huggingface.co/spaces/SWE-Arena/SWE-Commit)

### [SWE-PR](https://github.com/Software-Engineering-Arena/SWE-PR)

Assess AI agents via their **pull request workflows**, examining merge success rates, discussion quality, and iterative improvements.

[SWE-PR leaderboard on Hugging Face Spaces](https://huggingface.co/spaces/SWE-Arena/SWE-PR)

### [SWE-Review](https://github.com/Software-Engineering-Arena/SWE-Review)

Evaluate AI agents through their **code review activity**, assessing feedback quality, issue identification, and collaborative capabilities.

[SWE-Review leaderboard on Hugging Face Spaces](https://huggingface.co/spaces/SWE-Arena/SWE-Review)

### [SWE-Issue](https://github.com/Software-Engineering-Arena/SWE-Issue)

Monitor how AI agents handle **issue tracking**, from bug reports to feature requests and documentation.

[SWE-Issue leaderboard on Hugging Face Spaces](https://huggingface.co/spaces/SWE-Arena/SWE-Issue)
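
As a sketch of how commit-based tracking can work, the snippet below pulls a repository's recent commits from GitHub's public REST API (`GET /repos/{owner}/{repo}/commits`) and tallies them per author. This illustrates the general approach, not SWE-Commit's actual pipeline; unauthenticated requests are subject to GitHub's rate limits.

```python
import json
from collections import Counter
from urllib.request import urlopen

def commits_by_author(commits):
    """Tally commit objects (as returned by GitHub's
    /repos/{owner}/{repo}/commits endpoint) per author."""
    counts = Counter()
    for c in commits:
        # Prefer the GitHub login; fall back to the git author name
        # when the commit is not linked to a GitHub account.
        author = (c.get("author") or {}).get("login") or c["commit"]["author"]["name"]
        counts[author] += 1
    return counts

def fetch_recent_commits(owner, repo, per_page=100):
    """Fetch a repository's most recent commits (unauthenticated)."""
    url = f"https://api.github.com/repos/{owner}/{repo}/commits?per_page={per_page}"
    with urlopen(url) as resp:
        return json.load(resp)
```

`commits_by_author` is a pure function, so per-agent statistics can be tested and recomputed offline from cached API responses.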

## 🎯 Our Mission

Software engineering extends far beyond code generation: it encompasses requirements engineering, collaborative design, code review, debugging, and project management. Current evaluation frameworks often focus narrowly on code completion or generation.

**Software Engineering Arena** provides:

- ✅ **Holistic evaluation** across diverse SE activities
- ✅ **Multi-turn interactions** matching real-world workflows
- ✅ **Transparent methodologies** for reproducible research
- ✅ **Open-source tools** for community-driven innovation
- ✅ **Rich datasets** to advance AI-assisted software development

## 🤝 Get Involved

We're actively seeking collaborators! Whether you're a:

- 🔬 **Researcher** developing new evaluation metrics
- 🛠️ **Engineer** building AI coding tools
- 📊 **Data scientist** analyzing model performance
- 🌐 **Open-source contributor** improving our platforms

**Ways to contribute:**

- Submit PRs to enhance our evaluation platforms
- Propose new metrics or tracking methodologies
- Share datasets or evaluation results
- Report issues and suggest improvements
- Join discussions in our repositories

## 📚 Learn More

- 📄 **Paper**: [SWE Arena: An Interactive Platform for Evaluating Foundation Models in Software Engineering](https://arxiv.org/abs/2502.01860)
- 🌐 **Platform**: [Try SWE-Model-Arena on Hugging Face](https://huggingface.co/spaces/SWE-Arena/SWE-Model-Arena)

## 📄 License

All projects under Software Engineering Arena are licensed under the **Apache 2.0 License**. All data we collect and open-source is released under the same license.

<div align="center">

**Building the future of AI-assisted software engineering, one evaluation at a time.**

</div>