---
title: Git Conflict Resolver
emoji: 🔀
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
tags: ["openenv"]
---
# 🔀 Git Conflict Resolver — OpenEnv Environment

An RL environment where AI agents learn to resolve Git merge conflicts.  
Built for the **OpenEnv Hackathon** (Meta × Hugging Face × Scaler).

---

## 🧠 Environment Description & Motivation

Merge conflicts are a daily reality for every software team. Resolving them correctly requires understanding both the syntactic structure of code **and** the semantic intent of diverging branches — a genuinely hard task for AI agents.

This environment presents agents with real Python files containing `<<<<<<< HEAD`, `=======`, and `>>>>>>>` conflict markers. The agent must produce a clean, fully resolved file — no markers, valid syntax, and correct logic.

Unlike toy environments, this simulates:
- **Easy conflicts**: Accept an obvious incoming change (e.g. updated timeout value)
- **Medium conflicts**: Apply different resolution strategies per conflict block
- **Hard conflicts**: Combine additive changes from *both* branches (not just pick one)

---

## 📐 Action & Observation Space

### Observation Space (structured JSON)

| Field | Type | Description |
|-------|------|-------------|
| `task_name` | string | Current task identifier |
| `task_description` | string | Natural language instructions for the agent |
| `filename` | string | Name of the file being resolved |
| `file_language` | string | Language of the file (`python`, `text`) |
| `conflicted_content` | string | Full file content with conflict markers |
| `branch_ours` | string | Name of the HEAD (current) branch |
| `branch_theirs` | string | Name of the incoming branch |
| `num_conflicts` | integer | Number of `<<<<<<<` blocks in the file |
| `last_attempt` | string \| null | Agent's previous resolution (for retry) |
| `last_error` | string \| null | Grading feedback from last step |
| `step` | integer | Current step number |
| `max_steps` | integer | Maximum allowed steps (10) |
| `done` | boolean | Whether the episode is finished |

### Action Space

```json
{
  "resolved_content": "<full resolved file as a string>"
}
```

The agent outputs the **complete file content** with **all conflict markers removed**.

---

## 📋 Task Descriptions

### Task 1: `single_conflict` — Easy
- **File:** `config.py`
- **Conflicts:** 1 block
- **Description:** A timeout value was changed from 30s to 60s on a feature branch. The agent must accept the incoming change.
- **Expected difficulty:** Any capable LLM should solve this in 1–2 steps.

### Task 2: `multi_conflict` — Medium
- **File:** `user_service.py`
- **Conflicts:** 3 blocks
- **Description:** Authentication was refactored. Each block requires a different resolution: accept new import, keep original constant, accept new function implementation.
- **Expected difficulty:** Requires reading context across blocks.

### Task 3: `logic_conflict` — Hard
- **File:** `data_pipeline.py`
- **Conflicts:** 2 blocks
- **Description:** Both branches added valid, additive features. The agent **must combine** them — not simply pick one side. Requires understanding code semantics.
- **Expected difficulty:** Frontier models (GPT-4, Qwen-72B) score ~0.5–0.7 without specific tuning.

---

## 🏆 Reward Function

The reward is **shaped** — agents get feedback at every step, not just at the end.

| Signal | Value | Trigger |
|--------|-------|---------|
| Improvement bonus | `+0.75 × score_delta` | Score improves over previous step |
| Marker-free bonus | `+0.10` | First time no markers remain |
| Perfect match bonus | `+0.25` | Score reaches 1.0 |
| Stagnation penalty | `-0.10` | Identical submission as previous step |
| Step cost | `-0.01 × (step/max_steps)` | Every step |

### Grading Breakdown (per step)

| Component | Score | Criterion |
|-----------|-------|-----------|
| `no_markers` | 0.25 | No `<<<<<<<`, `=======`, `>>>>>>>` in output |
| `valid_syntax` | 0.25 | File parses as valid Python (AST check) |
| `similarity` | 0.25 | Fuzzy match ratio vs. expected resolution |
| `exact_match` | 0.25 | Character-exact match with expected output |

---

## 🚀 Setup & Usage

### Local Setup

```bash
# 1. Install dependencies
pip install -r server/requirements.txt

# 2. Start the server
uvicorn server.main:app --host 0.0.0.0 --port 7860 --app-dir server

# 3. Test the endpoints
curl -X POST http://localhost:7860/reset \
  -H "Content-Type: application/json" \
  -d '{"task": "single_conflict"}'

curl -X POST http://localhost:7860/step \
  -H "Content-Type: application/json" \
  -d '{"resolved_content": "# your resolved content here"}'

curl http://localhost:7860/state
```

### Docker

```bash
# Build
docker build -t git-conflict-resolver .

# Run
docker run -p 7860:7860 git-conflict-resolver

# Verify
curl http://localhost:7860/health
```

### Run Baseline Inference

```bash
export HF_TOKEN=your_token_here
export API_BASE_URL=https://router.huggingface.co/v1
export MODEL_NAME=Qwen/Qwen2.5-72B-Instruct
export ENV_URL=http://localhost:7860

python inference.py
```

---

## 📊 Baseline Scores

> Scores obtained using `Qwen/Qwen2.5-72B-Instruct` via HF Inference API.

| Task | Score | Steps | Success |
|------|-------|-------|---------|
| `single_conflict` | 1.00 | 1 | ✅ |
| `multi_conflict` | 0.75 | 3 | ❌ |
| `logic_conflict` | 0.50 | 5 | ❌ |
| **Average** | **0.75** | — | — |

---

## 📁 Project Structure

```
openenv_hackathon/
├── server/
│   ├── main.py          # FastAPI server — /reset /step /state /health
│   ├── env.py           # Core environment logic (reset/step/state/close)
│   ├── models.py        # Pydantic Observation, Action, Reward models
│   ├── tasks.py         # Task definitions (3 tasks with conflict content)
│   ├── graders.py       # Deterministic graders (marker, AST, similarity, exact)
│   ├── reward.py        # Shaped reward function
│   └── requirements.txt
├── inference.py         # Baseline inference script (root — required)
├── openenv.yaml         # OpenEnv metadata (required for openenv validate)
├── Dockerfile           # Container build (port 7860 for HF Spaces)
└── README.md
```

---

## 🔧 API Reference

### `POST /reset`
Start a new episode.
```json
{ "task": "single_conflict" }
```
Returns: `ConflictObservation`

### `POST /step`
Submit a conflict resolution.
```json
{ "resolved_content": "# full resolved file..." }
```
Returns: `{ observation, reward, done, info }`

### `GET /state`
Returns current episode state (step, total_reward, history).

### `GET /health`
Returns `{ "status": "ok" }` — used for HF Space validation.

### `GET /tasks`
Returns `{ "tasks": ["single_conflict", "multi_conflict", "logic_conflict"] }`

---

## 👥 Team

**Agent Smith** — OpenEnv Hackathon, April 2026

- Ganesh Doosa (Team Lead)
- Gajula Akanksha
- Yashwanth Kumar