title: Git Conflict Resolver
emoji: 🔀
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
tags:
  - openenv

🔀 Git Conflict Resolver: OpenEnv Environment

An RL environment where AI agents learn to resolve Git merge conflicts.
Built for the OpenEnv Hackathon (Meta × Hugging Face × Scaler).


🧠 Environment Description & Motivation

Merge conflicts are a daily reality for every software team. Resolving them correctly requires understanding both the syntactic structure of code and the semantic intent of diverging branches, which is a genuinely hard task for AI agents.

This environment presents agents with real Python files containing <<<<<<< HEAD, =======, and >>>>>>> conflict markers. The agent must produce a clean, fully resolved file: no markers, valid syntax, and correct logic.

Unlike toy environments, this simulates:

  • Easy conflicts: Accept an obvious incoming change (e.g. updated timeout value)
  • Medium conflicts: Apply different resolution strategies per conflict block
  • Hard conflicts: Combine additive changes from both branches (not just pick one)
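The per-block strategies listed above can be illustrated with a small marker parser. This is a minimal sketch, not the environment's own code: the function names, the strategy labels (`ours`, `theirs`, `both`), and the sample file with its `feature/timeout` branch name are all assumptions made for illustration. Note that `both` only concatenates the two sides, whereas the hard task asks for a semantically correct combination.

```python
from typing import List

OURS, SEP, THEIRS = "<<<<<<<", "=======", ">>>>>>>"


def count_conflicts(content: str) -> int:
    """Count conflict blocks by counting lines that open one."""
    return sum(line.startswith(OURS) for line in content.splitlines())


def resolve(content: str, strategies: List[str]) -> str:
    """Resolve block i with strategies[i]: 'ours', 'theirs', or 'both'."""
    out, ours, theirs = [], [], []
    state, block = "normal", 0  # state is one of: normal, ours, theirs
    for line in content.splitlines():
        if state == "normal" and line.startswith(OURS):
            state, ours, theirs = "ours", [], []
        elif state == "ours" and line.startswith(SEP):
            state = "theirs"
        elif state == "theirs" and line.startswith(THEIRS):
            choice = strategies[block]
            if choice not in ("ours", "theirs", "both"):
                raise ValueError(f"unknown strategy: {choice}")
            if choice in ("ours", "both"):
                out.extend(ours)
            if choice in ("theirs", "both"):
                out.extend(theirs)
            state, block = "normal", block + 1
        elif state == "ours":
            ours.append(line)
        elif state == "theirs":
            theirs.append(line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical sample in the spirit of the easy task (timeout 30 -> 60).
conflicted = (
    "HOST = 'localhost'\n"
    "<<<<<<< HEAD\n"
    "TIMEOUT = 30\n"
    "=======\n"
    "TIMEOUT = 60\n"
    ">>>>>>> feature/timeout\n"
)
print(count_conflicts(conflicted))
print(resolve(conflicted, ["theirs"]))
```

A rule-based resolver like this can only pick or concatenate sides, so it serves mainly as a baseline to compare a learned agent against.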

📐 Action & Observation Space

Observation Space (structured JSON)

| Field | Type | Description |
|---|---|---|
| task_name | string | Current task identifier |
| task_description | string | Natural language instructions for the agent |
| filename | string | Name of the file being resolved |
| file_language | string | Language of the file (python, text) |
| conflicted_content | string | Full file content with conflict markers |
| branch_ours | string | Name of the HEAD (current) branch |
| branch_theirs | string | Name of the incoming branch |
| num_conflicts | integer | Number of <<<<<<< blocks in the file |
| last_attempt | string or null | Agent's previous resolution (for retry) |
| last_error | string or null | Grading feedback from last step |
| step | integer | Current step number |
| max_steps | integer | Maximum allowed steps (10) |
| done | boolean | Whether the episode is finished |
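On the client side, the fields above map naturally onto a typed record. The sketch below uses a standard-library dataclass as a stand-in; the server itself is described as using Pydantic models, whose exact definitions are not shown here. The sample payload values (branch names, description text) are invented for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConflictObservation:
    # Field names and types follow the observation table.
    task_name: str
    task_description: str
    filename: str
    file_language: str
    conflicted_content: str
    branch_ours: str
    branch_theirs: str
    num_conflicts: int
    step: int
    max_steps: int
    done: bool
    last_attempt: Optional[str] = None  # null before the first attempt
    last_error: Optional[str] = None    # null when there is no feedback yet


def parse_observation(payload: str) -> ConflictObservation:
    """Build an observation from a JSON string; unknown keys raise TypeError."""
    return ConflictObservation(**json.loads(payload))


# Hypothetical payload; real values come from the environment.
sample = json.dumps({
    "task_name": "single_conflict",
    "task_description": "Accept the incoming timeout change.",
    "filename": "config.py",
    "file_language": "python",
    "conflicted_content": "<<<<<<< HEAD\nTIMEOUT = 30\n=======\nTIMEOUT = 60\n>>>>>>> feature/timeout\n",
    "branch_ours": "main",
    "branch_theirs": "feature/timeout",
    "num_conflicts": 1,
    "last_attempt": None,
    "last_error": None,
    "step": 0,
    "max_steps": 10,
    "done": False,
})
obs = parse_observation(sample)
print(obs.task_name, obs.num_conflicts, obs.done)
```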

Action Space

{
  "resolved_content": "<full resolved file as a string>"
}

The agent outputs the complete file content with all conflict markers removed.


📋 Task Descriptions

Task 1: single_conflict (Easy)

  • File: config.py
  • Conflicts: 1 block
  • Description: A timeout value was changed from 30s to 60s on a feature branch. The agent must accept the incoming change.
  • Expected difficulty: Any capable LLM should solve this in 1–2 steps.

Task 2: multi_conflict (Medium)

  • File: user_service.py
  • Conflicts: 3 blocks
  • Description: Authentication was refactored. Each block requires a different resolution: accept new import, keep original constant, accept new function implementation.
  • Expected difficulty: Requires reading context across blocks.

Task 3: logic_conflict (Hard)

  • File: data_pipeline.py
  • Conflicts: 2 blocks
  • Description: Both branches added valid, additive features. The agent must combine them, not simply pick one side. Requires understanding code semantics.
  • Expected difficulty: Frontier models (GPT-4, Qwen-72B) score ~0.5–0.7 without specific tuning.

🏆 Reward Function

The reward is shaped: agents get feedback at every step, not just at the end.

| Signal | Value | Trigger |
|---|---|---|
| Improvement bonus | +0.75 × score_delta | Score improves over previous step |
| Marker-free bonus | +0.10 | First time no markers remain |
| Perfect match bonus | +0.25 | Score reaches 1.0 |
| Stagnation penalty | -0.10 | Identical submission as previous step |
| Step cost | -0.01 × (step/max_steps) | Every step |
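To make the table concrete, here is a minimal sketch of how the signals could combine. It assumes the signals are simply summed and that the improvement bonus applies only to positive score changes; the function name and its parameters are hypothetical, and the actual logic in reward.py may differ.

```python
def shaped_reward(
    score: float,
    prev_score: float,
    step: int,
    max_steps: int,
    *,
    first_marker_free: bool = False,
    identical_to_previous: bool = False,
) -> float:
    """Sum the reward signals from the table (assumed to be additive)."""
    reward = 0.0
    delta = score - prev_score
    if delta > 0:
        reward += 0.75 * delta            # improvement bonus
    if first_marker_free:
        reward += 0.10                    # first marker-free submission
    if score >= 1.0:
        reward += 0.25                    # perfect match bonus
    if identical_to_previous:
        reward -= 0.10                    # stagnation penalty
    reward -= 0.01 * (step / max_steps)   # step cost, grows with step count
    return reward


# Perfect answer on the first step: 0.75 + 0.10 + 0.25 - 0.001
solved = shaped_reward(1.0, 0.0, 1, 10, first_marker_free=True)
# Same submission repeated on step 2: -0.10 - 0.002
stuck = shaped_reward(0.5, 0.5, 2, 10, identical_to_previous=True)
print(round(solved, 3), round(stuck, 3))
```

Under these assumptions a first-step perfect solve yields about 1.099, while resubmitting an unchanged answer is strictly negative, which is the incentive the shaping is meant to create.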

Grading Breakdown (per step)

| Component | Score | Criterion |
|---|---|---|
| no_markers | 0.25 | No <<<<<<<, =======, >>>>>>> in output |
| valid_syntax | 0.25 | File parses as valid Python (AST check) |
| similarity | 0.25 | Fuzzy match ratio vs. expected resolution |
| exact_match | 0.25 | Character-exact match with expected output |
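The four components can be sketched with the standard library alone. This is an illustrative approximation, not the code in graders.py: it assumes markers are detected at the start of a line, that the similarity component scales linearly with `difflib.SequenceMatcher.ratio()` (a stand-in for whatever fuzzy matcher the project uses), and that no whitespace normalization is applied.

```python
import ast
import difflib
from typing import Dict

MARKERS = ("<<<<<<<", "=======", ">>>>>>>")


def grade(resolved: str, expected: str) -> Dict[str, float]:
    """Score a submission on the four 0.25-weighted components."""
    has_markers = any(line.startswith(MARKERS) for line in resolved.splitlines())
    try:
        ast.parse(resolved)
        valid = True
    except (SyntaxError, ValueError):
        valid = False
    ratio = difflib.SequenceMatcher(None, resolved, expected).ratio()
    parts = {
        "no_markers": 0.0 if has_markers else 0.25,
        "valid_syntax": 0.25 if valid else 0.0,
        "similarity": 0.25 * ratio,  # assumption: linear in the match ratio
        "exact_match": 0.25 if resolved == expected else 0.0,
    }
    parts["total"] = sum(parts.values())
    return parts


expected = "TIMEOUT = 60\n"
unresolved = "<<<<<<< HEAD\nTIMEOUT = 30\n=======\nTIMEOUT = 60\n>>>>>>> feature\n"
print(grade(expected, expected)["total"])
print(grade("TIMEOUT = 30\n", expected))
print(grade(unresolved, expected))
```

A submission that still contains markers also fails the AST check, so it can earn at most the similarity component under this sketch.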

🚀 Setup & Usage

Local Setup

# 1. Install dependencies
pip install -r server/requirements.txt

# 2. Start the server
uvicorn server.main:app --host 0.0.0.0 --port 7860 --app-dir server

# 3. Test the endpoints
curl -X POST http://localhost:7860/reset \
  -H "Content-Type: application/json" \
  -d '{"task": "single_conflict"}'

curl -X POST http://localhost:7860/step \
  -H "Content-Type: application/json" \
  -d '{"resolved_content": "# your resolved content here"}'

curl http://localhost:7860/state
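The same three calls can be made from Python without extra dependencies. The sketch below mirrors the curl commands using only `urllib`; the helper names (`build_request`, `call`, `run_once`) are hypothetical, and the shape of each response is left unparsed because it depends on the server's models.

```python
import json
import urllib.request
from typing import Any, Optional

ENV_URL = "http://localhost:7860"  # same address as the curl examples


def build_request(path: str, payload: Optional[dict] = None,
                  base_url: str = ENV_URL) -> urllib.request.Request:
    """Create a GET request, or a JSON POST request when a payload is given."""
    url = base_url.rstrip("/") + path
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def call(path: str, payload: Optional[dict] = None,
         base_url: str = ENV_URL) -> Any:
    """Send the request and decode the JSON response body."""
    request = build_request(path, payload, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def run_once(resolved_content: str, task: str = "single_conflict"):
    """Reset, submit one resolution, then read the episode state."""
    observation = call("/reset", {"task": task})
    result = call("/step", {"resolved_content": resolved_content})
    state = call("/state")
    return observation, result, state
```

With the server running locally, `run_once("# your resolved content here")` performs the same sequence as the three curl commands.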

Docker

# Build
docker build -t git-conflict-resolver .

# Run
docker run -p 7860:7860 git-conflict-resolver

# Verify
curl http://localhost:7860/health

Run Baseline Inference

export HF_TOKEN=your_token_here
export API_BASE_URL=https://router.huggingface.co/v1
export MODEL_NAME=Qwen/Qwen2.5-72B-Instruct
export ENV_URL=http://localhost:7860

python inference.py

📊 Baseline Scores

Scores obtained using Qwen/Qwen2.5-72B-Instruct via HF Inference API.

| Task | Score | Steps | Success |
|---|---|---|---|
| single_conflict | 1.00 | 1 | ✅ |
| multi_conflict | 0.75 | 3 | ❌ |
| logic_conflict | 0.50 | 5 | ❌ |
| Average | 0.75 | - | - |

📁 Project Structure

openenv_hackathon/
├── server/
│   ├── main.py          # FastAPI server: /reset /step /state /health
│   ├── env.py           # Core environment logic (reset/step/state/close)
│   ├── models.py        # Pydantic Observation, Action, Reward models
│   ├── tasks.py         # Task definitions (3 tasks with conflict content)
│   ├── graders.py       # Deterministic graders (marker, AST, similarity, exact)
│   ├── reward.py        # Shaped reward function
│   └── requirements.txt
├── inference.py         # Baseline inference script (root, required)
├── openenv.yaml         # OpenEnv metadata (required for openenv validate)
├── Dockerfile           # Container build (port 7860 for HF Spaces)
└── README.md

🔧 API Reference

POST /reset

Start a new episode.

{ "task": "single_conflict" }

Returns: ConflictObservation

POST /step

Submit a conflict resolution.

{ "resolved_content": "# full resolved file..." }

Returns: { observation, reward, done, info }

GET /state

Returns current episode state (step, total_reward, history).

GET /health

Returns { "status": "ok" }. Used for HF Space validation.

GET /tasks

Returns { "tasks": ["single_conflict", "multi_conflict", "logic_conflict"] }


👥 Team

Agent Smith (OpenEnv Hackathon, April 2026)

  • Ganesh Doosa (Team Lead)
  • Gajula Akanksha
  • Yashwanth Kumar