mlops-openenv / ARCHITECTURE.md

trretretret

Deploy ML pipeline debugging environment to HF Spaces

7e782aa about 2 months ago

Backend Architecture

Project Structure

MLops-Openenvhack/
├── app.py                 # FastAPI server - main entry point
├── inference.py           # Baseline LLM agent for evaluation
├── models.py              # Pydantic models (Action, Observation, State)
├── mlops_environment.py   # Core environment logic
├── artifact_generator.py  # Procedural bug/artifact generation
├── client.py              # Python client library
├── openenv.yaml           # OpenEnv specification
├── Dockerfile             # Container configuration
├── requirements.txt       # Python dependencies
└── README.md              # Documentation

How It Works

1. Server (app.py)

Runs FastAPI on port 7860
Provides REST endpoints:
- GET /health - Health check
- POST /reset - Initialize new task
- POST /step - Execute action
- GET /state - Get current state
- GET /tasks - List available tasks
- GET /openenv/state - OpenEnv state
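A minimal sketch of how a client might address these endpoints using only the standard library. The base URL assumes the server runs locally on the documented port, and the `/step` payload fields (`action_type`, `name`) are hypothetical; only `task_id` is named in this document. The bundled `client.py` may look quite different.

```python
import json
import urllib.request
from typing import Optional

# Assumption: the server is reachable locally on the documented port.
BASE_URL = "http://localhost:7860"


def build_request(method: str, path: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request for one of the documented endpoints."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


def call(req: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Requests for a typical session. Step payload field names are assumptions.
health_req = build_request("GET", "/health")
reset_req = build_request("POST", "/reset", {"task_id": "easy"})
step_req = build_request("POST", "/step",
                         {"action_type": "read_artifact", "name": "train.log"})
```

Passing any of these to `call(...)` sends it; building and sending are kept separate so the requests can be inspected without a running server.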

2. Environment (mlops_environment.py)

Manages task state
Processes actions through _handle_* methods
Generates rewards based on agent behavior
Tracks which artifacts have been read and which sanity checks have been run
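The `_handle_*` convention can be sketched as a name-based dispatch. This is an illustration, not the code in `mlops_environment.py`: the class name, the action names (`read_artifact`, `run_sanity_check`), the payload fields, and every reward value are assumptions.

```python
class MiniEnvironment:
    """Toy stand-in for the environment; not the real implementation."""

    def __init__(self, artifacts: dict):
        self.artifacts = artifacts          # artifact name -> content
        self.artifacts_read = set()         # names the agent has opened
        self.sanity_checks_run = []         # checks the agent has requested

    def step(self, action: dict):
        """Route an action to its _handle_* method; return (observation, reward)."""
        handler = getattr(self, f"_handle_{action.get('action_type')}", None)
        if handler is None:
            return "unknown action", -0.1   # hypothetical penalty
        return handler(action)

    def _handle_read_artifact(self, action: dict):
        name = action.get("name")
        if name not in self.artifacts:
            return f"no such artifact: {name}", -0.1
        # Assumed policy: only the first read of an artifact earns a reward.
        reward = 0.0 if name in self.artifacts_read else 0.1
        self.artifacts_read.add(name)
        return self.artifacts[name], reward

    def _handle_run_sanity_check(self, action: dict):
        check = action.get("check", "unnamed")
        self.sanity_checks_run.append(check)
        return f"ran sanity check: {check}", 0.05
```

Adding a new action type then only requires a new `_handle_<name>` method, with no change to `step`.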

3. Artifact Generator (artifact_generator.py)

Procedurally generates training artifacts with planted bugs
Creates realistic logs, configs, preprocessing code, and eval results
Supports 9 bug types across 3 difficulty levels
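As a rough sketch of what procedural generation with a planted bug could look like: a seeded random generator picks a bug for the requested difficulty, then the config and training log are written so that they show its symptoms. The bug names below are placeholders, not the project's nine bug types, and the function name and artifact layout are assumptions.

```python
import random

# Hypothetical catalogue: placeholder names, one per difficulty level.
BUGS_BY_DIFFICULTY = {
    "easy": ["learning_rate_too_high"],
    "medium": ["label_leakage"],
    "hard": ["train_eval_preprocessing_mismatch"],
}


def generate_artifacts(task_id: str, seed: int) -> dict:
    """Return a config and training log consistent with one planted bug."""
    rng = random.Random(seed)  # local RNG: the same seed reproduces the episode
    bug = rng.choice(BUGS_BY_DIFFICULTY[task_id])
    diverging = bug == "learning_rate_too_high"

    config = {
        "learning_rate": 10.0 if diverging else 0.001,
        "batch_size": rng.choice([16, 32, 64]),
    }

    # The loss blows up when the learning-rate bug is planted, else it decays.
    loss = 2.0
    log_lines = []
    for epoch in range(3):
        loss = loss * 3.0 if diverging else loss * 0.7
        log_lines.append(f"epoch={epoch} loss={loss:.3f}")

    return {"planted_bug": bug, "config": config,
            "train_log": "\n".join(log_lines)}
```

Keeping the planted bug in the returned record gives a grader something to score a diagnosis against, while the agent would only be shown the config and log.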

4. Inference Agent (inference.py)

LLM-powered agent that uses the OpenAI API
Reads artifacts and runs sanity checks
Submits a diagnosis with a confidence score
Implements rate limiting and a fallback
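One possible reading of "rate limiting and fallback" is retry with exponential backoff, followed by a default action when the model cannot be reached. The sketch below assumes that reading. `RateLimitError`, `FALLBACK_ACTION`, and `choose_action` are hypothetical names, and `ask_llm` stands in for whatever function actually calls the model.

```python
import time
from typing import Callable


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit error."""


# Assumed default: fall back to a cheap, safe investigation step.
FALLBACK_ACTION = {"action_type": "read_artifact", "name": "train.log"}


def choose_action(ask_llm: Callable[[str], dict], prompt: str,
                  max_retries: int = 3, base_delay: float = 1.0,
                  sleep: Callable[[float], None] = time.sleep) -> dict:
    """Ask the model for an action, backing off on rate limits."""
    for attempt in range(max_retries):
        try:
            return ask_llm(prompt)
        except RateLimitError:
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
        except ValueError:
            # Unparseable reply: retrying the same prompt is unlikely to help.
            break
    return FALLBACK_ACTION.copy()
```

The `sleep` parameter is injectable so the backoff schedule can be checked without actually waiting.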

API Flow

Client -> app.py (FastAPI)
           |
           +-> mlops_environment.py (core logic)
                    |
                    +-> artifact_generator.py (bug generation)
                    |
                    +-> models.py (data validation)
                    |
                    +-> Returns Observation, Reward, Done, Info
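The four values returned at the end of this flow could be bundled as below. The project's `models.py` uses Pydantic; this sketch uses a standard-library dataclass as a stand-in so it runs without dependencies, and the class name and field types are assumptions.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class StepResult:
    """Hypothetical container for the values returned by one step."""
    observation: dict
    reward: float
    done: bool
    info: dict = field(default_factory=dict)

    def __post_init__(self):
        # Light validation, standing in for what Pydantic would do.
        if not isinstance(self.done, bool):
            raise TypeError("done must be a bool")
        self.reward = float(self.reward)


result = StepResult(observation={"text": "epoch=0 loss=6.000"}, reward=1, done=False)
payload = asdict(result)  # plain dict, ready to serialise as a JSON response
```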

Task Flow

1. Client sends POST /reset with a task_id (easy, medium, or hard)
2. Environment generates artifacts containing a planted bug
3. Client sends POST /step with an action
4. Environment processes the action and returns an observation
5. Agent keeps investigating until it submits a diagnosis
6. Grader scores the diagnosis against the planted bug (0.0 - 1.0)
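The final grading step might be sketched as follows. The real scoring rules are not described here, so the partial-credit scheme, the weights, and the field names (`bug_type`, `artifact`) are all illustrative assumptions; only the 0.0 to 1.0 range comes from the steps above.

```python
def grade_diagnosis(diagnosis: dict, planted: dict) -> float:
    """Score a diagnosis in [0.0, 1.0] against the planted bug."""
    score = 0.0
    # Assumed weighting: naming the right bug matters most.
    if diagnosis.get("bug_type") == planted["bug_type"]:
        score += 0.75
    # Assumed partial credit for pointing at the right artifact.
    if diagnosis.get("artifact") == planted["artifact"]:
        score += 0.25
    return min(max(score, 0.0), 1.0)  # clamp to the documented range


planted = {"bug_type": "learning_rate_too_high", "artifact": "config"}
full = grade_diagnosis(
    {"bug_type": "learning_rate_too_high", "artifact": "config"}, planted)
partial = grade_diagnosis(
    {"bug_type": "label_leakage", "artifact": "config"}, planted)
```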
