| # Backend Architecture | |
| ## Project Structure | |
| ``` | |
| MLops-Openenvhack/ | |
| ├── app.py # FastAPI server - main entry point | |
| ├── inference.py # Baseline LLM agent for evaluation | |
| ├── models.py # Pydantic models (Action, Observation, State) | |
| ├── mlops_environment.py # Core environment logic | |
| ├── artifact_generator.py # Procedural bug/artifact generation | |
| ├── client.py # Python client library | |
| ├── openenv.yaml # OpenEnv specification | |
| ├── Dockerfile # Container configuration | |
| ├── requirements.txt # Python dependencies | |
| └── README.md # Documentation | |
| ``` | |
| ## How It Works | |
| ### 1. Server (app.py) | |
| - Runs FastAPI on port 7860 | |
| - Provides REST endpoints: | |
|   - `GET /health` - Health check | |
|   - `POST /reset` - Initialize new task | |
|   - `POST /step` - Execute action | |
|   - `GET /state` - Get current state | |
|   - `GET /tasks` - List available tasks | |
|   - `GET /openenv/state` - OpenEnv state | |
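
As a rough illustration of how a client might address these endpoints, the sketch below builds requests with the standard library but does not send them. Only the methods, paths, and port come from the list above. The host `localhost`, the helper name `build_request`, the short endpoint keys, and the JSON body shape are assumptions.

```python
import json
import urllib.request

# Methods and paths as listed above; the dictionary keys are illustrative names.
ENDPOINTS = {
    "health": ("GET", "/health"),
    "reset": ("POST", "/reset"),
    "step": ("POST", "/step"),
    "state": ("GET", "/state"),
    "tasks": ("GET", "/tasks"),
    "openenv_state": ("GET", "/openenv/state"),
}

BASE_URL = "http://localhost:7860"  # assumed host; port 7860 is from the docs


def build_request(name, payload=None, base_url=BASE_URL):
    """Build a urllib Request for one endpoint without sending it."""
    method, path = ENDPOINTS[name]
    data = None
    headers = {}
    if method == "POST":
        # Assumption: POST bodies are JSON objects.
        data = json.dumps(payload or {}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url + path, data=data, headers=headers, method=method
    )


if __name__ == "__main__":
    req = build_request("reset", {"task_id": "easy"})
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it to a running server.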
| ### 2. Environment (mlops_environment.py) | |
| - Manages task state | |
| - Processes actions through `_handle_*` methods | |
| - Generates rewards based on agent behavior | |
| - Tracks which artifacts have been read and which sanity checks have been run | |
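
One plausible reading of the `_handle_*` convention is name-based dispatch: the environment looks up a method named after the action type. The sketch below shows that pattern on a toy class. `MiniEnvironment`, its two handlers, the returned dictionaries, and the `check_name` parameter are hypothetical and are not taken from `mlops_environment.py`.

```python
class MiniEnvironment:
    """Toy environment showing name-based dispatch to _handle_* methods."""

    def __init__(self):
        # Bookkeeping for what the agent has already looked at.
        self.artifacts_read = set()
        self.sanity_checks_run = set()

    def step(self, action_type, **params):
        # Look up a method called _handle_<action_type>; unknown actions raise.
        handler = getattr(self, f"_handle_{action_type}", None)
        if handler is None:
            raise ValueError(f"unknown action type: {action_type}")
        return handler(**params)

    def _handle_read_config(self):
        is_new = "config" not in self.artifacts_read
        self.artifacts_read.add("config")
        return {"artifact": "config", "new": is_new}

    def _handle_run_sanity_check(self, check_name):
        is_new = check_name not in self.sanity_checks_run
        self.sanity_checks_run.add(check_name)
        return {"check": check_name, "new": is_new}
```

With this layout, supporting a new action type only requires adding another `_handle_<name>` method.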
| ### 3. Artifact Generator (artifact_generator.py) | |
| - Procedurally generates training artifacts with planted bugs | |
| - Creates realistic logs, configs, preprocessing code, and eval results | |
| - Supports 9 bug types across 3 difficulty levels | |
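
To make "procedurally generates artifacts with planted bugs" concrete, here is a minimal sketch of seeded generation for a single config artifact. This document does not list the real bug types, so the catalogue `PLANTED_BUGS`, the defaults in `BASE_CONFIG`, and the function `generate_config` are invented for illustration.

```python
import random

# Invented defaults for a training config.
BASE_CONFIG = {"learning_rate": 0.001, "batch_size": 32, "epochs": 10}

# Hypothetical bugs as (field, bad_value) pairs keyed by difficulty.
PLANTED_BUGS = {
    "easy": [("learning_rate", 10.0), ("epochs", 0)],
    "medium": [("batch_size", 1)],
    "hard": [("learning_rate", 0.0)],
}


def generate_config(difficulty, seed):
    """Return (config, planted_bug); the bug is chosen deterministically from the seed."""
    rng = random.Random(seed)
    field, bad_value = rng.choice(PLANTED_BUGS[difficulty])
    config = dict(BASE_CONFIG)
    config[field] = bad_value  # plant the bug
    planted_bug = {"file": "config", "field": field}
    return config, planted_bug
```

Seeding a local `random.Random` keeps each task reproducible, so a grader can later compare a diagnosis against the same planted bug.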
| ### 4. Inference Agent (inference.py) | |
| - LLM-powered agent using the OpenAI API | |
| - Reads artifacts and runs sanity checks | |
| - Submits a diagnosis with confidence scoring | |
| - Implements rate limiting and fallback | |
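
The document does not say how `inference.py` implements rate limiting and fallback. One common approach, sketched below under that caveat, enforces a minimum interval between calls and returns a fallback result when the primary call raises. `RateLimitedCaller` and its parameters are hypothetical names.

```python
import time


class RateLimitedCaller:
    """Enforce a minimum gap between calls and fall back if the call raises."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so tests can use a fake clock
        self._sleep = sleep
        self._last_call = None

    def call(self, primary, fallback):
        # Sleep just long enough to keep min_interval seconds between calls.
        if self._last_call is not None:
            wait = self.min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        try:
            return primary()
        except Exception:
            # Any failure of the primary call (e.g. an API error) uses the fallback.
            return fallback()
```

In an agent, `primary` might wrap the LLM request and `fallback` might return a fixed default action, so an episode can continue when the API is unavailable.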
| ## API Flow | |
| ``` | |
| Client -> app.py (FastAPI) | |
| | | |
| +-> mlops_environment.py (core logic) | |
| | | |
| +-> artifact_generator.py (bug generation) | |
| | | |
| +-> models.py (data validation) | |
| | | |
| +-> Returns Observation, Reward, Done, Info | |
| ``` | |
| ## Task Flow | |
| ``` | |
| 1. Client sends POST /reset with a task_id (easy/medium/hard) | |
| 2. Environment generates artifacts with a planted bug | |
| 3. Client sends POST /step with an action | |
| 4. Environment processes the action and returns an observation | |
| 5. Agent keeps investigating until it submits a diagnosis | |
| 6. Grader scores the diagnosis against the planted bug (0.0 - 1.0) | |
| ``` | |
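
The steps above can be sketched as a small driver loop. The transport is passed in as a callable so the loop can run against a stub. The function `run_episode`, the `post(path, payload)` signature, the response keys `reward` and `done`, and the `max_steps` budget are assumptions rather than the actual client API.

```python
def run_episode(post, task_id, choose_action, max_steps=20):
    """Drive one episode: reset, then step until done or the step budget runs out.

    post(path, payload) -> dict is any callable that talks to the server.
    choose_action(last_result) -> dict picks the next action payload.
    """
    result = post("/reset", {"task_id": task_id})
    total_reward = 0.0
    for _ in range(max_steps):
        action = choose_action(result)
        result = post("/step", action)
        # Assumed response shape: {"observation": ..., "reward": float, "done": bool}
        total_reward += result.get("reward", 0.0)
        if result.get("done"):
            break
    return total_reward, result
```

Keeping the loop independent of the HTTP layer makes it easy to test the agent's control flow without starting the server.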
| ## Data Models | |
| ### Action Types | |
| - read_config, read_logs, check_dataset_stats | |
| - inspect_preprocessing, read_eval_results | |
| - run_sanity_check, query_artifact | |
| - submit_diagnosis | |
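
For illustration, the eight action types above could be captured in an enumeration so that invalid names are rejected early. `models.py` uses Pydantic according to the project structure. This sketch uses only the standard library, and the class name `ActionType` and the helper `is_terminal` are assumptions.

```python
from enum import Enum


class ActionType(str, Enum):
    """The action names listed above, as string-valued enum members."""

    READ_CONFIG = "read_config"
    READ_LOGS = "read_logs"
    CHECK_DATASET_STATS = "check_dataset_stats"
    INSPECT_PREPROCESSING = "inspect_preprocessing"
    READ_EVAL_RESULTS = "read_eval_results"
    RUN_SANITY_CHECK = "run_sanity_check"
    QUERY_ARTIFACT = "query_artifact"
    SUBMIT_DIAGNOSIS = "submit_diagnosis"


def is_terminal(action_type):
    """Assumption: submitting a diagnosis is the only action that ends an episode."""
    return ActionType(action_type) is ActionType.SUBMIT_DIAGNOSIS
```

Calling `ActionType("not_an_action")` raises `ValueError`, which gives a simple validation step before an action reaches the environment.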
| ### Reward Structure | |
| - +0.02 per new artifact read | |
| - -0.02 per duplicate read | |
| - +0.01 per new sanity check | |
| - Terminal (on diagnosis submission): +0.15 category, +0.25 file, +0.30 field, +0.30 fix (1.00 maximum) | |
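
The arithmetic behind these rewards can be sketched as follows. The numbers come from the list above. The names `STEP_REWARDS`, `TERMINAL_WEIGHTS`, `read_reward`, and `terminal_score` are hypothetical, and the sketch assumes each diagnosis component is scored all-or-nothing, which the document does not state.

```python
STEP_REWARDS = {"new_artifact": 0.02, "duplicate_read": -0.02, "new_sanity_check": 0.01}
TERMINAL_WEIGHTS = {"category": 0.15, "file": 0.25, "field": 0.30, "fix": 0.30}


def read_reward(artifact, already_read):
    """Reward for reading an artifact; mutates already_read to record new reads."""
    if artifact in already_read:
        return STEP_REWARDS["duplicate_read"]
    already_read.add(artifact)
    return STEP_REWARDS["new_artifact"]


def terminal_score(matches):
    """Sum the weights of the diagnosis components marked True in matches.

    Assumption: each component is either fully right or fully wrong.
    Rounded to two decimals to hide floating-point noise.
    """
    total = sum(w for key, w in TERMINAL_WEIGHTS.items() if matches.get(key))
    return round(total, 2)
```

Under these assumptions, a diagnosis that identifies only the category and file earns 0.15 + 0.25 = 0.40, while one that gets all four components right earns the full 1.00.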