metadata
title: OpenEnv Data Cleaner
emoji: 🧹
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
base_path: /web
OpenEnv Data Cleaner
An OpenEnv-compliant AI-powered data cleaning environment built on openenv-core.
Features
- OpenEnv-native: Built using openenv-core base classes
- Data Cleaning Actions: Drop nulls, fill nulls, remove duplicates, filter rows, drop columns, convert types, validate emails, outlier removal, normalization
- Task-based Learning: Three difficulty levels (easy, medium, hard)
- Grading System: Deterministic scoring based on data quality improvements
- Reward System: Structured rewards with quality, progress, and penalty components
- Web Interface: Interactive UI for manual data cleaning
- Docker Ready: Deployable to Hugging Face Spaces
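The grading and reward bullets above can be illustrated with a small sketch. The environment's actual scoring formula is not documented in this README, so everything below is an assumption: the `quality_score` and `reward` functions, the use of null and duplicate fractions as the quality measure, the equal 0.5 weights, and the `step_penalty` value are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def quality_score(rows: List[Row]) -> float:
    """Hypothetical deterministic score in [0, 1]; 1.0 means no nulls and no duplicate rows."""
    if not rows:
        return 0.0
    cells = [value for row in rows for value in row.values()]
    null_fraction = sum(value is None for value in cells) / len(cells) if cells else 0.0
    # Rows are compared by their (column, value) pairs, sorted by column name.
    unique_rows = {tuple(sorted(row.items(), key=lambda kv: kv[0])) for row in rows}
    duplicate_fraction = 1 - len(unique_rows) / len(rows)
    # Assumed weighting: nulls and duplicates count equally.
    return 1.0 - 0.5 * null_fraction - 0.5 * duplicate_fraction


def reward(before: List[Row], after: List[Row], step_penalty: float = 0.01) -> Dict[str, float]:
    """Combine quality, progress, and penalty components into one structured reward."""
    quality = quality_score(after)
    progress = quality - quality_score(before)  # improvement made by this action
    total = quality + progress - step_penalty   # assumed: a flat cost per action
    return {"quality": quality, "progress": progress, "penalty": step_penalty, "total": total}
```

Because the score depends only on the rows passed in, the same dataset always yields the same value, which is the property the "deterministic scoring" bullet describes.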
Quick Start
# Install dependencies
pip install -r requirements.txt
# Run the server
python app.py
API Endpoints
- GET / - Web interface
- GET /health - Health check
- POST /reset - Initialize a new task
- POST /step - Execute a cleaning action
- POST /submit - Submit solution for grading
- POST /revert - Revert last action
- GET /tasks - List available tasks
- GET /state - Get current environment state
- GET /dataset - Get dataset information
- GET /history - Get action history
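A typical episode calls these endpoints in order: reset, one or more steps, then submit. The following is a minimal client sketch using only the Python standard library. The endpoint paths come from the list above; the base URL (port 7860 is the usual Hugging Face Spaces default, but `app.py` may use another) and every request body field (`task_id`, `action`) are assumptions, since this README does not document the request schema.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "http://localhost:7860"  # assumption: adjust to wherever app.py is listening


def build_request(method: str, path: str,
                  payload: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Build an HTTP request; a JSON body is attached only when a payload is given."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def call(method: str, path: str, payload: Optional[Dict[str, Any]] = None) -> Any:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(build_request(method, path, payload), timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def run_easy_episode() -> Any:
    """Reset, apply two actions, and submit. All body fields here are hypothetical."""
    call("POST", "/reset", {"task_id": "easy_001"})
    call("POST", "/step", {"action": "drop_nulls"})
    call("POST", "/step", {"action": "remove_duplicates"})
    return call("POST", "/submit", {})
```

With the server running, `run_easy_episode()` returns whatever JSON the `/submit` endpoint produces. Check the server's request models for the real field names before relying on the bodies shown here.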
Tasks
| Task ID | Difficulty | Description |
|---|---|---|
| easy_001 | Easy | Basic cleaning: drop nulls and remove duplicates |
| medium_001 | Medium | Intermediate: handle nulls, validate emails, remove outliers |
| hard_001 | Hard | Advanced: full pipeline with type conversion and normalization |
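To make the easy task concrete, here is a plain-Python sketch of its two operations, dropping rows with nulls and removing duplicates. It is not the environment's implementation: the helper functions and the sample rows are invented for illustration, and the real task dataset is not shown in this README.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def drop_nulls(rows: List[Row]) -> List[Row]:
    """Keep only rows in which every value is present (not None)."""
    return [row for row in rows if all(value is not None for value in row.values())]


def remove_duplicates(rows: List[Row]) -> List[Row]:
    """Keep the first occurrence of each row, preserving the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items(), key=lambda kv: kv[0]))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Invented sample data: one duplicate row and one row with a missing name.
rows = [
    {"id": 1, "name": "Ann"},
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": None},
    {"id": 3, "name": "Bo"},
]
cleaned = remove_duplicates(drop_nulls(rows))
print(cleaned)  # the rows with id 1 and id 3 remain
```

The medium and hard tasks add further actions (email validation, outlier removal, type conversion, normalization) on top of these two.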
Deployment
Deploy to Hugging Face Spaces:
openenv push ./env