---
title: OpenEnv Data Cleaner
emoji: 🧹
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
base_path: /web
---
# OpenEnv Data Cleaner
An OpenEnv-compliant AI-powered data cleaning environment built on `openenv-core`.
## Features
- **OpenEnv-native**: Built using `openenv-core` base classes
- **Data Cleaning Actions**: Drop nulls, fill nulls, remove duplicates, filter rows, drop columns, convert types, validate emails, remove outliers, normalize values
- **Task-based Learning**: Three difficulty levels (easy, medium, hard)
- **Grading System**: Deterministic scoring based on data quality improvements
- **Reward System**: Structured rewards with quality, progress, and penalty components
- **Web Interface**: Interactive UI for manual data cleaning
- **Docker Ready**: Deployable to Hugging Face Spaces
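To make the cleaning actions and the idea of a deterministic quality score concrete, here is a minimal, standard-library-only sketch. It is not the environment's implementation: the function names (`drop_nulls`, `fill_nulls`, `remove_duplicates`, `completeness`) and the choice of "fraction of non-null cells" as a quality measure are assumptions made for illustration, and the real grader may score quality differently.

```python
from typing import Any

# A dataset is modelled here as a list of row dictionaries (an assumption;
# the actual environment may use a different representation).
Row = dict[str, Any]


def drop_nulls(rows: list[Row]) -> list[Row]:
    """Keep only rows in which no value is None."""
    return [r for r in rows if all(v is not None for v in r.values())]


def fill_nulls(rows: list[Row], column: str, value: Any) -> list[Row]:
    """Replace a missing or None entry in one column with a fixed value."""
    return [{**r, column: value} if r.get(column) is None else dict(r) for r in rows]


def remove_duplicates(rows: list[Row]) -> list[Row]:
    """Drop exact duplicate rows, keeping the first occurrence.

    Values must be hashable (strings, numbers, None) for this sketch to work.
    """
    seen = set()
    unique = []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def completeness(rows: list[Row]) -> float:
    """Hypothetical quality score: the fraction of cells that are not None."""
    cells = [v for r in rows for v in r.values()]
    if not cells:
        return 1.0
    return sum(v is not None for v in cells) / len(cells)


rows = [
    {"name": "Ada", "email": "ada@example.com", "age": 36},
    {"name": "Ada", "email": "ada@example.com", "age": 36},
    {"name": "Bob", "email": None, "age": 41},
]
cleaned = remove_duplicates(drop_nulls(rows))
print(cleaned)
print(completeness(rows), completeness(cleaned))
```

Because every function is pure and depends only on its input, the same sequence of actions always yields the same score, which is the property a deterministic grader relies on.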
## Quick Start
```bash
# Install dependencies
pip install -r requirements.txt
# Run the server
python app.py
```
## API Endpoints
- `GET /` - Web interface
- `GET /health` - Health check
- `POST /reset` - Initialize a new task
- `POST /step` - Execute a cleaning action
- `POST /submit` - Submit solution for grading
- `POST /revert` - Revert last action
- `GET /tasks` - List available tasks
- `GET /state` - Get current environment state
- `GET /dataset` - Get dataset information
- `GET /history` - Get action history
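The endpoints above could be driven from a small client like the sketch below. Only the paths and HTTP methods come from the list; the base URL, the JSON field names (`task_id`, `action_type`), and the assumption that the endpoints return JSON are hypothetical, so check the server code for the actual request and response schema.

```python
import json
import urllib.request
from typing import Any, Optional


def build_request(
    base_url: str,
    method: str,
    path: str,
    payload: Optional[dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build (but do not send) an HTTP request for one endpoint."""
    url = base_url.rstrip("/") + path
    data = None
    headers = {}
    if payload is not None:
        # Assumption: the server accepts JSON request bodies.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call(
    base_url: str,
    method: str,
    path: str,
    payload: Optional[dict[str, Any]] = None,
    timeout: float = 10.0,
) -> Any:
    """Send the request and decode the body, assuming a JSON response."""
    request = build_request(base_url, method, path, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def run_episode(base_url: str) -> Any:
    """One reset/step/submit cycle; payload field names are hypothetical."""
    call(base_url, "GET", "/health")
    call(base_url, "POST", "/reset", {"task_id": "easy_001"})
    call(base_url, "POST", "/step", {"action_type": "drop_nulls"})
    call(base_url, "POST", "/step", {"action_type": "remove_duplicates"})
    return call(base_url, "POST", "/submit")


# Example (requires a running server; replace the placeholder address
# with wherever app.py listens):
# result = run_episode("http://localhost:8000")
```

Separating `build_request` from `call` keeps the request construction testable without a running server.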
## Tasks
| Task ID | Difficulty | Description |
|---------|------------|-------------|
| easy_001 | Easy | Basic cleaning: drop nulls and remove duplicates |
| medium_001 | Medium | Intermediate: handle nulls, validate emails, remove outliers |
| hard_001 | Hard | Advanced: full pipeline with type conversion and normalization |
## Deployment
Deploy to Hugging Face Spaces:
```bash
openenv push ./env
```