Spaces:

sairaj2
/

AutoClean-Ai

Sleeping

App Files Files Community

AutoClean-Ai / README.md

sairaj2

Upload folder using huggingface_hub

e23a276 verified 4 days ago

preview code

raw

history blame contribute delete

26.3 kB

	---
	title: AutoClean-Ai
	emoji: 🧹
	colorFrom: green
	colorTo: blue
	sdk: docker
	app_port: 7860
	pinned: true
	tags:
	- openenv
	- reinforcement-learning
	- data-cleaning
	- data-preprocessing
	- llm-training
	- benchmark
	- ai-safety
	- data-quality
	- mlops
	base_path: /web
	---

	# 🧹 AutoClean-Ai

	> Production-grade OpenEnv RL environment for training AI models to clean tabular data automatically.

	Server Version: v1.0.0

	[![OpenEnv](https://img.shields.io/badge/OpenEnv-Compatible-blue)](https://github.com/meta-pytorch/OpenEnv)
	[![Python](https://img.shields.io/badge/Python-3.10%20%7C%203.11%20%7C%203.12-blue)](#-quick-start)
	[![License](https://img.shields.io/badge/License-MIT-green)](LICENSE)
	[![Dataset](https://img.shields.io/badge/Dataset-Realistic%20Generated-orange)](#-datasets)

	---

	## 💡 The Problem

	80% of data scientist time is spent cleaning data. Bad data causes 60% of ML project failures. AutoClean-Ai was built to train AI agents that can automatically detect and fix common data quality issues in tabular datasets.

	## 🚀 Quick Start

	### Run Locally

	```bash
	git clone https://github.com/SairajMN/WorkflowOps.git
	cd WorkflowOps
	pip install -e .
	uvicorn server.app:app --host 0.0.0.0 --port 7860
	curl http://localhost:7860/health
	```

	### Raw HTTP

	```python
	import requests

	BASE = "http://localhost:7860"

	# 1. Start episode
	obs = requests.post(f"{BASE}/reset", json={"difficulty": "beginner"}).json()
	print(obs["dataset_preview"], obs["column_info"])

	# 2. Submit cleaning action
	result = requests.post(f"{BASE}/step", json={
	"action_type": "fix_missing_values",
	"column_index": 2,
	"confidence": 0.92,
	"reasoning": "Mean imputation for numerical column",
	"session_id": obs.get("session_id"),
	}).json()
	print(f"Reward: {result['reward']}, Cleaned: {result['rows_cleaned']}")

	# 3. Score the episode
	grade = requests.post(f"{BASE}/grader", json={
	"task_id": "task_1_basic_cleaning",
	"step_rewards": [result['reward']],
	"step_infos": [result],
	}).json()
	print(f"Episode score: {grade['score']}")
	```

	### Validate OpenEnv Compliance

	```bash
	# Local structure check
	openenv validate

	# Runtime check against live server
	openenv validate --url http://localhost:7860 --verbose
	```

	---
	```bash
	python3 inference.py

	2026-04-12 22:19:47,173 [INFO] Connecting to environment: https://sairaj2-openenv-datacleaner.hf.space
	2026-04-12 22:19:49,338 [INFO] Environment: AutoClean-AI v1.0.0 — healthy
	2026-04-12 22:19:49,711 [INFO] Available tasks: ['easy_001', 'medium_001', 'hard_001', 'employee_demo']
	2026-04-12 22:19:49,711 [INFO] Using LLM agent: qwen/qwen3-next-80b-a3b-instruct:free via https://openrouter.ai/api/v1
	2026-04-12 22:19:50,044 [INFO]
	=======================================================
	2026-04-12 22:19:50,044 [INFO] TASK: easy_001 (difficulty=beginner)
	2026-04-12 22:19:50,044 [INFO] =======================================================
	[START] task=easy_001 env=openenv-datacleaner model=qwen/qwen3-next-80b-a3b-instruct:free
	2026-04-12 22:19:52,471 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:19:52,472 [INFO] Retrying request to /chat/completions in 0.464138 seconds
	2026-04-12 22:19:53,580 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:19:53,580 [INFO] Retrying request to /chat/completions in 0.815704 seconds
	2026-04-12 22:19:55,038 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:19:55,041 [WARNING] LLM call failed: Error code: 429 - {'error': {'message': 'Provider returned error', 'code': 429, 'metadata': {'raw': 'qwen/qwen3-next-80b-a3b-instruct:free is temporarily rate-limited upstream. Please retry shortly, or add your own key to accumulate your rate limits: https://openrouter.ai/settings/integrations', 'provider_name': 'Venice', 'is_byok': False}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=1 action=remove_duplicates reward=0.50 done=false error=null
	2026-04-12 22:19:55,383 [INFO] [easy_001] ep=1 step=1 reward=0.500
	2026-04-12 22:19:55,965 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:19:55,967 [INFO] Retrying request to /chat/completions in 0.458206 seconds
	2026-04-12 22:19:57,083 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 402 Payment Required"
	2026-04-12 22:19:57,084 [WARNING] LLM call failed: Error code: 402 - {'error': {'message': 'Provider returned error', 'code': 402, 'metadata': {'raw': '{"error":"API key USD spend limit exceeded. Your account may still have USD balance, but this API key has reached its configured USD spending limit."}', 'provider_name': 'Venice', 'is_byok': False}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=2 action=submit reward=1.00 done=true error=null
	2026-04-12 22:19:57,485 [INFO] [easy_001] ep=1 step=2 reward=1.000
	2026-04-12 22:19:58,443 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:19:58,445 [INFO] Retrying request to /chat/completions in 0.475367 seconds
	2026-04-12 22:19:59,627 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:19:59,628 [INFO] Retrying request to /chat/completions in 0.844512 seconds
	2026-04-12 22:20:01,065 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:01,067 [WARNING] LLM call failed: Error code: 429 - {'error': {'message': 'Provider returned error', 'code': 429, 'metadata': {'raw': 'qwen/qwen3-next-80b-a3b-instruct:free is temporarily rate-limited upstream. Please retry shortly, or add your own key to accumulate your rate limits: https://openrouter.ai/settings/integrations', 'provider_name': 'Venice', 'is_byok': False}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=1 action=remove_duplicates reward=0.50 done=false error=null
	2026-04-12 22:20:01,372 [INFO] [easy_001] ep=2 step=1 reward=0.500
	2026-04-12 22:20:01,969 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:01,971 [INFO] Retrying request to /chat/completions in 0.387579 seconds
	2026-04-12 22:20:03,191 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:03,193 [INFO] Retrying request to /chat/completions in 0.930048 seconds
	2026-04-12 22:20:04,715 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:04,717 [WARNING] LLM call failed: Error code: 429 - {'error': {'message': 'Provider returned error', 'code': 429, 'metadata': {'raw': 'qwen/qwen3-next-80b-a3b-instruct:free is temporarily rate-limited upstream. Please retry shortly, or add your own key to accumulate your rate limits: https://openrouter.ai/settings/integrations', 'provider_name': 'Venice', 'is_byok': False}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=2 action=submit reward=1.00 done=true error=null
	2026-04-12 22:20:05,054 [INFO] [easy_001] ep=2 step=2 reward=1.000
	2026-04-12 22:20:06,558 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:06,560 [INFO] Retrying request to /chat/completions in 0.377761 seconds
	2026-04-12 22:20:08,138 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:08,139 [INFO] Retrying request to /chat/completions in 0.790773 seconds
	2026-04-12 22:20:09,531 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:09,533 [WARNING] LLM call failed: Error code: 429 - {'error': {'message': 'Provider returned error', 'code': 429, 'metadata': {'raw': 'qwen/qwen3-next-80b-a3b-instruct:free is temporarily rate-limited upstream. Please retry shortly, or add your own key to accumulate your rate limits: https://openrouter.ai/settings/integrations', 'provider_name': 'Venice', 'is_byok': False}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=1 action=remove_duplicates reward=0.50 done=false error=null
	2026-04-12 22:20:09,877 [INFO] [easy_001] ep=3 step=1 reward=0.500
	2026-04-12 22:20:10,478 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:10,480 [INFO] Retrying request to /chat/completions in 0.432287 seconds
	2026-04-12 22:20:11,245 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:11,247 [INFO] Retrying request to /chat/completions in 0.841678 seconds
	2026-04-12 22:20:12,445 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:12,447 [WARNING] LLM call failed: Error code: 429 - {'error': {'message': 'Rate limit exceeded: limit_rpm/qwen/qwen3-next-80b-a3b-instruct-2509/94248808-ba97-4e3c-be60-1cb0928db51d. High demand for qwen/qwen3-next-80b-a3b-instruct:free on OpenRouter - limited to 8 requests per minute. Please retry shortly.', 'code': 429, 'metadata': {'headers': {'X-RateLimit-Limit': '8', 'X-RateLimit-Remaining': '0', 'X-RateLimit-Reset': '1776012660000'}, 'provider_name': None}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=2 action=submit reward=1.00 done=true error=null
	2026-04-12 22:20:12,771 [INFO] [easy_001] ep=3 step=2 reward=1.000
	[END] success=true steps=6 score=0.750 rewards=0.50,1.00,0.50,1.00,0.50,1.00
	2026-04-12 22:20:12,771 [INFO]
	Task score: 0.7500 ± 0.0000
	2026-04-12 22:20:12,771 [INFO]
	=======================================================
	2026-04-12 22:20:12,771 [INFO] TASK: medium_001 (difficulty=intermediate)
	2026-04-12 22:20:12,771 [INFO] =======================================================
	[START] task=medium_001 env=openenv-datacleaner model=qwen/qwen3-next-80b-a3b-instruct:free
	2026-04-12 22:20:13,504 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:13,504 [INFO] Retrying request to /chat/completions in 0.469513 seconds
	2026-04-12 22:20:14,323 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:14,323 [INFO] Retrying request to /chat/completions in 0.933486 seconds
	2026-04-12 22:20:16,371 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:16,371 [WARNING] LLM call failed: Error code: 429 - {'error': {'message': 'Rate limit exceeded: limit_rpm/qwen/qwen3-next-80b-a3b-instruct-2509/94248808-ba97-4e3c-be60-1cb0928db51d. High demand for qwen/qwen3-next-80b-a3b-instruct:free on OpenRouter - limited to 8 requests per minute. Please retry shortly.', 'code': 429, 'metadata': {'headers': {'X-RateLimit-Limit': '8', 'X-RateLimit-Remaining': '0', 'X-RateLimit-Reset': '1776012660000'}, 'provider_name': None}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=1 action=submit reward=0.50 done=true error=null
	2026-04-12 22:20:16,811 [INFO] [medium_001] ep=1 step=1 reward=0.500
	2026-04-12 22:20:17,561 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:17,562 [INFO] Retrying request to /chat/completions in 0.445498 seconds
	2026-04-12 22:20:18,419 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:18,421 [INFO] Retrying request to /chat/completions in 0.807103 seconds
	2026-04-12 22:20:19,640 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:19,641 [WARNING] LLM call failed: Error code: 429 - {'error': {'message': 'Rate limit exceeded: limit_rpm/qwen/qwen3-next-80b-a3b-instruct-2509/94248808-ba97-4e3c-be60-1cb0928db51d. High demand for qwen/qwen3-next-80b-a3b-instruct:free on OpenRouter - limited to 8 requests per minute. Please retry shortly.', 'code': 429, 'metadata': {'headers': {'X-RateLimit-Limit': '8', 'X-RateLimit-Remaining': '0', 'X-RateLimit-Reset': '1776012660000'}, 'provider_name': None}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=1 action=submit reward=0.50 done=true error=null
	2026-04-12 22:20:19,980 [INFO] [medium_001] ep=2 step=1 reward=0.500
	2026-04-12 22:20:20,626 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:20,627 [INFO] Retrying request to /chat/completions in 0.397460 seconds
	2026-04-12 22:20:21,491 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:21,493 [INFO] Retrying request to /chat/completions in 0.964606 seconds
	2026-04-12 22:20:22,821 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:22,823 [WARNING] LLM call failed: Error code: 429 - {'error': {'message': 'Rate limit exceeded: limit_rpm/qwen/qwen3-next-80b-a3b-instruct-2509/94248808-ba97-4e3c-be60-1cb0928db51d. High demand for qwen/qwen3-next-80b-a3b-instruct:free on OpenRouter - limited to 8 requests per minute. Please retry shortly.', 'code': 429, 'metadata': {'headers': {'X-RateLimit-Limit': '8', 'X-RateLimit-Remaining': '0', 'X-RateLimit-Reset': '1776012660000'}, 'provider_name': None}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=1 action=submit reward=0.50 done=true error=null
	2026-04-12 22:20:23,198 [INFO] [medium_001] ep=3 step=1 reward=0.500
	[END] success=true steps=3 score=0.500 rewards=0.50,0.50,0.50
	2026-04-12 22:20:23,199 [INFO]
	Task score: 0.5000 ± 0.0000
	2026-04-12 22:20:23,199 [INFO]
	=======================================================
	2026-04-12 22:20:23,199 [INFO] TASK: hard_001 (difficulty=advanced)
	2026-04-12 22:20:23,199 [INFO] =======================================================
	[START] task=hard_001 env=openenv-datacleaner model=qwen/qwen3-next-80b-a3b-instruct:free
	2026-04-12 22:20:24,051 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:24,052 [INFO] Retrying request to /chat/completions in 0.472201 seconds
	2026-04-12 22:20:25,173 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:25,174 [INFO] Retrying request to /chat/completions in 0.768212 seconds
	2026-04-12 22:20:26,285 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:26,286 [WARNING] LLM call failed: Error code: 429 - {'error': {'message': 'Rate limit exceeded: limit_rpm/qwen/qwen3-next-80b-a3b-instruct-2509/94248808-ba97-4e3c-be60-1cb0928db51d. High demand for qwen/qwen3-next-80b-a3b-instruct:free on OpenRouter - limited to 8 requests per minute. Please retry shortly.', 'code': 429, 'metadata': {'headers': {'X-RateLimit-Limit': '8', 'X-RateLimit-Remaining': '0', 'X-RateLimit-Reset': '1776012660000'}, 'provider_name': None}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=1 action=remove_duplicates reward=0.50 done=false error=null
	2026-04-12 22:20:26,614 [INFO] [hard_001] ep=1 step=1 reward=0.500
	2026-04-12 22:20:27,026 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:27,026 [INFO] Retrying request to /chat/completions in 0.446455 seconds
	2026-04-12 22:20:28,422 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:28,424 [INFO] Retrying request to /chat/completions in 0.765570 seconds
	2026-04-12 22:20:29,526 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:29,527 [WARNING] LLM call failed: Error code: 429 - {'error': {'message': 'Rate limit exceeded: limit_rpm/qwen/qwen3-next-80b-a3b-instruct-2509/94248808-ba97-4e3c-be60-1cb0928db51d. High demand for qwen/qwen3-next-80b-a3b-instruct:free on OpenRouter - limited to 8 requests per minute. Please retry shortly.', 'code': 429, 'metadata': {'headers': {'X-RateLimit-Limit': '8', 'X-RateLimit-Remaining': '0', 'X-RateLimit-Reset': '1776012660000'}, 'provider_name': None}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=2 action=submit reward=1.00 done=true error=null
	2026-04-12 22:20:29,927 [INFO] [hard_001] ep=1 step=2 reward=1.000
	2026-04-12 22:20:30,587 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:30,589 [INFO] Retrying request to /chat/completions in 0.408676 seconds
	2026-04-12 22:20:31,424 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:31,426 [INFO] Retrying request to /chat/completions in 0.778604 seconds
	2026-04-12 22:20:32,608 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:32,611 [WARNING] LLM call failed: Error code: 429 - {'error': {'message': 'Rate limit exceeded: limit_rpm/qwen/qwen3-next-80b-a3b-instruct-2509/94248808-ba97-4e3c-be60-1cb0928db51d. High demand for qwen/qwen3-next-80b-a3b-instruct:free on OpenRouter - limited to 8 requests per minute. Please retry shortly.', 'code': 429, 'metadata': {'headers': {'X-RateLimit-Limit': '8', 'X-RateLimit-Remaining': '0', 'X-RateLimit-Reset': '1776012660000'}, 'provider_name': None}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=1 action=remove_duplicates reward=0.50 done=false error=null
	2026-04-12 22:20:33,065 [INFO] [hard_001] ep=2 step=1 reward=0.500
	2026-04-12 22:20:33,472 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:33,473 [INFO] Retrying request to /chat/completions in 0.458515 seconds
	2026-04-12 22:20:34,394 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:34,395 [INFO] Retrying request to /chat/completions in 0.825773 seconds
	2026-04-12 22:20:35,545 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:35,547 [WARNING] LLM call failed: Error code: 429 - {'error': {'message': 'Rate limit exceeded: limit_rpm/qwen/qwen3-next-80b-a3b-instruct-2509/94248808-ba97-4e3c-be60-1cb0928db51d. High demand for qwen/qwen3-next-80b-a3b-instruct:free on OpenRouter - limited to 8 requests per minute. Please retry shortly.', 'code': 429, 'metadata': {'headers': {'X-RateLimit-Limit': '8', 'X-RateLimit-Remaining': '0', 'X-RateLimit-Reset': '1776012660000'}, 'provider_name': None}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=2 action=submit reward=1.00 done=true error=null
	2026-04-12 22:20:35,874 [INFO] [hard_001] ep=2 step=2 reward=1.000
	2026-04-12 22:20:36,572 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:36,573 [INFO] Retrying request to /chat/completions in 0.417865 seconds
	2026-04-12 22:20:37,307 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:37,309 [INFO] Retrying request to /chat/completions in 0.985335 seconds
	2026-04-12 22:20:38,616 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:38,618 [WARNING] LLM call failed: Error code: 429 - {'error': {'message': 'Rate limit exceeded: limit_rpm/qwen/qwen3-next-80b-a3b-instruct-2509/94248808-ba97-4e3c-be60-1cb0928db51d. High demand for qwen/qwen3-next-80b-a3b-instruct:free on OpenRouter - limited to 8 requests per minute. Please retry shortly.', 'code': 429, 'metadata': {'headers': {'X-RateLimit-Limit': '8', 'X-RateLimit-Remaining': '0', 'X-RateLimit-Reset': '1776012660000'}, 'provider_name': None}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=1 action=remove_duplicates reward=0.50 done=false error=null
	2026-04-12 22:20:38,959 [INFO] [hard_001] ep=3 step=1 reward=0.500
	2026-04-12 22:20:39,310 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:39,311 [INFO] Retrying request to /chat/completions in 0.375729 seconds
	2026-04-12 22:20:40,045 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:40,046 [INFO] Retrying request to /chat/completions in 0.926493 seconds
	2026-04-12 22:20:41,322 [INFO] HTTP Request: POST https://openrouter.ai/api/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
	2026-04-12 22:20:41,325 [WARNING] LLM call failed: Error code: 429 - {'error': {'message': 'Rate limit exceeded: limit_rpm/qwen/qwen3-next-80b-a3b-instruct-2509/94248808-ba97-4e3c-be60-1cb0928db51d. High demand for qwen/qwen3-next-80b-a3b-instruct:free on OpenRouter - limited to 8 requests per minute. Please retry shortly.', 'code': 429, 'metadata': {'headers': {'X-RateLimit-Limit': '8', 'X-RateLimit-Remaining': '0', 'X-RateLimit-Reset': '1776012660000'}, 'provider_name': None}}, 'user_id': 'user_36ZHxohbiGTyLfq9vP3Sf6ojZMM'}
	[STEP] step=2 action=submit reward=1.00 done=true error=null
	2026-04-12 22:20:41,690 [INFO] [hard_001] ep=3 step=2 reward=1.000
	[END] success=true steps=6 score=0.750 rewards=0.50,1.00,0.50,1.00,0.50,1.00
	2026-04-12 22:20:41,690 [INFO]
	Task score: 0.7500 ± 0.0000

	=======================================================
	INFERENCE RESULTS
	=======================================================
	Model : qwen/qwen3-next-80b-a3b-instruct:free
	Seed : 42 \| 3 episodes x 8 steps
	Elapsed : 51.6s

	easy_001 0.7500 +- 0.0000 \|############### \|
	medium_001 0.5000 +- 0.0000 \|########## \|
	hard_001 0.7500 +- 0.0000 \|############### \|

	OVERALL 0.6667
	=======================================================
	```

	---

	---

	## 🎯 Tasks

	3 progressive difficulty tasks:

	\| # \| task_id \| Difficulty \| Description \| Expected Agent Score \|
	\|---\|---------\|-----------\|-------------\|-------------------\|
	\| 1 \| `task_1_basic_cleaning` \| 🟢 Beginner \| Fix missing values, standardize formats \| 0.70–0.85 \|
	\| 2 \| `task_2_advanced_cleaning` \| 🟡 Intermediate \| Handle outliers, correct data types, deduplication \| 0.55–0.70 \|
	\| 3 \| `task_3_full_pipeline` \| 🔴 Advanced \| Complete end-to-end data cleaning pipeline \| 0.40–0.60 \|

	---

	## 🎮 Environment Workflow

	The agent receives a tabular dataset with known quality issues. It must select the appropriate cleaning operation, apply it correctly, and justify its choice.

	### Action Space

	```json
	{
	"action_type": "fix_missing_values \| remove_outliers \| standardize \| deduplicate \| correct_types \| fill_dates",
	"column_index": 3,
	"confidence": 0.85,
	"reasoning": "string explaining the choice",
	"session_id": "session id from reset"
	}
	```

	### Observation Space

	```json
	{
	"dataset_preview": "First 5 rows of data",
	"column_info": "Column names, types, missing stats",
	"reward": 0.75,
	"feedback": "Detailed human-readable feedback",
	"rows_cleaned": 12,
	"issues_remaining": 3,
	"done": false,
	"session_id": "ses_a1b2c3d4"
	}
	```

	---

	## 📊 Reward System (7 Components)

	\| Component \| Weight \| Description \|
	\|-----------\|--------\|-------------\|
	\| Correctness \| 0.35 \| Operation actually fixed the issue \|
	\| Appropriate action \| 0.25 \| Right operation selected for the problem \|
	\| Confidence calibration \| 0.15 \| Confidence matches actual correctness \|
	\| No side effects \| 0.15 \| Cleaning didn't break other columns \|
	\| Efficiency \| 0.10 \| Minimum steps to clean dataset \|

	---

	## 📈 Metrics

	✅ Data Quality Score
	✅ Completeness Ratio
	✅ Uniqueness Ratio
	✅ Type Consistency
	✅ Cleaning Efficiency
	✅ Action Appropriateness

	---

	## 📋 Supported Data Cleaning Operations

	\| Operation \| Description \|
	\|-----------\|-------------\|
	\| `fix_missing_values` \| Mean/median/mode imputation \|
	\| `remove_outliers` \| IQR / Z-score outlier removal \|
	\| `standardize` \| Normalize numerical columns \|
	\| `deduplicate` \| Remove duplicate rows \|
	\| `correct_types` \| Fix incorrect data types \|
	\| `fill_dates` \| Standardize date formats \|
	\| `handle_categories` \| Encode categorical columns \|
	\| `remove_duplicates` \| Drop identical rows \|
	\| `trim_strings` \| Clean whitespace from text columns \|
	\| `correct_values` \| Fix known invalid values \|

	---

	## 📀 API Endpoints

	### OpenEnv Required

	\| Method \| Endpoint \| Description \|
	\|--------\|----------\|-------------\|
	\| `GET` \| `/tasks` \| List all 3 tasks + action schema \|
	\| `POST` \| `/grader` \| Score a completed episode (0.0–1.0) \|
	\| `POST` \| `/baseline` \| Run built-in heuristic baseline \|
	\| `GET` \| `/metadata` \| Environment name, version, description \|
	\| `GET` \| `/schema` \| Action, observation, and state JSON schemas \|
	\| `GET` \| `/health` \| Health check \|

	### Environment

	\| Method \| Endpoint \| Description \|
	\|--------\|----------\|-------------\|
	\| `POST` \| `/reset` \| Start new episode \|
	\| `POST` \| `/step` \| Submit cleaning action \|
	\| `GET` \| `/state` \| Get current episode state \|

	---

	## 💻 Development

	```bash
	# Install with dev dependencies
	pip install -e ".[dev]"

	# Run tests
	pytest tests/ -v

	# Validate OpenEnv compliance
	openenv validate --url http://localhost:7860 --verbose
	```

	---

	## 🔗 Links

	\| \| \|
	\|---\|---\|
	\| 📦 GitHub \| https://github.com/SairajMN/AutoClean-AI \|
	\| 📖 Interactive API Docs \| http://localhost:7860/redoc \|
	\| 🔧 OpenEnv Framework \| https://github.com/meta-pytorch/OpenEnv \|

	---

	Built for Data Cleaning AI Agents · MIT License