---
title: SimpleClean
emoji: 🧹
colorFrom: yellow
colorTo: pink
sdk: docker
app_port: 8501
tags:
- streamlit
- data-cleaning
- preprocessing
- imputation
- encoding
pinned: false
short_description: Clean your data interactively — no code required.
---
# SimpleClean
An interactive Streamlit dashboard for cleaning and preprocessing datasets: handle missing values, encode categories, scale features, and remove duplicates.
## Author
Eduardo Nacimiento García
📧 enacimie@ull.edu.es
📜 Apache 2.0 License
## Features
- Upload CSV or use built-in demo dataset
- Data quality report: missing values, duplicates, data types
- Interactive cleaning:
  - 🧹 Remove duplicate rows
  - 🩹 Impute missing values (Mean, Median, Mode, Constant, KNN)
  - 🔠 Encode categorical variables (Label Encoding, One-Hot Encoding)
  - 📏 Scale numeric variables (StandardScaler, MinMaxScaler)
- Visualize missing data with Plotly
- Download cleaned dataset as CSV
- Reset to original anytime
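
For readers who want to reproduce the cleaning steps outside the dashboard, here is a minimal sketch of the same kind of pipeline using pandas and scikit-learn. It is not the app's actual code: the function names `quality_report` and `clean`, the default `median` strategy, and the choice of one-hot encoding plus `StandardScaler` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.preprocessing import StandardScaler


def quality_report(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: per-column dtype and missing-value counts."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "missing_pct": (df.isna().mean() * 100).round(1),
    })


def clean(df: pd.DataFrame, strategy: str = "median") -> pd.DataFrame:
    """Hypothetical helper: deduplicate, impute, scale, then one-hot encode."""
    # 1. Remove exact duplicate rows.
    out = df.drop_duplicates().reset_index(drop=True)
    num_cols = out.select_dtypes(include="number").columns
    cat_cols = out.select_dtypes(exclude="number").columns

    if len(num_cols) > 0:
        # 2. Impute numeric columns ("mean", "median", "most_frequent" or "knn").
        #    Assumes no numeric column is entirely empty.
        if strategy == "knn":
            imputer = KNNImputer(n_neighbors=3)
        else:
            imputer = SimpleImputer(strategy=strategy)
        out[num_cols] = imputer.fit_transform(out[num_cols])
        # 3. Scale numeric columns to zero mean and unit variance.
        out[num_cols] = StandardScaler().fit_transform(out[num_cols])

    # 4. Fill categorical gaps with the most frequent value (the mode).
    for col in cat_cols:
        out[col] = out[col].fillna(out[col].mode().iloc[0])

    # 5. One-hot encode the categorical columns.
    return pd.get_dummies(out, columns=list(cat_cols))
```

Calling `quality_report(df)` before and after `clean(df)` gives a quick check that no missing values remain. Swapping `StandardScaler` for `MinMaxScaler` changes only step 3.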
## Demo Dataset
Includes sample data with:
- Numeric columns: Age, Income, Satisfaction
- Categorical columns: City, Gender, Has_Children
- Intentional missing values and duplicates
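
A dataset with this shape can be generated in a few lines, which is handy for testing a cleaning workflow locally. The sketch below is an assumption about how such data might be built, not the app's own generator: the function name `make_demo_dataset`, the value ranges, the city names, and the 10% missing rate are all illustrative.

```python
import numpy as np
import pandas as pd


def make_demo_dataset(n_rows: int = 100, seed: int = 0) -> pd.DataFrame:
    """Hypothetical generator for a small dataset with gaps and duplicates."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        # Numeric columns (ranges are illustrative).
        "Age": rng.integers(18, 80, size=n_rows).astype(float),
        "Income": rng.normal(30000, 8000, size=n_rows).round(2),
        "Satisfaction": rng.integers(1, 11, size=n_rows).astype(float),
        # Categorical columns (categories are illustrative).
        "City": rng.choice(["Madrid", "Barcelona", "Sevilla"], size=n_rows),
        "Gender": rng.choice(["Female", "Male"], size=n_rows),
        "Has_Children": rng.choice(["Yes", "No"], size=n_rows),
    })
    # Blank out 10% of the rows in one numeric and one categorical column.
    for col in ["Income", "City"]:
        idx = rng.choice(n_rows, size=n_rows // 10, replace=False)
        df.loc[idx, col] = np.nan
    # Append copies of the first five rows as intentional duplicates.
    return pd.concat([df, df.head(5)], ignore_index=True)
```

With the defaults this returns 105 rows, of which the last five repeat the first five.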
## Deployment
Ready for [Hugging Face Spaces](https://huggingface.co/spaces) (free tier).
> ⚠️ This Space uses `sdk: docker`, so the repository must include a `Dockerfile`.
## Requirements
- Python 3.8+
- Streamlit, pandas, numpy, scikit-learn, plotly
---
💡 Tip: Clean step-by-step → preview changes → download when ready!