Update README.md
README.md
CHANGED
@@ -1,19 +1,56 @@
 ---
 title: SimpleClean
-emoji:
-colorFrom:
-colorTo:
+emoji: 🧹
+colorFrom: yellow
+colorTo: pink
 sdk: docker
 app_port: 8501
 tags:
-- streamlit
+- streamlit
+- data-cleaning
+- preprocessing
+- imputation
+- encoding
 pinned: false
-short_description:
+short_description: Clean your data interactively — no code required.
 ---
 
-#
+# SimpleClean
 
-
+Interactive Streamlit dashboard to clean and preprocess your datasets: handle missing values, encode categories, scale features, remove duplicates.
 
-
-
+## Author
+Eduardo Nacimiento García
+📧 enacimie@ull.edu.es
+📜 Apache 2.0 License
+
+## Features
+- Upload CSV or use built-in demo dataset
+- Data quality report: missing values, duplicates, data types
+- Interactive cleaning:
+  - 🧹 Remove duplicate rows
+  - 🩹 Impute missing values (Mean, Median, Mode, Constant, KNN)
+  - 🔠 Encode categorical variables (Label Encoding, One-Hot Encoding)
+  - 📏 Scale numeric variables (StandardScaler, MinMaxScaler)
+- Visualize missing data with Plotly
+- Download cleaned dataset as CSV
+- Reset to original anytime
+
+## Demo Dataset
+Includes sample data with:
+- Numeric columns: Age, Income, Satisfaction
+- Categorical columns: City, Gender, Has_Children
+- Intentional missing values and duplicates
+
+## Deployment
+Ready for [Hugging Face Spaces](https://huggingface.co/spaces) (free tier).
+
+> ⚠️ Uses `sdk: docker` — include `Dockerfile`.
+
+## Requirements
+- Python 3.8+
+- Streamlit, pandas, numpy, scikit-learn, plotly
+
+---
+
+💡 Tip: Clean step-by-step → preview changes → download when ready!
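
Note on the Features list: this commit only touches the README, so the app code is not shown here. As a rough illustration of what the listed cleaning steps amount to, here is a minimal pandas/scikit-learn sketch. It is not the app's actual implementation. The helper names (`make_demo`, `quality_report`, `clean`) and all data values are hypothetical; only the column names come from the README's Demo Dataset section. It covers duplicate removal, median/mean/mode imputation, label and one-hot encoding, and standard scaling, and leaves out KNN imputation, MinMax scaling, and the Streamlit UI.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler


def make_demo() -> pd.DataFrame:
    """Tiny stand-in for the demo dataset (values invented for illustration)."""
    return pd.DataFrame(
        {
            "Age": [25, 32, np.nan, 41, 32],
            "Income": [30000.0, 52000.0, 47000.0, np.nan, 52000.0],
            "Satisfaction": [3, 4, 5, 2, 4],
            "City": ["Madrid", "Sevilla", None, "Madrid", "Sevilla"],
            "Gender": ["F", "M", "F", "M", "M"],
            "Has_Children": ["No", "Yes", "No", "Yes", "Yes"],
        }
    )  # the last row intentionally duplicates the second one


def quality_report(df: pd.DataFrame) -> dict:
    """Summarize missing values, duplicate rows, and column dtypes."""
    return {
        "rows": len(df),
        "duplicates": int(df.duplicated().sum()),
        "missing": df.isna().sum().to_dict(),
        "dtypes": df.dtypes.astype(str).to_dict(),
    }


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply one possible sequence of the cleaning steps named in the README."""
    # 1. Remove duplicate rows.
    out = df.drop_duplicates().reset_index(drop=True)

    # 2. Impute missing values: median, mean, and mode respectively.
    out["Age"] = out["Age"].fillna(out["Age"].median())
    out["Income"] = out["Income"].fillna(out["Income"].mean())
    out["City"] = out["City"].fillna(out["City"].mode().iloc[0])

    # 3. Encode categoricals: label encoding for Gender, one-hot for City.
    out["Gender"] = LabelEncoder().fit_transform(out["Gender"])
    out = pd.get_dummies(out, columns=["City"])

    # 4. Scale numeric columns to zero mean and unit variance.
    numeric = ["Age", "Income", "Satisfaction"]
    scaled = StandardScaler().fit_transform(out[numeric])
    for i, col in enumerate(numeric):
        out[col] = scaled[:, i]
    return out


if __name__ == "__main__":
    raw = make_demo()
    print(quality_report(raw))
    print(clean(raw))
```

The order matters in this sketch: duplicates are dropped before imputation so repeated rows do not skew the fill statistics, and scaling comes last so it sees the imputed values.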