metadata

title: SimpleClean
emoji: 🧹
colorFrom: yellow
colorTo: pink
sdk: docker
app_port: 8501
tags:
  - streamlit
  - data-cleaning
  - preprocessing
  - imputation
  - encoding
pinned: false
short_description: Clean your data interactively — no code required.

SimpleClean

Interactive Streamlit dashboard to clean and preprocess your datasets: handle missing values, encode categories, scale features, remove duplicates.

Author

Eduardo Nacimiento García
📧 enacimie@ull.edu.es
📜 Apache 2.0 License

Features

- Upload CSV or use built-in demo dataset
- Data quality report: missing values, duplicates, data types
- Interactive cleaning:
  - 🧹 Remove duplicate rows
  - 🩹 Impute missing values (Mean, Median, Mode, Constant, KNN)
  - 🔠 Encode categorical variables (Label Encoding, One-Hot Encoding)
  - 📏 Scale numeric variables (StandardScaler, MinMaxScaler)
- Visualize missing data with Plotly
- Download cleaned dataset as CSV
- Reset to original anytime
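
For readers who want to reproduce the cleaning steps outside the dashboard, the options listed above map onto standard pandas and scikit-learn calls. The following is a minimal sketch, not the app's actual code: `quality_report` and `clean` are hypothetical helper names, the parameter names are invented for illustration, and filling missing categories with the column mode before encoding is an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.preprocessing import LabelEncoder, MinMaxScaler, StandardScaler


def quality_report(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: per-column data type and missing-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
    })


def clean(df, numeric_cols, categorical_cols,
          impute="median", encode="onehot", scale="standard"):
    """Hypothetical helper chaining the four cleaning steps."""
    # 1. Remove duplicate rows.
    out = df.drop_duplicates().reset_index(drop=True)

    # 2. Impute numeric columns. `impute` is "knn" or a SimpleImputer
    #    strategy: "mean", "median", "most_frequent" (mode), "constant".
    imputer = KNNImputer() if impute == "knn" else SimpleImputer(strategy=impute)
    out[numeric_cols] = imputer.fit_transform(out[numeric_cols])

    # Assumption: missing categories are filled with the column mode.
    for col in categorical_cols:
        out[col] = out[col].fillna(out[col].mode().iloc[0])

    # 3. Scale numeric columns.
    scaler = StandardScaler() if scale == "standard" else MinMaxScaler()
    out[numeric_cols] = scaler.fit_transform(out[numeric_cols])

    # 4. Encode categorical columns.
    if encode == "onehot":
        out = pd.get_dummies(out, columns=categorical_cols)
    else:  # label encoding: one integer code per category
        for col in categorical_cols:
            out[col] = LabelEncoder().fit_transform(out[col])
    return out


# Tiny illustrative input (made-up values): one duplicate row, two gaps.
raw = pd.DataFrame({
    "Age": [25.0, 25.0, np.nan, 40.0, 31.0],
    "City": ["Madrid", "Madrid", "Lima", None, "Madrid"],
})
report = quality_report(raw)
cleaned = clean(raw, ["Age"], ["City"],
                impute="median", encode="onehot", scale="minmax")
```

Scaling is applied before encoding here so that the one-hot indicator columns are left untouched; a different order may suit other datasets.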

Demo Dataset

Includes sample data with:

Numeric columns: Age, Income, Satisfaction
Categorical columns: City, Gender, Has_Children
Intentional missing values and duplicates
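
A dataset with this shape can be approximated locally for testing. The sketch below is an assumption-laden stand-in, not the app's bundled data: `make_demo_dataset` is a hypothetical function, and the value ranges, category labels, row count, and missing-value rate are all invented. Only the six column names come from the description above.

```python
import numpy as np
import pandas as pd


def make_demo_dataset(n_rows=100, n_duplicates=5, missing_rate=0.1, seed=42):
    """Hypothetical generator for a demo-like dataset (all values invented)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        # Numeric columns
        "Age": rng.integers(18, 80, n_rows).astype(float),
        "Income": rng.normal(30000, 8000, n_rows).round(2),
        "Satisfaction": rng.integers(1, 6, n_rows).astype(float),
        # Categorical columns
        "City": rng.choice(["Madrid", "Barcelona", "Sevilla"], n_rows),
        "Gender": rng.choice(["F", "M"], n_rows),
        "Has_Children": rng.choice(["Yes", "No"], n_rows),
    })

    # Intentional missing values: blank out a random subset of cells.
    holes = rng.random(df.shape) < missing_rate
    df = df.mask(holes)

    # Intentional duplicates: append copies of the first few rows.
    return pd.concat([df, df.head(n_duplicates)], ignore_index=True)


demo = make_demo_dataset()
```

Fixing the seed keeps the generated frame reproducible, which makes it easier to compare cleaning results between runs.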

Deployment

Ready for Hugging Face Spaces (free tier).

⚠️ The Space uses sdk: docker, so the repository must include a Dockerfile.

Requirements

Python 3.8+
Streamlit, pandas, numpy, scikit-learn, plotly

💡 Tip: Clean step-by-step → preview changes → download when ready!
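
The step-by-step workflow in the tip (apply a step, preview, download, or reset) can be modelled without any UI code. This is a rough sketch under stated assumptions: `CleaningSession` and its methods are hypothetical names, and how the app itself stores the original and working copies is not documented here.

```python
import pandas as pd


class CleaningSession:
    """Hypothetical holder for an untouched original and a working copy."""

    def __init__(self, df: pd.DataFrame):
        self.original = df.copy()
        self.current = df.copy()
        self.history = []  # names of the steps applied so far

    def apply(self, name, step):
        """Run one cleaning step and return a preview of the result."""
        self.current = step(self.current)
        self.history.append(name)
        return self.current.head()

    def reset(self):
        """Discard every applied step and go back to the original data."""
        self.current = self.original.copy()
        self.history.clear()

    def to_csv_bytes(self) -> bytes:
        """Serialize the working copy, e.g. as the payload of a download."""
        return self.current.to_csv(index=False).encode("utf-8")


session = CleaningSession(pd.DataFrame({"x": [1, 1, 2]}))
preview = session.apply("drop duplicates", lambda d: d.drop_duplicates())
csv_bytes = session.to_csv_bytes()
session.reset()
```

Keeping the original frame separate from the working copy is what makes a reset cheap: no step ever has to be undone individually.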