Auto_ML / README.md

abhiraj12

added features

1120492 about 1 month ago


title: AutoML Studio
emoji: ✨
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false

🎨 AutoML Studio

This repository is configured to run on Hugging Face Spaces as a single Docker Space that serves:

Streamlit frontend through Nginx on public port 7860
FastAPI backend internally on 127.0.0.1:8000
Celery worker plus Redis inside the same container for background training jobs

AutoML Studio is an end-to-end automated machine learning platform for tabular data. You upload a dataset, and it uses "Dataset DNA" heuristics to preprocess the data automatically, shortlist the algorithms that suit it, and train competitive ML models, all through a multi-page Streamlit frontend paired with a FastAPI backend.

(Placeholder: Add a screenshot or GIF of the dashboard here)

✨ Features

Dataset DNA Analyzer: Parses uploaded datasets (CSV, JSON, Excel, Parquet) to determine their shape, calculate missing-value distributions, identify imbalances, and heuristically suggest a target configuration.
Auto-Imputing & Auto-Encoding: Removes most manual data cleaning. The backend applies a ColumnTransformer that routes numeric columns through median imputation and a StandardScaler, and categorical columns through constant-value imputation and a OneHotEncoder.
Smart Model Selection (Pro Mode): Rather than testing every algorithm blindly, it inspects the shape and composition of your dataset to build a tailored roster of candidate algorithms (e.g. SVM for small datasets, XGBoost for high-dimensional data).
Time Travel Training Logs: View live metric updates as pipelines are iteratively optimized.
Auto Report (Story Mode): Generates a "wrap-up" narrative explaining what data was analyzed, which algorithm performed best, and why.
Deep Insights: Explore where the model fails via the Mistake Analyzer, view low-confidence classifications, and receive "Explain-Like-I'm-5" ML coaching strategies.
One-Click Deploy Bundles: Bundles your trained .pkl model with a generated FastAPI script and exports them together, giving you a deployment-ready inference server in one click.
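To make the Auto-Imputing & Auto-Encoding feature concrete, here is a minimal sketch of the kind of scikit-learn preprocessing it describes. The helper name `build_preprocessor`, the dtype-based column split, the `"missing"` fill value, and the demo data are assumptions for illustration; the backend's actual code is not shown in this README.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_preprocessor(df: pd.DataFrame) -> ColumnTransformer:
    """Hypothetical helper: route columns by dtype (an assumed rule)."""
    numeric_cols = df.select_dtypes(include="number").columns.tolist()
    categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

    # Numeric columns: fill gaps with the median, then standardize.
    numeric_pipe = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    # Categorical columns: fill gaps with a constant label, then one-hot encode.
    categorical_pipe = Pipeline([
        ("impute", SimpleImputer(strategy="constant", fill_value="missing")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", numeric_pipe, numeric_cols),
        ("cat", categorical_pipe, categorical_cols),
    ])


# Made-up frame with one missing value in each column type.
demo = pd.DataFrame({
    "age": [25.0, np.nan, 40.0],
    "city": ["Pune", np.nan, "Delhi"],
})
features = build_preprocessor(demo).fit_transform(demo)
print(features.shape)
```

Setting `handle_unknown="ignore"` keeps the encoder from raising on categories that appear only at prediction time, which matters for a tool that serves exported models.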

🏗️ Architecture

AutoML Studio
├── frontend/
│   ├── app.py                 # Streamlit entry point
│   ├── pages/                 # Streamlit workflows
│   ├── style.css              # Shared visual system
│   └── ui_shell.py            # Shared UI helpers
├── backend/
│   ├── main.py               # FastAPI entry point
│   └── core/
│       ├── data_profiler.py  # Dataset heuristic extraction logic
│       ├── insights.py       # Narrative generation and AI coaching synthesis
│       └── export.py         # ZIP creation for trained model bundles
├── requirements.txt          # Shared dependencies
├── start.sh                  # Docker / HF launcher
├── run.sh                    # Local development launcher
└── README.md
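As a rough illustration of what a module like backend/core/data_profiler.py might compute, the sketch below derives the dataset shape, the share of missing values per column, and a simple class-balance ratio with pandas. The function name `profile_dataset`, the returned keys, and the balance metric are assumptions, not the project's actual API.

```python
import pandas as pd


def profile_dataset(df: pd.DataFrame, target: str) -> dict:
    """Hypothetical profiler: shape, missing-value share, class balance."""
    class_share = df[target].value_counts(normalize=True)
    return {
        "rows": int(df.shape[0]),
        "columns": int(df.shape[1]),
        # Fraction of missing cells per column (0.25 means 25% missing).
        "missing_share": df.isna().mean().to_dict(),
        # Rarest class share divided by the most common one; 1.0 is balanced.
        "balance_ratio": float(class_share.min() / class_share.max()),
    }


# Made-up data: one missing income value and a 3:1 class split.
demo = pd.DataFrame({
    "income": [30.0, None, 52.0, 41.0],
    "churned": ["no", "no", "no", "yes"],
})
report = profile_dataset(demo, target="churned")
print(report)
```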

🚀 Installation & Usage

Prerequisites

Python 3.8+
pip package manager

1. Setup Environment

Clone the repository, then create and activate a virtual environment:

python3 -m venv venv
source venv/bin/activate

2. Install Dependencies

pip install -r requirements.txt

3. Quick Start (Recommended)

You can launch the FastAPI backend, worker stack, and Streamlit frontend using the provided shell script:

bash run.sh

4. Manual Launch

If you prefer to run it manually:

Start the backend:

cd backend
uvicorn main:app --host 0.0.0.0 --port 8000

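Then start the Streamlit frontend (entry point: frontend/app.py) in a second terminal and open it at http://localhost:8501. Background training jobs also depend on Redis and the Celery worker, so use run.sh if you want the full worker stack started for you.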

📖 How to Use

Once the application is running at http://localhost:8501, follow these steps:

Upload Dataset: Navigate to the Home tab and drag-and-drop your dataset (CSV, JSON, Excel, or Parquet).
Review DNA: Click on the DNA tab to review the automatic imputation plan and exploratory data analysis.
Train Engine: Go to Training & Results to start the parallel training pipeline. Watch the time-travel metrics update live.
Export: Once training completes, download the deployment-ready .zip bundle to serve your model immediately.
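The export step can be pictured with a small standard-library sketch that writes a pickled model and an inference script into an in-memory ZIP archive. The file names `model.pkl` and `serve.py`, the `build_bundle` helper, and the contents of the script template are assumptions for illustration; the bundle the app actually produces may be laid out differently.

```python
import io
import pickle
import zipfile

# Assumed inference script; the real generated script is not shown in this README.
SERVE_TEMPLATE = '''\
import pickle
from fastapi import FastAPI

app = FastAPI()
with open("model.pkl", "rb") as fh:
    model = pickle.load(fh)

@app.post("/predict")
def predict(rows: list[list[float]]):
    return {"predictions": model.predict(rows).tolist()}
'''


def build_bundle(model: object) -> bytes:
    """Hypothetical helper: zip a pickled model next to a serving script."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("model.pkl", pickle.dumps(model))
        archive.writestr("serve.py", SERVE_TEMPLATE)
    return buffer.getvalue()


# Any picklable object stands in for a trained model here.
bundle = build_bundle({"placeholder": "trained model goes here"})
with zipfile.ZipFile(io.BytesIO(bundle)) as archive:
    bundle_names = sorted(archive.namelist())
    restored = pickle.loads(archive.read("model.pkl"))
print(bundle_names)
```

Because unpickling executes arbitrary code, a bundle like this should only be loaded from a source you trust.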

🌐 Deployment Options

Docker

# Build and run with Docker Compose
docker-compose up -d

# Then open the app on the container's public port (set by PORT, default 7860)

Production Considerations

Database: Add PostgreSQL for production data persistence
Redis: Required for background job queuing
Storage: Use cloud storage (S3, GCS) for large model files
Scaling: Consider a load balancer for multiple instances
Security: Add authentication, rate limiting, and input validation

Configuration Notes

PORT controls the public Nginx listener. Default is 7860.
AUTOML_ALLOWED_ORIGINS accepts a comma-separated CORS allowlist for the FastAPI backend.
STREAMLIT_ENABLE_CORS and STREAMLIT_ENABLE_XSRF_PROTECTION let you tighten frontend security for non-HF deployments.
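As a sketch of how these variables could be read, the snippet below parses PORT and the comma-separated AUTOML_ALLOWED_ORIGINS value using only the standard library. The `load_settings` helper, its return shape, and the example origins are assumptions; how the backend actually parses its environment is not shown here.

```python
import os


def load_settings(env=None) -> dict:
    """Hypothetical helper: read the documented variables with their defaults."""
    env = os.environ if env is None else env
    raw_origins = env.get("AUTOML_ALLOWED_ORIGINS", "")
    # Split on commas, trim whitespace, and drop empty entries.
    origins = [item.strip() for item in raw_origins.split(",") if item.strip()]
    return {
        "port": int(env.get("PORT", "7860")),
        "allowed_origins": origins,
    }


# A plain dict stands in for os.environ so the example is reproducible.
settings = load_settings(
    {"AUTOML_ALLOWED_ORIGINS": "https://a.example, https://b.example"}
)
print(settings)
```

A list in this form could then be passed as `allow_origins` to FastAPI's `CORSMiddleware`.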

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request or open an Issue for any bugs, feature requests, or improvements.

📄 License

This project is licensed under the MIT License.
