---
title: AutoML
sdk: docker
emoji: 🚀
colorFrom: red
colorTo: purple
---
# AutoML Project
## Overview
This project is an Automated Machine Learning (AutoML) platform that streamlines the machine learning workflow for **CSV-formatted datasets**. It is aimed at students and researchers who need a quick setup for **demo sessions** in their data science and AI projects. It combines automated data cleaning, supervised and unsupervised model training, an AI-powered data assistant, and an interactive web frontend for user interaction and visualization.
## Features
* **Automated Data Cleaning:** Utilities to preprocess and clean raw datasets, ensuring data quality for model training.
* **Supervised Learning Models:** Implementation and integration of various supervised machine learning algorithms.
* **Unsupervised Learning Models:** Support for unsupervised learning techniques for tasks like clustering and dimensionality reduction.
* **AI Data Assistant (Agentic Capability):** A Retrieval Augmented Generation (RAG) based AI assistant designed to help users interact with and understand their **CSV datasets**. This component demonstrates agentic capabilities by intelligently processing natural language queries, retrieving relevant information from the dataset, and assisting with data exploration and analysis.
* **Interactive Web Frontend:** A user-friendly web interface built with HTML, CSS, and JavaScript for interacting with the AutoML functionalities and visualizing results.
* **Data Visualization:** Tools to generate insightful charts and graphs from processed data and model outputs.
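As a rough illustration of the retrieval step behind the AI Data Assistant, the sketch below ranks CSV rows by keyword overlap with a question and builds a prompt from the best matches. It uses only the Python standard library and is not the code in `rag/`. The function names (`tokenize`, `retrieve_rows`, `build_prompt`) and the overlap scoring are assumptions made for illustration, and the call that sends the prompt to the language model is left out.

```python
import csv
import io
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_rows(csv_text, query, top_k=2):
    """Return up to top_k rows that share the most tokens with the query."""
    query_tokens = tokenize(query)
    scored = []
    for index, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        # Treat each row as a small "document" made of column names and values.
        row_text = " ".join(f"{key} {value}" for key, value in row.items())
        score = len(query_tokens & tokenize(row_text))
        if score > 0:
            scored.append((score, index, row))
    # Highest overlap first; ties keep the original row order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [row for _, _, row in scored[:top_k]]


def build_prompt(query, rows):
    """Combine the retrieved rows and the question into one prompt string."""
    context = "\n".join(str(row) for row in rows)
    return f"Context rows:\n{context}\n\nQuestion: {query}"
```

A real assistant would likely replace the keyword overlap with embedding-based similarity; the structure (retrieve, then prompt) stays the same.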
## Project Structure
The project is organized into the following main files and directories:
* `.env`: Environment variables, including API keys.
* `app.py`: The main application entry point.
* `config.py`: Configuration settings for the application.
* `frontend/`: Contains the static files for the web-based user interface (HTML, CSS, JavaScript, images).
* `models/`: Houses the implementations for supervised and unsupervised machine learning models.
* `rag/`: Contains modules related to the Retrieval Augmented Generation (RAG) system, including memory management and query processing.
* `utils/`: Utility functions for data cleaning, metrics calculation, and other common tasks.
* `visuals/`: Modules dedicated to data visualization and chart generation.
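To give a feel for the kind of cleaning a helper in `utils/` might perform, here is a minimal sketch that trims whitespace and drops blank, malformed, and duplicate rows from CSV text. It is an assumption-laden example, not the project's implementation: the function name `clean_csv` and the specific cleaning rules are hypothetical, and it relies only on the standard library rather than whatever data libraries the project actually uses.

```python
import csv
import io


def clean_csv(csv_text):
    """Return (header, rows) with whitespace trimmed and bad rows removed."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return [], []
    header = [name.strip() for name in rows[0]]
    seen = set()
    cleaned = []
    for raw in rows[1:]:
        values = tuple(cell.strip() for cell in raw)
        # Skip blank rows and rows whose width does not match the header.
        if len(values) != len(header) or not any(values):
            continue
        # Skip exact duplicates of a row that was already kept.
        if values in seen:
            continue
        seen.add(values)
        cleaned.append(dict(zip(header, values)))
    return header, cleaned
```

Dropping malformed rows is the simplest policy; a fuller cleaner might instead impute missing values or report the rejected rows.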
## Installation
To set up the project locally, follow these steps:
1. **Clone the repository:**
```bash
git clone https://github.com/Al1Abdullah/AutoML.git
cd AutoML
```
2. **Create a virtual environment (recommended):**
```bash
python -m venv venv
# On Windows
.\venv\Scripts\activate
# On macOS/Linux
source venv/bin/activate
```
3. **Install dependencies:**
```bash
pip install -r requirements.txt
```
4. **Configure API Keys:**
Create or update the `.env` file in the root directory with your Groq API key:
```
GROQ_API_KEY="YOUR_GROQ_API_KEY_HERE"
```
Similarly, update `groq_config.json` with your Groq API key:
```json
{
"GROQ_API_KEY": "YOUR_GROQ_API_KEY_HERE"
}
```
**Note:** Replace `"YOUR_GROQ_API_KEY_HERE"` with your actual Groq API key. Do not commit your actual API keys to version control.
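For illustration, the following sketch shows one way an application could read the key configured above. It assumes the variables in `.env` have already been loaded into the process environment (for example by a dotenv loader, which this README does not confirm), and it falls back to `groq_config.json` otherwise. The function name `load_groq_api_key` and the lookup order are hypothetical choices, not necessarily what `config.py` does.

```python
import json
import os


def load_groq_api_key(config_path="groq_config.json", env=None):
    """Return the Groq API key from the environment or a JSON config file."""
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key and os.path.exists(config_path):
        # Fall back to the JSON file when the environment has no key.
        with open(config_path, encoding="utf-8") as handle:
            key = str(json.load(handle).get("GROQ_API_KEY", "")).strip()
    # Reject the placeholder so a forgotten edit fails early and clearly.
    if not key or key == "YOUR_GROQ_API_KEY_HERE":
        raise RuntimeError("GROQ_API_KEY is not set")
    return key
```

Failing at startup with a clear message is usually easier to debug than an authentication error from the first API request.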
## Usage
To run the AutoML application:
1. **Activate your virtual environment** (if not already active).
2. **Run the main application file:**
```bash
python app.py
```
How you access the web frontend depends on how `app.py` serves it. If it starts a local web server, the startup output will typically show the address to open in your browser.
## Technologies Used
* **Python:** Core programming language.
* **HTML, CSS, JavaScript:** For the frontend development.
* **Git:** Version control.
* **Groq API:** For AI-powered functionalities (e.g., data assistant).
* **CatBoost:** A gradient boosting library for machine learning (its use is implied by the `catboost_info` directory).
## Future Enhancements (Autonomous System Potential)
The architecture of this project, particularly the RAG-based AI Data Assistant, lays the groundwork for developing more autonomous capabilities. Future enhancements could involve integrating more complex decision-making processes, self-correction mechanisms, and broader task automation, moving towards a more fully autonomous AutoML system.
## Contributing
Contributions are welcome! Please feel free to fork the repository, create a new branch, and submit a pull request for any improvements or bug fixes.
## License
This project is licensed under the MIT License. See the `LICENSE` file, if present, for details.