Commit 661c21e (parent: 39e56b0)
Add comprehensive README.md

README.md (added)
@@ -0,0 +1,92 @@
# AutoML Project

## Overview

This project is an Automated Machine Learning (AutoML) platform that streamlines the machine learning workflow from data preparation to model deployment. It combines automated data cleaning, supervised and unsupervised model training, an AI-powered SQL assistant, and an interactive web frontend for exploring and visualizing results.

## Features

* **Automated Data Cleaning:** Utilities that preprocess and clean raw datasets so the data is fit for model training.
* **Supervised Learning Models:** Implementations of several supervised machine learning algorithms.
* **Unsupervised Learning Models:** Support for unsupervised techniques such as clustering and dimensionality reduction.
* **AI SQL Assistant:** A Retrieval-Augmented Generation (RAG) based assistant that helps with SQL queries and database interactions.
* **Interactive Web Frontend:** A web interface built with HTML, CSS, and JavaScript for using the AutoML functionality and visualizing results.
* **Data Visualization:** Tools that generate charts and graphs from processed data and model outputs.
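
As a rough illustration of what an automated data cleaning step might involve, the sketch below drops duplicate rows and fills missing numeric values with the column median. It uses only the Python standard library. The function name `clean_rows` and the list-of-dicts layout are assumptions made for this example, not the project's actual utilities.

```python
from statistics import median


def clean_rows(rows):
    """Drop exact duplicate rows and fill missing numeric values with the median.

    `rows` is a list of dicts sharing the same keys; None marks a missing value.
    Hypothetical helper, for illustration only.
    """
    # 1. Remove exact duplicates while preserving the original order.
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))  # copy so the input is not mutated

    # 2. Fill missing values in numeric columns with that column's median.
    columns = list(unique[0].keys()) if unique else []
    for col in columns:
        observed = [r[col] for r in unique if isinstance(r[col], (int, float))]
        if not observed:
            continue  # non-numeric or entirely missing column: leave untouched
        fill = median(observed)
        for r in unique:
            if r[col] is None:
                r[col] = fill
    return unique
```

Real datasets usually need more than this (type coercion, outlier handling, categorical encoding), so treat it as a starting point rather than a description of the shipped pipeline.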

## Project Structure

The project is organized into the following main files and directories:

* `.env`: Environment variables, including API keys.
* `app.py`: The main application entry point.
* `config.py`: Configuration settings for the application.
* `frontend/`: Static files for the web-based user interface (HTML, CSS, JavaScript, images).
* `models/`: Implementations of the supervised and unsupervised machine learning models.
* `rag/`: Modules for the Retrieval-Augmented Generation (RAG) system, including memory management and query processing.
* `utils/`: Utility functions for data cleaning, metrics calculation, and other common tasks.
* `visuals/`: Modules for data visualization and chart generation.
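
To make the retrieve-then-generate idea behind the `rag/` modules more concrete, here is a minimal sketch of the retrieval half: it ranks schema snippets by word overlap with the question and assembles a prompt for a language model. The names `retrieve` and `build_prompt`, the overlap scoring, and the prompt layout are all assumptions for illustration. The project's actual retriever and prompt format are not described in this README.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(question, documents, k=2):
    """Return up to k documents that share the most words with the question."""
    q_tokens = tokenize(question)
    scored = [(len(q_tokens & tokenize(doc)), doc) for doc in documents]
    # Highest overlap first; sorted() is stable, so ties keep their input order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:k] if score > 0]


def build_prompt(question, documents, k=2):
    """Combine the retrieved snippets and the question into one prompt string."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Schema context:\n{context}\n\nQuestion: {question}\nSQL:"
```

A production retriever would more likely use embeddings than raw word overlap; the simple scoring here only keeps the example dependency-free.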

## Installation

To set up the project locally, follow these steps:

1. **Clone the repository:**
   ```bash
   git clone https://github.com/Al1Abdullah/AutoML.git
   cd AutoML
   ```

2. **Create a virtual environment (recommended):**
   ```bash
   python -m venv venv
   # On Windows
   .\venv\Scripts\activate
   # On macOS/Linux
   source venv/bin/activate
   ```

3. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```

4. **Configure API Keys:**
   Create or update the `.env` file in the root directory with your Groq API key:
   ```
   GROQ_API_KEY="YOUR_GROQ_API_KEY_HERE"
   ```
   Similarly, update `groq_config.json` with your Groq API key:
   ```json
   {
       "GROQ_API_KEY": "YOUR_GROQ_API_KEY_HERE"
   }
   ```
   **Note:** Replace `"YOUR_GROQ_API_KEY_HERE"` with your actual Groq API key. Do not commit your actual API keys to version control.
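
One way an application could pick up the key configured above is to check the `GROQ_API_KEY` environment variable first and fall back to `groq_config.json`. The sketch below uses only the standard library. The function name `load_groq_api_key` and the lookup order are assumptions, and the sketch does not parse the `.env` file itself, so it presumes the values in `.env` have already been exported into the environment. How `config.py` actually reads the key is not shown in this README.

```python
import json
import os
from pathlib import Path

PLACEHOLDER = "YOUR_GROQ_API_KEY_HERE"


def load_groq_api_key(config_path="groq_config.json"):
    """Return the Groq API key from the environment or a JSON config file.

    Hypothetical helper: raises RuntimeError if no real key is found.
    """
    # Prefer the environment variable (for example, exported from .env).
    key = os.environ.get("GROQ_API_KEY", "").strip()

    # Fall back to the JSON file if the variable is unset or still a placeholder.
    if not key or key == PLACEHOLDER:
        path = Path(config_path)
        if path.is_file():
            with path.open(encoding="utf-8") as fh:
                key = str(json.load(fh).get("GROQ_API_KEY", "")).strip()

    if not key or key == PLACEHOLDER:
        raise RuntimeError("GROQ_API_KEY is not configured")
    return key
```

Failing early with a clear error when the placeholder is still in place tends to be easier to debug than an authentication failure on the first API call.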

## Usage

To run the AutoML application:

1. **Activate your virtual environment** (if not already active).
2. **Run the main application file:**
   ```bash
   python app.py
   ```
   This starts the Gradio application. Open the URL printed in your terminal (usually `http://127.0.0.1:7860` or similar) to access the frontend.
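
If the browser cannot reach the app, a quick check that something is listening on the expected port can help narrow things down. The sketch below uses the standard library `socket` module. The helper name `is_port_open` is hypothetical, and port 7860 is only the typical address mentioned above, so substitute whatever URL your terminal actually prints.

```python
import socket


def is_port_open(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("App reachable:", is_port_open())
```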

## Technologies Used

* **Python:** Core programming language.
* **Gradio:** For building the interactive web interface.
* **HTML, CSS, JavaScript:** For frontend development.
* **Git:** Version control.
* **Groq API:** For AI-powered functionality (e.g., the SQL assistant).
* **CatBoost:** A gradient-boosting machine learning library (its use is implied by the presence of `catboost_info`).

## Contributing

Contributions are welcome! Fork the repository, create a new branch, and submit a pull request with your improvements or bug fixes.

## License

This project is licensed under the MIT License. See the `LICENSE` file, if present, for details.