	---
	title: AutoML
	sdk: docker
	emoji: 🚀
	colorFrom: red
	colorTo: purple
	---
	# AutoML Project

	## Overview

	This project is an Automated Machine Learning (AutoML) platform that streamlines the machine learning workflow for CSV datasets. It is aimed at students and researchers who need a fast way to demo their data science and AI projects. It combines automated data cleaning, supervised and unsupervised model training, an AI-powered data assistant, and an interactive web frontend for exploring and visualizing results.

	## Features

	* Automated Data Cleaning: Utilities to preprocess and clean raw datasets, ensuring data quality for model training.
	* Supervised Learning Models: Implementation and integration of various supervised machine learning algorithms.
	* Unsupervised Learning Models: Support for unsupervised learning techniques for tasks like clustering and dimensionality reduction.
	* AI Data Assistant (Agentic Capability): A Retrieval-Augmented Generation (RAG) assistant that helps users explore and understand their CSV datasets. It interprets natural language queries, retrieves relevant information from the dataset, and assists with data exploration and analysis.
	* Interactive Web Frontend: A user-friendly web interface built with HTML, CSS, and JavaScript for interacting with the AutoML functionalities and visualizing results.
	* Data Visualization: Tools to generate insightful charts and graphs from processed data and model outputs.
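
To make the data-cleaning feature concrete, here is a minimal sketch of one common cleaning pass on CSV data: dropping exact duplicate rows and filling blank numeric cells with the column median. It uses only the Python standard library. The function name `clean_rows` and the cleaning rules are assumptions for illustration, not the project's actual implementation in `utils/`.

```python
import csv
import io
import statistics


def clean_rows(rows):
    """Drop exact duplicate rows, then fill blank numeric cells with the column median.

    `rows` is a list of dicts as produced by csv.DictReader.
    Hypothetical helper for illustration only.
    """
    # Remove exact duplicates while preserving the original order.
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row.items())
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))

    if not unique:
        return unique

    for column in unique[0]:
        # Collect the non-blank values of this column.
        values = [r[column].strip() for r in unique if r[column] and r[column].strip()]
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            continue  # Non-numeric column: leave blanks untouched.
        if not numbers:
            continue
        median = statistics.median(numbers)
        for r in unique:
            if not r[column] or not r[column].strip():
                r[column] = str(median)
    return unique


# Small in-memory CSV with one duplicate row and one missing age.
sample = "age,city\n30,Paris\n,London\n30,Paris\n50,Rome\n"
cleaned = clean_rows(list(csv.DictReader(io.StringIO(sample))))
print(cleaned)
```

A real pipeline would likely rely on a dataframe library and handle more cases (type inference, outliers, categorical encoding); the sketch only shows the general shape of such a step.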

	## Project Structure

	The project is organized into the following main files and directories:

	* `.env`: Environment variables, including API keys.
	* `app.py`: The main application entry point.
	* `config.py`: Configuration settings for the application.
	* `frontend/`: Contains the static files for the web-based user interface (HTML, CSS, JavaScript, images).
	* `models/`: Houses the implementations for supervised and unsupervised machine learning models.
	* `rag/`: Contains modules related to the Retrieval Augmented Generation (RAG) system, including memory management and query processing.
	* `utils/`: Utility functions for data cleaning, metrics calculation, and other common tasks.
	* `visuals/`: Modules dedicated to data visualization and chart generation.
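
As a rough illustration of the retrieval step that a RAG module such as `rag/` performs, the sketch below ranks CSV rows by word overlap with a natural language query and assembles the top matches into a prompt context. The scoring scheme and the names `tokenize`, `retrieve_rows`, and `build_prompt` are assumptions chosen for brevity; the project's actual retrieval and memory logic may differ, and the call to the language model is omitted.

```python
import csv
import io
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_rows(rows, query, top_k=2):
    """Return up to top_k rows sharing the most tokens with the query.

    Hypothetical scoring: simple token overlap, no embeddings.
    """
    query_tokens = tokenize(query)
    scored = []
    for row in rows:
        # Include column names so queries can match on headers too.
        row_text = " ".join(f"{k} {v}" for k, v in row.items())
        score = len(query_tokens & tokenize(row_text))
        if score > 0:
            scored.append((score, row))
    # Stable sort: rows with equal scores keep their file order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored[:top_k]]


def build_prompt(rows, query):
    """Combine the retrieved rows and the question into one prompt string."""
    context = "\n".join(str(row) for row in rows)
    return f"Context rows:\n{context}\n\nQuestion: {query}"


sample = "name,city,sales\nAlice,Paris,120\nBob,London,95\nCara,Paris,80\n"
rows = list(csv.DictReader(io.StringIO(sample)))
question = "What are the sales in Paris?"
top = retrieve_rows(rows, question)
print(build_prompt(top, question))
```

Token overlap keeps the example dependency-free; a production assistant would more plausibly use embeddings or a vector index for retrieval.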

	## Installation

	To set up the project locally, follow these steps:

	1. Clone the repository:
	```bash
	git clone https://github.com/Al1Abdullah/AutoML.git
	cd AutoML
	```

	2. Create a virtual environment (recommended):
	```bash
	python -m venv venv
	# On Windows
	.\venv\Scripts\activate
	# On macOS/Linux
	source venv/bin/activate
	```

	3. Install dependencies:
	```bash
	pip install -r requirements.txt
	```

	4. Configure API Keys:
	Create or update the `.env` file in the root directory with your Groq API key:
	```
	GROQ_API_KEY="YOUR_GROQ_API_KEY_HERE"
	```
	Similarly, update `groq_config.json` with your Groq API key:
	```json
	{
	"GROQ_API_KEY": "YOUR_GROQ_API_KEY_HERE"
	}
	```
	Note: Replace `"YOUR_GROQ_API_KEY_HERE"` with your actual Groq API key. Do not commit your actual API keys to version control.
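
One way an application might read the key at startup is sketched below: check the `GROQ_API_KEY` environment variable first, then fall back to `groq_config.json`. The function name `load_groq_api_key` and the lookup order are assumptions; how `app.py` and `config.py` actually load the key is not documented here. Note that Python does not read a `.env` file automatically, so a loader such as python-dotenv is typically needed to export its values into the environment.

```python
import json
import os
from pathlib import Path

# Placeholder value from the configuration examples; treated as "not set".
PLACEHOLDER = "YOUR_GROQ_API_KEY_HERE"


def load_groq_api_key(config_path="groq_config.json"):
    """Return the Groq API key from the environment or a JSON config file.

    Hypothetical helper: the lookup order is an assumption, not the project's code.
    """
    key = os.environ.get("GROQ_API_KEY")
    if not key:
        path = Path(config_path)
        if path.is_file():
            with path.open(encoding="utf-8") as handle:
                key = json.load(handle).get("GROQ_API_KEY")
    if not key or key == PLACEHOLDER:
        raise RuntimeError("GROQ_API_KEY is not set or still holds the placeholder value.")
    return key
```

Failing fast on the placeholder value surfaces a missing key at startup rather than on the first API call.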

	## Usage

	To run the AutoML application:

	1. Activate your virtual environment (if not already active).
	2. Run the main application file:
	```bash
	python app.py
	```
	How you reach the web frontend depends on how `app.py` serves it. If it starts a local web server, the address is typically printed to the console on startup; open that address in your browser.

	## Technologies Used

	* Python: Core programming language.
	* HTML, CSS, JavaScript: For the frontend development.
	* Git: Version control.
	* Groq API: For AI-powered functionalities (e.g., data assistant).
	* CatBoost: Gradient boosting library for machine learning (its use is implied by the `catboost_info` directory).

	## Future Enhancements (Autonomous System Potential)

	The architecture of this project, particularly the RAG-based AI Data Assistant, lays the groundwork for developing more autonomous capabilities. Future enhancements could involve integrating more complex decision-making processes, self-correction mechanisms, and broader task automation, moving towards a more fully autonomous AutoML system.

	## Contributing

	Contributions are welcome! Please feel free to fork the repository, create a new branch, and submit a pull request for any improvements or bug fixes.

	## License

	This project is licensed under the MIT License. See the `LICENSE` file, if present, for details.