Auto-FineTune-Ops
Autonomous End-to-End LLM Fine-Tuning Pipeline
From raw data to production API in one click. No ML expertise required.
What Is This?
Auto-FineTune-Ops is a no-code/low-code platform that automates the entire lifecycle of fine-tuning Large Language Models (LLMs). It handles:
- Data Ingestion: Upload CSV, JSON, or JSONL files.
- Advanced Preprocessing: 10+ modules for cleaning, PII redaction, deduplication, and formatting.
- Hybrid Training: Train locally on GPU (Unsloth/LoRA) or generate a Google Colab Notebook for free cloud GPU training.
- AI Judge Evaluation: Compare your fine-tuned model against the base model using GPT-4, Claude 3.5, Gemini, or Groq as a judge.
- One-Click Deployment: Export your trained model as a production-ready FastAPI endpoint.
All accessible via a premium, easy-to-use Streamlit Dashboard.
Key Features
Intelligent Preprocessing
- Text Cleaning: Remove HTML, URLs, emojis, normalize whitespace.
- PII Filter: Redact emails, phone numbers, API keys.
- Deduplication: Remove exact and semantic (TF-IDF) duplicates.
- Quality Filters: Filter by length, language, toxicity.
- Balancing: Oversample/undersample classes for classification tasks.
- Export Formats: Auto-convert to OpenAI Chat, Completion, or Classification JSONL formats.
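The cleaning and PII steps listed above can be sketched with a few regular expressions. This is a minimal illustration, not the code in `preprocessing/`: the function names, the placeholder tokens, and every pattern (including the assumed `sk-...` API key shape) are hypothetical, and real PII detection needs much broader coverage.

```python
import html
import re

# Illustrative patterns only; these are assumptions, not the project's rules.
HTML_TAG = re.compile(r"<[^>]+>")
URL = re.compile(r"https?://\S+")
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
API_KEY = re.compile(r"\bsk-[A-Za-z0-9]{20,}\b")  # assumed "sk-..." key shape


def clean_text(text: str) -> str:
    """Strip HTML tags and URLs, then collapse runs of whitespace."""
    text = html.unescape(HTML_TAG.sub(" ", text))
    text = URL.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def redact_pii(text: str) -> str:
    """Replace emails, API keys, and phone numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    text = API_KEY.sub("[API_KEY]", text)  # before PHONE, so digit runs in keys stay intact
    return PHONE.sub("[PHONE]", text)


sample = "<p>Contact  jane@example.com or +1 (555) 123-4567</p> https://example.com"
print(redact_pii(clean_text(sample)))
```

Cleaning runs before redaction here so that the patterns see plain text rather than markup.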
Flexible Training Workflows
- Local GPU: Uses Unsloth for 4-bit LoRA fine-tuning (reported as up to 2x faster with up to 70% less memory).
- Google Colab Fallback: No GPU? The app generates a ready-to-run Colab notebook. After training, upload the model back to the app for evaluation.
- Custom Models: Fine-tune any HuggingFace model (Llama 3, Mistral, Gemma, Phi-3, etc.).
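To see why LoRA keeps training cheap, it helps to count parameters. For a frozen weight matrix of shape `d_out x d_in`, LoRA trains two low-rank factors with `rank * (d_in + d_out)` parameters in total. The sketch below works through that arithmetic; the dimensions and rank are illustrative assumptions, not values taken from this project, and the additional savings from 4-bit quantization of the frozen weights are not modeled.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the dense weight matrix that LoRA leaves frozen."""
    return d_in * d_out


# Illustrative dimensions: one 4096 x 4096 projection with rank 16.
d_in, d_out, rank = 4096, 4096, 16
lora = lora_trainable_params(d_in, d_out, rank)
full = full_params(d_in, d_out)
print(f"LoRA: {lora:,} trainable vs. {full:,} frozen ({lora / full:.2%})")
```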
Multi-Provider AI Judge
Evaluate models head-to-head using:
- OpenAI (GPT-4o, GPT-4-turbo)
- Anthropic (Claude 3.5 Sonnet, Claude 3 Opus)
- Google (Gemini 1.5 Pro)
- Groq (Llama 3, Mixtral)
- Custom Endpoints (Ollama, vLLM)
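One way to support several judge providers is to keep the prompt and the verdict parsing provider-agnostic and pass in a function that talks to the chosen backend. The sketch below assumes that design; `JUDGE_TEMPLATE`, `judge_pair`, `parse_verdict`, and `fake_judge` are hypothetical names, and the actual logic in `agents/the_judge.py` may differ.

```python
from typing import Callable

JUDGE_TEMPLATE = """You are an impartial judge. Compare two answers to the same instruction.

Instruction: {instruction}

Answer A: {answer_a}

Answer B: {answer_b}

Reply with exactly one letter: A, B, or T for a tie."""


def parse_verdict(reply: str) -> str:
    """Map a raw judge reply to 'A', 'B', or 'tie' (the fallback for anything else)."""
    token = reply.strip().upper()
    return token if token in ("A", "B") else "tie"


def judge_pair(instruction: str, answer_a: str, answer_b: str,
               ask_judge: Callable[[str], str]) -> str:
    """Build the prompt, call the provider-specific function, and parse its reply."""
    prompt = JUDGE_TEMPLATE.format(
        instruction=instruction, answer_a=answer_a, answer_b=answer_b
    )
    return parse_verdict(ask_judge(prompt))


# Stand-in judge that always prefers B; a real one would call a provider API.
def fake_judge(prompt: str) -> str:
    return "B"


print(judge_pair("Define LoRA.", "base answer", "fine-tuned answer", fake_judge))
```

Swapping providers then means swapping only the `ask_judge` callable. Judging each pair twice with the answer order reversed is a common guard against position bias.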
Quick Start
1. Installation
# Clone the repository
git clone https://github.com/your-username/Auto-FineTune-Ops.git
cd Auto-FineTune-Ops
# Create a virtual environment
python -m venv venv
# Windows:
.\venv\Scripts\activate
# Mac/Linux:
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
2. Launch the Dashboard
streamlit run app.py
Open your browser to the URL shown (usually http://localhost:8501).
Workflow Guide
Step 1: Data Upload
- Upload your raw CSV or JSON file containing instruction-response pairs.
- The app automatically detects columns like instruction, input, output.
- Preview the full dataset with pagination.
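Column detection of this kind can be approximated by matching header names against a small alias table. The following is a rough sketch using only the standard library; the alias lists and the function names `load_records` and `detect_columns` are assumptions for illustration, since the app's actual detection rules are not documented here.

```python
import csv
import io
import json

# Assumed aliases for each role; not taken from the project.
ALIASES = {
    "instruction": ("instruction", "prompt", "question"),
    "input": ("input", "context"),
    "output": ("output", "response", "answer", "completion"),
}


def load_records(text: str, fmt: str) -> list[dict]:
    """Parse CSV, JSON (a list of objects), or JSONL text into a list of dicts."""
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "json":
        return json.loads(text)
    if fmt == "jsonl":
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    raise ValueError(f"unsupported format: {fmt}")


def detect_columns(records: list[dict]) -> dict[str, str]:
    """Map each role to the first column whose lowercased name matches an alias."""
    columns = {name.strip().lower(): name for name in records[0]} if records else {}
    mapping = {}
    for role, names in ALIASES.items():
        for alias in names:
            if alias in columns:
                mapping[role] = columns[alias]
                break
    return mapping


csv_text = "Instruction,Response\nTranslate 'hello' to French,bonjour\n"
rows = load_records(csv_text, "csv")
print(detect_columns(rows))
```

Roles with no matching column (here, `input`) are simply left out of the mapping, so the caller can treat them as optional.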
Step 2: Preprocessing
- Configure cleaning rules (HTML removal, lowercase, etc.).
- Set PII filters (mask emails/phones).
- Enable semantic deduplication.
- Click Run Pipeline to clean and format your data.
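The deduplication step can be illustrated with a small, dependency-free TF-IDF implementation. This is a sketch under stated assumptions, not the contents of `preprocessing/deduplication.py`: the function names and the 0.9 cosine threshold are hypothetical. Note that TF-IDF similarity catches near-duplicates that share vocabulary; it will not catch paraphrases that use different words.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    """Build L2-normalized TF-IDF vectors with smoothed IDF."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vec = {t: c * (math.log((1 + n) / (1 + doc_freq[t])) + 1) for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Dot product of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def deduplicate(docs: list[str], threshold: float = 0.9) -> list[str]:
    """Keep the first copy of each exact or near-duplicate document."""
    unique = list(dict.fromkeys(docs))  # drops exact duplicates, preserves order
    vectors = tfidf_vectors(unique)
    kept: list[int] = []
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) < threshold for j in kept):
            kept.append(i)
    return [unique[i] for i in kept]


docs = [
    "How do I reset my password?",
    "How do I reset my password?",
    "how do i reset my password",
    "What is the refund policy?",
]
print(deduplicate(docs))
```

The pairwise comparison is quadratic in the number of kept documents, which is fine for small datasets; larger ones would need an approximate method.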
Step 3: Training
- If you have a GPU: Select a base model (e.g., Llama-3-8b) and click Start Training.
- If you have no GPU:
  - Download the preprocessed data.
  - Download the generated Colab Notebook.
  - Run training on Google Colab (Free Tier).
  - Upload the fine-tuned model results back to the app.
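The choice between the two training paths can be automated with a simple availability check. The sketch below uses the presence of the `nvidia-smi` binary as a heuristic; this is an assumption for illustration, not how the app decides, and both function names are hypothetical. If PyTorch is installed, `torch.cuda.is_available()` is a more direct test.

```python
import shutil


def has_nvidia_gpu() -> bool:
    """Heuristic: treat a visible nvidia-smi binary as evidence of an NVIDIA GPU."""
    return shutil.which("nvidia-smi") is not None


def choose_workflow(gpu_available: bool) -> str:
    """Return 'local' for on-machine training, otherwise 'colab' for the notebook path."""
    return "local" if gpu_available else "colab"


print(choose_workflow(has_nvidia_gpu()))
```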
Step 4: Evaluation
- Compare your fine-tuned model vs. the base model.
- Select an AI Judge (e.g., GPT-4o).
- Visualize win rates and quality scores (Accuracy, Helpfulness, Tone).
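Win rates and per-dimension scores are simple aggregates over per-example judge results. A minimal sketch, assuming each example yields a verdict label and a dictionary of numeric scores; the label names, the score scale, and the function names are illustrative assumptions rather than the app's actual schema.

```python
from collections import Counter


def win_rates(verdicts: list[str]) -> dict[str, float]:
    """Turn per-example verdicts ('fine_tuned', 'base', 'tie') into fractions."""
    labels = ("fine_tuned", "base", "tie")
    total = len(verdicts)
    if total == 0:
        return {label: 0.0 for label in labels}
    counts = Counter(verdicts)
    return {label: counts[label] / total for label in labels}


def mean_scores(scores: list[dict[str, float]]) -> dict[str, float]:
    """Average each quality dimension across examples."""
    if not scores:
        return {}
    return {key: sum(s[key] for s in scores) / len(scores) for key in scores[0]}


verdicts = ["fine_tuned", "fine_tuned", "base", "tie"]
scores = [
    {"accuracy": 4, "helpfulness": 5, "tone": 3},
    {"accuracy": 2, "helpfulness": 3, "tone": 5},
]
print(win_rates(verdicts))
print(mean_scores(scores))
```

Reporting ties separately, rather than splitting them between the two models, keeps the win rate from overstating small differences.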
Step 5: Deployment
- Deploy your model locally as a REST API:
  python scripts/deploy.py --model ./output/models/your_model --port 8000
- Or push to HuggingFace Hub directly from the dashboard.
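The internals of `scripts/deploy.py` are not shown here. As a rough sketch of a command-line interface that accepts the same two flags, the following uses `argparse`; the default port, the help strings, and the `validate` check are assumptions added for illustration.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Parser accepting the --model and --port flags used in the command above."""
    parser = argparse.ArgumentParser(description="Serve a fine-tuned model over HTTP.")
    parser.add_argument("--model", type=Path, required=True,
                        help="directory containing the exported model")
    parser.add_argument("--port", type=int, default=8000,  # assumed default
                        help="TCP port to listen on")
    return parser


def validate(args: argparse.Namespace) -> None:
    """Fail early on an out-of-range port (hypothetical check)."""
    if not 1 <= args.port <= 65535:
        raise SystemExit(f"invalid port: {args.port}")


args = build_parser().parse_args(["--model", "./output/models/your_model", "--port", "8000"])
validate(args)
print(args.model, args.port)
```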
Project Structure
ml_oops/
├── app.py                  # Main Streamlit Dashboard
├── main.py                 # CLI Orchestrator (Headless mode)
├── requirements.txt        # Dependencies
├── agents/                 # Core Logic Agents
│   ├── data_architect.py   # Data Analysis & Cleaning
│   ├── training_pilot.py   # Fine-Tuning Logic
│   └── the_judge.py        # Evaluation Logic
├── preprocessing/          # Advanced Preprocessing Modules
│   ├── text_cleaning.py    # Regex & Normalization
│   ├── pii_filter.py       # PII Redaction
│   ├── deduplication.py    # Semantic Dedupe
│   └── ...
├── configs/                # Configuration Files
└── output/                 # Artifacts (Models, Logs, Reports)
Live App
https://aneebnaqvi15-auto-finetune-ops-app-1xmv11.streamlit.app/
Contributing
Contributions are welcome! Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Built for modern ML teams.
Replace weeks of manual engineering with minutes of automated ops.