---
title: TrainAI
emoji: 👁
colorFrom: pink
colorTo: yellow
sdk: gradio
sdk_version: 5.31.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: You can train any simple models
---

# 🧠 Universal CPU AI Trainer ⚙️

Welcome to the Universal CPU AI Trainer! This Hugging Face Space allows you to:

* **Define AI Tasks:** Choose from Tabular Classification, Tabular Regression, or basic Image Classification.
* **Select Model Families:** Experiment with classical Scikit-learn models or simple PyTorch neural networks (MLPs, basic CNNs).
* **Configure Datasets:**
    * Generate synthetic datasets with configurable rows, features, and characteristics.
    * Let the "AI Assistant" (heuristic rules) suggest dataset parameters.
    * Upload your own datasets (CSV, JSON, Parquet).
* **Design Neural Networks:** For PyTorch MLPs, specify hidden layers and get layer suggestions for a target parameter count (10k to 1M).
* **Train Models on CPU:** All training happens on the free CPU tier. Be patient with larger models or datasets!
* **Evaluate & Download:** Get basic evaluation metrics and download your trained models (PKL, ONNX for Scikit-learn; PT for PyTorch).

**⚠️ Important Considerations for CPU Training:**

* **Performance:** This Space runs on a **free CPU tier**. Training complex models (especially neural networks with >100k parameters) or training on large datasets will be **SLOW**. An epoch can take minutes to hours.
* **Memory Limits:** The free tier has limited RAM (~15 GB). Very large datasets or models might cause the Space to crash.
* **Toy Examples:** The "Basic Image Classification" task uses randomly generated pixel data, not real images. It exists to demonstrate the CNN pipeline structure on CPU.
* **Experimental:** This is a tool for learning and experimentation, not for production-grade model training.
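
To judge whether an MLP lands above the >100k-parameter range mentioned in the performance note, you can count its weights and biases by hand. The sketch below is plain Python and is not taken from `app.py`; `count_mlp_parameters` is a hypothetical helper name, and it assumes a fully connected network with one bias per output unit.

```python
def count_mlp_parameters(n_inputs, hidden_layers, n_outputs):
    """Count the weights and biases of a fully connected network."""
    sizes = [n_inputs, *hidden_layers, n_outputs]
    total = 0
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        # One weight per input-output pair, plus one bias per output unit.
        total += fan_in * fan_out + fan_out
    return total


# 20 input features, 2 output classes, two different hidden-layer layouts.
print(count_mlp_parameters(20, [256, 256], 2))  # 71682
print(count_mlp_parameters(20, [512, 256], 2))  # 142594
```

Widening only the first hidden layer from 256 to 512 units roughly doubles the parameter count here, because the 512 x 256 weight matrix between the two hidden layers dominates the total.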

## How to Use

1. **Tab 1: Define Task & Model**
    * Select your desired **Task Type** (e.g., Tabular Classification).
    * Choose a **Model Family** (Scikit-learn or PyTorch).
    * Select the **Specific Model**.
    * If using PyTorch NNs:
        * Select a **Target Parameter Range** (e.g., "Small (10k-50k)").
        * For MLPs, configure **Hidden Layers** or use the "Suggest MLP Layers" button. The suggestions use better dimension estimates once a dataset has been defined in Tab 2.

2. **Tab 2: Configure Dataset**
    * Choose to **Generate** a new dataset or **Upload** your own.
        * **Generation:** Specify rows, features, etc., or use the "AI suggest" checkbox.
        * **Upload:** Provide your CSV, JSON, or Parquet file.
    * Enter the **Target Column Name** from your dataset.
    * Click "Generate & Preview Dataset" or let the upload complete.

3. **Tab 3: Train Model & Get Results**
    * Adjust **Training Hyperparameters** (Epochs, Batch Size, Learning Rate; these apply primarily to NNs).
    * Select the desired **Model Output Format**.
    * Click "🚀 Train Model".
    * Monitor the **Training Log**.
    * View **Evaluation Metrics**, **Model Parameters**, and (for PyTorch) a **Loss Curve**.
    * Download your trained model using the **Download Trained Model** button.
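
If you want to check a file locally before uploading it in Tab 2, the upload step can be approximated with pandas. This is a minimal sketch under assumptions, not the code in `app.py`: `load_dataset` and `split_target` are hypothetical helper names, and the extension-to-reader mapping is a guess at how the three supported formats might be handled.

```python
from pathlib import Path

import pandas as pd

# Map each supported extension to the pandas reader that handles it.
READERS = {
    ".csv": pd.read_csv,
    ".json": pd.read_json,
    ".parquet": pd.read_parquet,  # needs pyarrow or fastparquet installed
}


def load_dataset(path):
    """Read a CSV, JSON, or Parquet file into a DataFrame."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file type: {suffix!r}")
    return READERS[suffix](path)


def split_target(df, target_column):
    """Separate the feature columns from the target column."""
    if target_column not in df.columns:
        raise KeyError(f"Target column {target_column!r} not found in dataset")
    return df.drop(columns=[target_column]), df[target_column]
```

Running `split_target(load_dataset("my_data.csv"), "label")` on your own file (both names are placeholders) is a quick way to confirm that the target column name you plan to enter actually exists in the data.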

## Model Output Formats

* **Scikit-learn:**
    * `.pkl`: Python pickle file containing the Scikit-learn pipeline (preprocessor + model).
    * `.onnx`: Open Neural Network Exchange format. The exported ONNX model includes the preprocessing steps and expects raw input matching the original training data structure.
* **PyTorch:**
    * `.pt`: PyTorch file. For MLPs trained on tabular data, this bundles the model's `state_dict` and the Scikit-learn `preprocessor` used. For CNNs, it is typically just the `state_dict`.
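
Because the `.pkl` download is described as a pickled Scikit-learn pipeline, it should be loadable with Python's standard `pickle` module. The sketch below illustrates the round trip with a stand-in pipeline built locally, since the exact steps and step names the Space uses are not documented here; the `"preprocessor"` and `"model"` step names, the scaler, and the classifier are all assumptions.

```python
import pickle

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for a model trained in the Space: preprocessor + classifier.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
pipeline = Pipeline([
    ("preprocessor", StandardScaler()),  # assumed step name
    ("model", LogisticRegression()),     # assumed step name
])
pipeline.fit(X, y)

# Serialize the whole pipeline, then restore it from the bytes.
blob = pickle.dumps(pipeline)
restored = pickle.loads(blob)

# The restored pipeline takes raw (unscaled) features and predicts identically.
assert (restored.predict(X) == pipeline.predict(X)).all()
```

With a real download you would read the file instead, for example `pickle.load(open("model.pkl", "rb"))`, where the filename is a placeholder. Only unpickle files you trust, and use a Scikit-learn version close to the one the model was trained with.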

## Want More Power? Clone & Upgrade!

If training is too slow or you hit resource limits:

1. Go to this Space's main page.
2. Click the three-dot (⋮) menu and select "Duplicate this Space".
3. On the creation page, choose upgraded **Space Hardware** (e.g., a better CPU or a GPU; these are paid options).
4. Create your new, more powerful Space! (You'll likely need to re-upload or re-generate your data.)

## Development & Contributions

This Space is built with Python, Gradio, Scikit-learn, and PyTorch.

* **Main Application Logic:** `app.py`
* **Dependencies:** `requirements.txt`

Feel free to explore the code, suggest improvements, or report issues!

## License

This project is licensed under the **Apache License 2.0**. See the `LICENSE` file for details.