Update README.md
README.md
CHANGED
@@ -11,4 +11,76 @@ license: apache-2.0
short_description: You can train any simple models
---
# 🧠 Universal CPU AI Trainer ⚙️

Welcome to the Universal CPU AI Trainer! This Hugging Face Space allows you to:

* **Define AI Tasks:** Choose from Tabular Classification, Tabular Regression, or basic Image Classification.
* **Select Model Families:** Experiment with classical Scikit-learn models or simple PyTorch neural networks (MLPs, basic CNNs).
* **Configure Datasets:**
    * Generate synthetic datasets with configurable rows, features, and characteristics.
    * Let the "AI Assistant" (heuristic rules) suggest dataset parameters.
    * Upload your own datasets (CSV, JSON, Parquet).
* **Design Neural Networks:** For PyTorch MLPs, specify hidden layers and get layer suggestions for a target parameter count (10k to 1M).
* **Train Models on CPU:** All training happens on the free CPU tier. Be patient with larger models or datasets!
* **Evaluate & Download:** Get basic evaluation metrics and download your trained models (PKL or ONNX for Scikit-learn; PT for PyTorch).

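To try the upload path with data whose structure you control, you can build a small synthetic classification table locally. The sketch below uses only the Python standard library. The helper names, the `feature_i` and `target` column names, and the labeling rule are illustrative assumptions, not the generator used inside this Space.

```python
import csv
import io
import random


def make_synthetic_rows(n_rows=100, n_features=4, seed=0):
    """Return a header and rows for a toy binary-classification table.

    Hypothetical helper: features are standard-normal draws, and the label
    is 1 when the sum of the (rounded) features is positive. The rule is
    arbitrary and only meant to give a model something learnable.
    """
    rng = random.Random(seed)
    header = [f"feature_{i}" for i in range(n_features)] + ["target"]
    rows = []
    for _ in range(n_rows):
        features = [round(rng.gauss(0.0, 1.0), 6) for _ in range(n_features)]
        label = 1 if sum(features) > 0 else 0
        rows.append(features + [label])
    return header, rows


def rows_to_csv_text(header, rows):
    """Serialize the table to CSV text that could be saved and uploaded."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


header, rows = make_synthetic_rows(n_rows=100, n_features=4, seed=0)
csv_text = rows_to_csv_text(header, rows)
```

Saving `csv_text` to a file (for example `toy.csv`) gives a dataset you could upload, with `target` as the target column name.
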
**⚠️ Important Considerations for CPU Training:**

* **Performance:** This Space runs on a **free CPU tier**. Training complex models (especially neural networks with >100k parameters) or training on large datasets will be **SLOW**. An epoch can take minutes to hours.
* **Memory Limits:** The free tier has limited RAM (~15GB). Very large datasets or models might cause the Space to crash.
* **Toy Examples:** The "Basic Image Classification" task uses randomly generated pixel data, not real images. It demonstrates the structure of the CNN pipeline on CPU.
* **Experimental:** This is a tool for learning and experimentation, not for production-grade model training.

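A rough size check before generating or uploading data can help you stay under the memory limit. The sketch below assumes a dense table stored as 64-bit floats (8 bytes per value) and an illustrative 3x overhead for temporary copies. Both figures are assumptions, not measurements from this Space.

```python
def estimate_dataset_gb(n_rows, n_features, bytes_per_value=8, overhead=3.0):
    """Estimate peak memory in GB (10**9 bytes) for a dense numeric table.

    bytes_per_value=8 assumes float64 storage; overhead is a guessed
    multiplier for copies made while preprocessing and splitting the data.
    """
    raw_bytes = n_rows * n_features * bytes_per_value
    return raw_bytes * overhead / 1e9


small = estimate_dataset_gb(100_000, 50)      # 100k rows x 50 features
large = estimate_dataset_gb(50_000_000, 100)  # 50M rows x 100 features
```

Under these assumptions `small` comes to about 0.12 GB, which fits comfortably, while `large` comes to about 120 GB, far beyond the ~15GB noted above.
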
## How to Use

1. **Tab 1: Define Task & Model**
    * Select your desired **Task Type** (e.g., Tabular Classification).
    * Choose a **Model Family** (Scikit-learn or PyTorch).
    * Select the **Specific Model**.
    * If using PyTorch NNs:
        * Select a **Target Parameter Range** (e.g., "Small (10k-50k)").
        * For MLPs, configure **Hidden Layers** or use the "Suggest MLP Layers" button. Define a dataset in Tab 2 first so the suggestion can use better dimension estimates.

2. **Tab 2: Configure Dataset**
    * Choose to **Generate** a new dataset or **Upload** your own.
    * **Generation:** Specify rows, features, etc., or use the "AI suggest" checkbox.
    * **Upload:** Provide your CSV, JSON, or Parquet file.
    * Enter the **Target Column Name** from your dataset.
    * Click "Generate & Preview Dataset" or let the upload complete.

3. **Tab 3: Train Model & Get Results**
    * Adjust **Training Hyperparameters** (Epochs, Batch Size, Learning Rate; these apply primarily to NNs).
    * Select the desired **Model Output Format**.
    * Click "🚀 Train Model".
    * Monitor the **Training Log**.
    * View **Evaluation Metrics**, **Model Parameters**, and (for PyTorch) a **Loss Curve**.
    * Download your trained model using the **Download Trained Model** button.

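When choosing hidden layers for a target parameter range, it can help to count parameters by hand. In a plain fully connected network, each layer contributes `inputs * outputs` weights plus `outputs` biases. The sketch below is a minimal calculation under that assumption. The function name and example sizes are hypothetical, and the Space's own "Suggest MLP Layers" logic may work differently.

```python
def count_mlp_parameters(n_inputs, hidden_layers, n_outputs):
    """Count weights and biases in a plain fully connected network."""
    sizes = [n_inputs, *hidden_layers, n_outputs]
    total = 0
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total


# Example values: 20 input features, two hidden layers, 3 output classes.
n_params = count_mlp_parameters(20, [128, 64], 3)
```

For these example sizes the count is 11,139, which would fall in a "Small (10k-50k)" range.
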
## Model Output Formats

* **Scikit-learn:**
    * `.pkl`: Python pickle file containing the Scikit-learn pipeline (preprocessor + model).
    * `.onnx`: Open Neural Network Exchange format. The exported ONNX model includes the preprocessing steps and expects raw input matching the original training data structure.
* **PyTorch:**
    * `.pt`: PyTorch file. For MLPs trained on tabular data, this bundles the model's `state_dict` and the Scikit-learn `preprocessor` used. For CNNs, it's typically the `state_dict`.

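As a rough illustration of the `.pkl` format, the sketch below pickles a small Scikit-learn pipeline and restores it for prediction. The step names `preprocessor` and `model`, the choice of `StandardScaler` and `LogisticRegression`, and the toy data are assumptions made for this example; the pipeline this Space actually exports may be structured differently. Unpickling can execute arbitrary code, so only load `.pkl` files from sources you trust.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data standing in for a tabular classification dataset.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Hypothetical pipeline layout: a preprocessing step followed by a model.
pipeline = Pipeline(
    [
        ("preprocessor", StandardScaler()),
        ("model", LogisticRegression(max_iter=1000)),
    ]
)
pipeline.fit(X, y)

# Serialize to bytes (what a .pkl file would contain) and restore.
blob = pickle.dumps(pipeline)
restored = pickle.loads(blob)

# The restored pipeline applies the same preprocessing before predicting.
predictions = restored.predict(X[:5])
```

Because the preprocessing step travels inside the pickled object, the restored pipeline accepts raw feature rows directly. Loading generally works most reliably with the same Scikit-learn version that produced the file.
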
## Want More Power? Clone & Upgrade!

If training is too slow or you hit resource limits:

1. Go to this Space's main page.
2. Click the three dots (⋮) menu and select "Duplicate this Space."
3. On the creation page, choose upgraded **Space Hardware** (e.g., a better CPU or a GPU; these are paid options).
4. Create your new, more powerful Space! (You'll likely need to re-upload or re-generate your data.)

## Development & Contributions

This Space is built with Python, Gradio, Scikit-learn, and PyTorch.

* **Main Application Logic:** `app.py`
* **Dependencies:** `requirements.txt`

Feel free to explore the code, suggest improvements, or report issues!

## License

This project is licensed under the **Apache License 2.0**. See the `LICENSE` file for details.