---
license: mit
language:
- en
pipeline_tag: tabular-classification
tags:
- sklearn
- classification
- iris
- tabular
datasets:
- brjapon/iris
metrics:
- accuracy
library_name: scikit-learn
new_version: "v1.0"
model-index:
- name: Iris Decision Tree
  results:
  - task:
      type: tabular-classification
      name: Classification
    metrics:
    - type: accuracy
      value: 0.97
      name: Test Accuracy
---

# Iris Classification Models

This repository starts with a **Decision Tree** model trained on the classic **Iris dataset**. The model classifies iris flowers into one of three species (*setosa*, *versicolor*, or *virginica*) based on four numeric features: sepal length, sepal width, petal length, and petal width.

Because of its small size and simplicity, this model is intended primarily for **demonstration and educational** purposes.

## Model Description
- **Framework**: [Scikit-Learn](https://scikit-learn.org/stable/)
- **Algorithm**: Decision Tree (`DecisionTreeClassifier` class)
- **Hyperparameters**:
  - Defaults for Decision Tree in Scikit-Learn

### Intended Uses
- **Education/Proof-of-Concept**: Demonstrates loading a scikit-learn model from the Hugging Face Hub.
- **Beginner ML Tutorials**: Introduction to classification tasks, usage of Hugging Face model hosting, and deploying simple demos in Spaces.

### Limitations
- **Dataset Size**: The Iris dataset is small (150 samples). Performance metrics may not extrapolate to real-world scenarios.
- **Domain Constraints**: The dataset only covers three iris species and may not generalize to other types of flowers.
- **Not Production-Ready**: This model is not suited for critical applications (e.g., healthcare, autonomous vehicles).

## How to Use
To use this model, download the `.joblib` file from the Hub and load it in Python:

```python
import joblib
from huggingface_hub import hf_hub_download

# Download the model file from the model repo 'brjapon/iris'.
# The accompanying dataset is hosted on Hugging Face under the same ID.
model_path = hf_hub_download(repo_id="brjapon/iris",
                             filename="iris_dt.joblib",
                             repo_type="model")

model = joblib.load(model_path)

# Example prediction; feature order is
# sepal length, sepal width, petal length, petal width
sample_input = [[5.1, 3.5, 1.4, 0.2]]
prediction = model.predict(sample_input)
print(prediction)  # e.g., [0] which might correspond to 'setosa'
```

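The prediction above is a class label. The label encoding of `iris_dt.joblib` is not stated here, so the following is only a minimal sketch under an assumption: if the model was trained on integer-encoded targets in scikit-learn's usual order, the predicted index can be mapped to a species name, and `predict_proba` gives per-class probabilities. To run offline, the sketch uses a stand-in tree (`stand_in_model`, a hypothetical name) trained on scikit-learn's bundled copy of Iris rather than the file from the Hub.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Stand-in model (assumption): trained on scikit-learn's bundled Iris copy,
# where targets are the integers 0, 1, 2 in the order of iris.target_names.
iris = load_iris()
stand_in_model = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

sample_input = [[5.1, 3.5, 1.4, 0.2]]

# Map the predicted integer label to its species name
predicted_index = stand_in_model.predict(sample_input)[0]
predicted_species = iris.target_names[predicted_index]

# predict_proba returns one row per sample and one column per class
class_probabilities = stand_in_model.predict_proba(sample_input)[0]

print(predicted_species)
print(dict(zip(iris.target_names, class_probabilities)))
```

If the hosted model was instead trained on string labels, `predict` would already return species names and no mapping would be needed.
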
## Training Procedure
- **Training Data**: 80% of the 150-sample Iris dataset (120 samples).
- **Test Data**: The remaining 20% (30 samples).
- **Steps**:
  1. Loaded the dataset (obtained from the HF repository `brjapon/iris`)
  2. Split it into training and test sets with `train_test_split`
  3. Trained a Decision Tree model with default settings
  4. Evaluated accuracy on the test set

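The four steps above can be sketched roughly as follows. This is not the original training script: it assumes scikit-learn's bundled copy of Iris as a stand-in for the `brjapon/iris` dataset repository (so it runs offline), and the `random_state=42` values are arbitrary choices added only for reproducibility.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# 1. Load the data. Assumption: scikit-learn's bundled copy stands in
#    for the `brjapon/iris` dataset repository.
X, y = load_iris(return_X_y=True)

# 2. 80/20 split: 120 training samples, 30 test samples.
#    random_state=42 is an arbitrary seed, not taken from the model card.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 3. Decision tree with default hyperparameters
#    (random_state is fixed here only so reruns give the same tree).
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# 4. Accuracy on the held-out test set
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {test_accuracy:.2f}")
```
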
## Performance
Using a random 80/20 split, the model typically achieves **~97%** accuracy on the test subset. With only 30 test samples, each misclassification changes accuracy by about 3.3 percentage points, so actual results vary with the random state used for the train/test split.

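To get a feel for how much the split affects the reported number, one option is to repeat the 80/20 split with several seeds and look at the spread. The sketch below is an illustration under assumptions: it uses scikit-learn's bundled Iris copy, and the choice of 20 seeds is arbitrary.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

accuracies = []
for seed in range(20):  # 20 seeds is an arbitrary choice
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = DecisionTreeClassifier(random_state=seed).fit(X_train, y_train)
    # score() returns mean accuracy on the given test data
    accuracies.append(model.score(X_test, y_test))

accuracies = np.array(accuracies)
print(f"mean={accuracies.mean():.3f} "
      f"min={accuracies.min():.3f} max={accuracies.max():.3f}")
```

Reporting the mean together with the minimum and maximum gives a more honest picture than a single split on a dataset this small.
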
## Limitations & Bias
- The Iris dataset is not representative of modern, large-scale classification tasks.
- Results should not be generalized beyond the included species and scenario.