---
license: mit
library_name: sklearn
tags:
  - sklearn
  - classification
  - random-forest
  - food-science
  - milk-quality
pipeline_tag: tabular-classification
---

# Milk Spoilage Classification Model

A Random Forest classifier for predicting milk spoilage type based on microbial count data.

## Model Description

This model classifies milk samples into three spoilage categories based on Standard Plate Count (SPC) and Total Gram-Negative (TGN) bacterial counts measured at days 7, 14, and 21 of shelf life.

### Classes

- **PPC**: Post-Pasteurization Contamination
- **no spoilage**: No spoilage detected
- **spore spoilage**: Spore-forming bacteria spoilage

### Input Features

| Feature | Description |
|---------|-------------|
| SPC_D7 | Standard Plate Count at Day 7 (log CFU/mL) |
| SPC_D14 | Standard Plate Count at Day 14 (log CFU/mL) |
| SPC_D21 | Standard Plate Count at Day 21 (log CFU/mL) |
| TGN_D7 | Total Gram-Negative count at Day 7 (log CFU/mL) |
| TGN_D14 | Total Gram-Negative count at Day 14 (log CFU/mL) |
| TGN_D21 | Total Gram-Negative count at Day 21 (log CFU/mL) |
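
The model expects log-transformed counts in a fixed column order, so raw plate counts need converting before prediction. The sketch below assumes that "log" means base 10 (the usual convention for plate counts, though not stated here) and that measurements arrive as raw CFU/mL. `FEATURE_ORDER` and `to_feature_row` are hypothetical names, not part of this repository.

```python
import math

# Column order taken from the Input Features table.
FEATURE_ORDER = ["SPC_D7", "SPC_D14", "SPC_D21", "TGN_D7", "TGN_D14", "TGN_D21"]


def to_feature_row(raw_counts):
    """Convert raw CFU/mL counts (dict keyed by feature name) to a log10 row."""
    missing = [name for name in FEATURE_ORDER if name not in raw_counts]
    if missing:
        raise ValueError(f"Missing features: {missing}")
    row = []
    for name in FEATURE_ORDER:
        count = raw_counts[name]
        if count <= 0:
            # log10 is undefined for zero or negative counts; how to treat
            # samples below the detection limit is left to the caller.
            raise ValueError(f"{name} must be a positive count, got {count}")
        row.append(math.log10(count))
    return row


# Example raw counts in CFU/mL (illustrative values only).
raw = {
    "SPC_D7": 1e4, "SPC_D14": 1e5, "SPC_D21": 1e6,
    "TGN_D7": 1e3, "TGN_D14": 1e4, "TGN_D21": 1e5,
}
print(to_feature_row(raw))
```

The resulting list can be wrapped as `[row]` and passed to `model.predict`, as in the Local Usage section.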

## Performance

- **Test Accuracy**: 95.76%

## Usage

### Using the Inference API

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/chenhaoq87/MilkSpoilageClassifier"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

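# One row per sample, in this feature order (log CFU/mL):
# [SPC_D7, SPC_D14, SPC_D21, TGN_D7, TGN_D14, TGN_D21]
payload = {"inputs": [[4.5, 5.2, 6.1, 3.2, 4.0, 4.8]]}

response = requests.post(API_URL, headers=headers, json=payload, timeout=30)
response.raise_for_status()  # Fail loudly on HTTP errors (e.g. an invalid token)
print(response.json())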
```

### Local Usage

```python
import joblib
import numpy as np

# Load the trained model (path is relative to the repository root)
model = joblib.load("model/model.joblib")

# Prepare input features
# [SPC_D7, SPC_D14, SPC_D21, TGN_D7, TGN_D14, TGN_D21]
features = np.array([[4.5, 5.2, 6.1, 3.2, 4.0, 4.8]])

# Make prediction
prediction = model.predict(features)
probabilities = model.predict_proba(features)

print(f"Predicted class: {prediction[0]}")
print(f"Class probabilities: {dict(zip(model.classes_, probabilities[0]))}")
```

## Model Details

- **Model Type**: Random Forest Classifier
- **Framework**: scikit-learn
- **Number of Estimators**: 100
- **Max Depth**: None (unlimited)
- **Min Samples Split**: 5
- **Min Samples Leaf**: 1
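
The settings above map directly onto arguments of scikit-learn's `RandomForestClassifier`. A minimal sketch of an equivalent unfitted estimator follows. The actual training script may set further options (such as `random_state`) that are not listed here, and `RF_PARAMS` and `build_model` are illustrative names rather than part of this repository.

```python
# Hyperparameters as listed under Model Details.
RF_PARAMS = {
    "n_estimators": 100,
    "max_depth": None,  # None: grow each tree until leaves are pure or too small to split
    "min_samples_split": 5,
    "min_samples_leaf": 1,
}


def build_model():
    """Return an unfitted RandomForestClassifier with the listed settings."""
    # Imported here so the parameter dict can be inspected without scikit-learn.
    from sklearn.ensemble import RandomForestClassifier

    return RandomForestClassifier(**RF_PARAMS)
```

Calling `build_model().fit(X, y)` on a feature matrix with the six input columns would train a forest with the same configuration, though not necessarily the same trees as the released model.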

## Citation

If you use this model, please cite the original research on milk spoilage classification.

## Repository Structure

```
MilkSpoilageClassifier/
├── apps/
│   ├── fastapi/        # REST API application
│   ├── gradio/         # Interactive web interface
│   └── huggingface/    # HF Inference handler
├── data/               # Training and test datasets
├── docs/               # Documentation files
├── model/              # Trained model artifacts
├── notebooks/          # Jupyter notebooks for analysis
├── scripts/            # Utility scripts
└── README.md           # This file
```

See individual README files in each directory for more details.

## Quick Start

### Train the Model
```bash
python scripts/prepare_model.py
```

### Run Gradio Interface
```bash
cd apps/gradio
pip install -r requirements.txt
python app.py
```

### Run FastAPI Server
```bash
cd apps/fastapi
pip install -r requirements.txt
python app.py
```

### Deploy to Hugging Face
```bash
python scripts/upload_to_hf.py
```

## License

MIT License