# Digit Recognition
## Intended Use
This model classifies handwritten digits (0-9) from the pixel values of an MNIST-like dataset. It is intended for educational purposes and to demonstrate the use of a Random Forest classifier for multi-class classification.
## Training Data
- **Dataset**: The model was trained on a dataset with 42,000 samples, where each sample is a 28x28 grayscale image flattened into a vector of 784 pixel values.
- **Labels**: The dataset contains 10 classes (digits 0-9).
- **Train-Validation Split**:
  - Training set: 33,600 samples (80%)
  - Validation set: 8,400 samples (20%)
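
The 80/20 split described above can be sketched with scikit-learn's `train_test_split`. This is a minimal illustration rather than the original training script: the random arrays stand in for the real data, and `random_state=42` and `stratify=y` are assumptions not stated in this card.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (assumption): 42,000 samples,
# each a 28x28 grayscale image flattened into 784 pixel values.
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(42_000, 784), dtype=np.uint8)
y = rng.integers(0, 10, size=42_000)

# Hold out 20% of the samples for validation.
# random_state and stratify are illustrative choices, not taken from the card.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_val.shape)
```

Stratifying keeps the proportion of each digit roughly equal in both subsets, which makes per-class validation metrics easier to compare.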
## Evaluation Metrics
- **Accuracy**: Measured on the validation set with `accuracy_score(y_val, y_pred)`; the resulting value is not recorded in this card.
- **Classification Report**: Includes precision, recall, and F1-score for each class.
- **Confusion Matrix**: Visualized to show the distribution of predictions across classes.
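
The three metrics above can be computed along the following lines. This is a hedged sketch, not the original evaluation code: it trains a small stand-in `RandomForestClassifier` on random data (so the scores themselves are meaningless), and the hyperparameters `n_estimators=20` and `random_state=0` are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Small synthetic stand-in data (assumption); the real model uses 784-pixel digit vectors.
rng = np.random.default_rng(0)
X_train = rng.integers(0, 256, size=(400, 784))
y_train = rng.integers(0, 10, size=400)
X_val = rng.integers(0, 256, size=(100, 784))
y_val = rng.integers(0, 10, size=100)

# Stand-in model; hyperparameters are illustrative only.
model = RandomForestClassifier(n_estimators=20, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_val)

labels = list(range(10))

# Overall fraction of correctly classified validation samples.
acc = accuracy_score(y_val, y_pred)

# Per-class precision, recall, and F1-score.
report = classification_report(y_val, y_pred, labels=labels, zero_division=0)

# Rows are true digits, columns are predicted digits.
cm = confusion_matrix(y_val, y_pred, labels=labels)

print(f"accuracy: {acc:.3f}")
print(report)
print(cm)
```

The confusion matrix can be visualized with `sklearn.metrics.ConfusionMatrixDisplay` if matplotlib is installed.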
## Limitations
- The model may not generalize well to digits written in styles significantly different from the training data.
- It is not optimized for real-time or large-scale applications.
## Ethical Considerations
- Ensure the dataset used does not contain any biases that could affect the fairness of the model.
- The model should not be used in critical applications without further validation and testing.
## How to Use
1. Load the model using `joblib.load('digit_rf_model.joblib')`.
2. Preprocess the input data to match the format of the training data (28x28 images flattened into 784-pixel vectors).
3. Use the `predict` method to classify new samples.
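
The steps above can be sketched as follows. So that the example runs on its own, it first saves a small stand-in model to a temporary directory and then loads it; in practice you would load the real `digit_rf_model.joblib` directly. The helper `flatten_images` is hypothetical, and the sketch assumes the model was trained on unscaled pixel values, which this card does not confirm.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def flatten_images(images):
    """Hypothetical helper: reshape one (28, 28) image or a batch of
    (n, 28, 28) images into an (n, 784) array."""
    arr = np.asarray(images)
    if arr.ndim == 2:
        arr = arr[np.newaxis, ...]  # treat a single image as a batch of one
    if arr.ndim != 3 or arr.shape[1:] != (28, 28):
        raise ValueError(f"expected 28x28 images, got shape {arr.shape}")
    return arr.reshape(arr.shape[0], 784)


rng = np.random.default_rng(0)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Stand-in for the real model file (assumption): a tiny forest fit on random data.
    model_path = os.path.join(tmp_dir, "digit_rf_model.joblib")
    stand_in = RandomForestClassifier(n_estimators=5, random_state=0)
    stand_in.fit(rng.integers(0, 256, size=(200, 784)), rng.integers(0, 10, size=200))
    joblib.dump(stand_in, model_path)

    # Step 1: load the model.
    model = joblib.load(model_path)

# Step 2: preprocess new 28x28 images into 784-pixel vectors.
new_images = rng.integers(0, 256, size=(3, 28, 28))
X_new = flatten_images(new_images)

# Step 3: classify.
preds = model.predict(X_new)
print(preds)
```

If the training pipeline scaled or normalized pixel values, the same transformation must be applied to new inputs before calling `predict`.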