# Digit Recognition

## Intended Use
This model classifies handwritten digits (0-9) from the pixel values of an MNIST-like dataset. It is intended for educational purposes and demonstrates the use of a Random Forest for multi-class classification.

## Training Data
- **Dataset**: The model was trained on a dataset with 42,000 samples, where each sample is a 28x28 grayscale image flattened into a vector of 784 pixel values.
- **Labels**: The dataset contains 10 classes (digits 0-9).
- **Train-Validation Split**:
    - Training set: 33,600 samples (80%)
    - Validation set: 8,400 samples (20%)
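An 80/20 split like the one listed above could be reproduced roughly as follows. This is a minimal sketch, not the original training script: the random arrays are synthetic stand-ins for the real images and labels, and `random_state=42` and `stratify=y` are assumptions that the original split may not have used.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data: 42,000 flattened 28x28 images
# with pixel values 0-255 and labels 0-9.
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(42_000, 784), dtype=np.uint8)
y = rng.integers(0, 10, size=42_000)

# Hold out 20% of the samples for validation. Stratifying keeps the class
# proportions similar in both subsets (assumed, not confirmed by the source).
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_val.shape)
```

With 42,000 samples and `test_size=0.2`, the two subsets have the 33,600 and 8,400 rows stated in the list.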

## Evaluation Metrics
- **Accuracy**: Measured on the validation set with `accuracy_score(y_val, y_pred)`.
- **Classification Report**: Includes precision, recall, and F1-score for each class.
- **Confusion Matrix**: Visualized to show the distribution of predictions across classes.
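The three metrics above are all available in `sklearn.metrics`. The sketch below shows how they might be computed; the label arrays are a hypothetical three-class toy example, not results from this model. In practice `y_val` would be the validation labels and `y_pred` the output of the model's `predict` method on the validation features.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
)

# Hypothetical toy labels standing in for the real validation labels
# and the model's predictions.
y_val = np.array([0, 1, 2, 2, 1, 0, 1, 2])
y_pred = np.array([0, 1, 2, 1, 1, 0, 2, 2])

# Fraction of samples whose predicted label matches the true label.
acc = accuracy_score(y_val, y_pred)

# Per-class precision, recall, and F1-score as a formatted string.
report = classification_report(y_val, y_pred, zero_division=0)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_val, y_pred)

print(f"accuracy: {acc:.2f}")
print(report)
print(cm)
```

The confusion matrix can then be passed to a plotting helper such as `sklearn.metrics.ConfusionMatrixDisplay` for the visualization.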

## Limitations
- The model may not generalize well to digits written in styles significantly different from the training data.
- It is not optimized for real-time or large-scale applications.

## Ethical Considerations
- Ensure the dataset used does not contain any biases that could affect the fairness of the model.
- The model should not be used in critical applications without further validation and testing.

## How to Use
1. Load the model using `joblib.load('digit_rf_model.joblib')`.
2. Preprocess the input data to match the format of the training data (28x28 images flattened into 784-pixel vectors).
3. Use the `predict` method to classify new samples.
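The steps above can be sketched as a short script. The helper `to_feature_matrix` is a hypothetical name introduced for illustration, and the blank input images are placeholders. The sketch passes raw pixel values through unchanged, which assumes the model was trained on unscaled pixels; apply the same scaling used in training if that is not the case.

```python
from pathlib import Path

import joblib
import numpy as np

MODEL_PATH = Path("digit_rf_model.joblib")  # file name from step 1


def to_feature_matrix(images):
    """Flatten one or more 28x28 images into rows of 784 pixel values."""
    arr = np.asarray(images)
    if arr.ndim == 2:  # a single image: add a batch dimension
        arr = arr[np.newaxis, ...]
    if arr.shape[1:] != (28, 28):
        raise ValueError(f"expected 28x28 images, got shape {arr.shape[1:]}")
    return arr.reshape(arr.shape[0], 784)


# Placeholder input: two blank images standing in for real digits.
batch = to_feature_matrix(np.zeros((2, 28, 28), dtype=np.uint8))

if MODEL_PATH.exists():
    model = joblib.load(MODEL_PATH)
    print(model.predict(batch))  # one predicted digit per input row
else:
    print(f"{MODEL_PATH} not found; skipping prediction")
```

`predict` expects a 2D array of shape `(n_samples, 784)`, so even a single image needs the batch dimension that the helper adds.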