---
tags:
- pet_Health
- veterinary
license: mit
language:
- en
metrics:
- accuracy
- f1
base_model:
- havocy28/VetBERTDx
pipeline_tag: text-classification
---

# Model Card for Pet Health Symptoms Classification Model

This model, fine-tuned from VetBERTDx, classifies text descriptions of pet health symptoms into predefined health conditions.



## Model Details

### Model Description

Fine-tuned VetBERTDx for sequence classification.

- **Developed by:** Fatemeh Dastak
- **Model type:** Fine-tuned VetBERTDx for sequence classification
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** havocy28/VetBERTDx

### Model Sources

- **Repository:** https://huggingface.co/fdastak/model_classification
- **Dataset:** [Pet Health Symptoms Dataset](https://www.kaggle.com/datasets/yyzz1010/pet-health-symptoms-dataset)

## Uses

### Direct Use
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("fdastak/model_classification")
tokenizer = AutoTokenizer.from_pretrained("fdastak/model_classification")
```

### Downstream Use

This model can be integrated into:

- Veterinary triage systems
- Pet health monitoring applications
- Symptom screening tools
- Educational veterinary platforms

### Out-of-Scope Use

This model should NOT be used for:

- Direct medical diagnosis
- Emergency medical decisions
- Replacement of veterinary consultation
- Legal or insurance decisions
- Automated treatment recommendation

## Bias, Risks, and Limitations

### Technical Limitations

- Limited to 512 token input length
- CPU-only training constraints
- Early stopping at 301 steps
- Batch size limitations (8 training, 20 evaluation)
- Specific to owner-reported symptoms

### Data Biases

- Training data from owner observations only
- English language only
- Limited to common pet conditions
- Potential reporting biases in symptoms
- Class imbalance considerations

### Risk

- Misinterpretation of medical conditions
- Over-reliance on automated classification
- Delayed professional consultation
- False confidence in predictions
- Language and cultural biases

### Recommendations

#### Best Practices

- Always verify predictions with professionals
- Use as screening tool only
- Monitor prediction confidence scores
- Implement user warnings
- Regular model evaluation
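
The confidence monitoring and user warnings recommended above can be sketched as a small gating function. This is an illustration only, not part of the released model: the name `gate_prediction`, the 0.7 threshold, the warning text, and the label names are all hypothetical, and the probabilities are assumed to come from a softmax over the model's logits.

```python
from typing import Dict, Sequence

# Hypothetical threshold; tune it on validation data before relying on it.
CONFIDENCE_THRESHOLD = 0.7

# Hypothetical user-facing warning attached to every result.
WARNING = "Screening result only. Please consult a veterinarian."


def gate_prediction(
    probs: Sequence[float],
    labels: Sequence[str],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Dict[str, object]:
    """Return the top class, its confidence, and a user-facing warning.

    Predictions below the threshold are flagged for review instead of
    being presented as a confident result.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    best = max(range(len(probs)), key=lambda i: probs[i])
    confidence = float(probs[best])
    return {
        "label": labels[best],
        "confidence": confidence,
        "needs_review": confidence < threshold,
        "warning": WARNING,
    }


# Example with made-up probabilities and placeholder label names.
result = gate_prediction([0.10, 0.55, 0.35], ["class_0", "class_1", "class_2"])
print(result["label"], result["needs_review"])
```

Attaching the warning to every result, rather than only to low-confidence ones, keeps the output in line with the screening-only use of the model.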

## How to Get Started with the Model

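```python
# Load required libraries
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load model and tokenizer
repo_id = "fdastak/model_classification"
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model.eval()


def classify_symptoms(text: str):
    # Preprocess (training text was lowercased) and tokenize
    inputs = tokenizer(
        text.lower(),
        truncation=True,
        padding=True,
        max_length=512,
        return_tensors="pt",
    )
    # Sketch of the inference step: softmax over the logits gives a
    # per-class confidence; the top class is returned with its score.
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = F.softmax(logits, dim=-1)[0]
    pred_id = int(probs.argmax())
    return model.config.id2label[pred_id], float(probs[pred_id])


# Example usage
label, confidence = classify_symptoms("My dog has been scratching his ears all day")
print(label, f"{confidence:.2f}")
```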

## Training Details

### Training Data
- Source: Pet Health Symptoms Dataset (Kaggle)
- Split: 80% training, 20% validation
- Preprocessing: Text lowercasing, label encoding
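
The two preprocessing steps listed above can be sketched as follows. This is an illustration under assumptions: the helper names `normalize_text` and `build_label_encoder` and the sample rows are hypothetical, and the actual training script may have used a library encoder such as scikit-learn's `LabelEncoder` instead.

```python
from typing import Dict, List, Tuple


def normalize_text(text: str) -> str:
    """Lowercase the text, matching the lowercasing step in this card."""
    return text.lower()


def build_label_encoder(labels: List[str]) -> Tuple[Dict[str, int], Dict[int, str]]:
    """Map each distinct label to an integer id (sorted for determinism)."""
    label2id = {name: idx for idx, name in enumerate(sorted(set(labels)))}
    id2label = {idx: name for name, idx in label2id.items()}
    return label2id, id2label


# Placeholder rows; the real dataset columns and condition names may differ.
rows = [
    ("My dog keeps Scratching his ears", "condition_b"),
    ("Cat is LIMPING on her back leg", "condition_a"),
]
label2id, id2label = build_label_encoder([label for _, label in rows])
encoded = [(normalize_text(text), label2id[label]) for text, label in rows]
print(encoded)
```

Keeping both `label2id` and `id2label` makes it possible to turn predicted class ids back into condition names at inference time.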

### Training Procedure

#### Training Hyperparameters
- Epochs: 5
- Train batch size: 8
- Eval batch size: 20
- Learning rate: 2e-5
- Scheduler: Linear with warmup
- Warmup ratio: 0.1
- Early stopping: At step 301
- Maximum sequence length: 512
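
To make the schedule concrete, the sketch below computes the learning rate of a linear schedule with warmup in plain Python. It assumes 500 planned optimizer steps (800 training samples / batch size 8 × 5 epochs, inferred from the 200-sample validation set being 20% of the data) and no gradient accumulation. The function name `linear_warmup_lr` is hypothetical, and the code mirrors the usual warmup-then-linear-decay shape rather than the exact training script.

```python
import math

PEAK_LR = 2e-5       # learning rate from this card
WARMUP_RATIO = 0.1   # warmup ratio from this card
TOTAL_STEPS = 500    # assumption: 800 samples / batch size 8 * 5 epochs


def linear_warmup_lr(
    step: int,
    total_steps: int = TOTAL_STEPS,
    peak_lr: float = PEAK_LR,
    warmup_ratio: float = WARMUP_RATIO,
) -> float:
    """Learning rate at a given optimizer step.

    Ramps linearly from 0 to peak_lr during warmup, then decays
    linearly to 0 at total_steps.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 25, 50, 301, 500):
    print(step, f"{linear_warmup_lr(step):.2e}")
```

Under these assumptions, training stopped at step 301 while the rate was still decaying, so the tail of the planned schedule was never reached.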

### Evaluation

#### Metrics
- Accuracy
- Precision (weighted)
- Recall (weighted)
- F1-score (weighted)

#### Speeds, Sizes, Times
- **Training Duration**: ~1 hour
- **Steps**: 301 (with early stopping)
- **Checkpoint Frequency**: Every 50 steps
- **Batch Processing**:
  - Training: 8 samples/batch
  - Evaluation: 20 samples/batch
- **Model Storage**: Local checkpoints in './model_classification'

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
- **Source**: [Pet Health Symptoms Dataset](https://www.kaggle.com/datasets/yyzz1010/pet-health-symptoms-dataset)
- **Split**: 20% of data (validation set)
- **Format**: Text descriptions with condition labels
- **Preprocessing**: Text lowercasing, label encoding

#### Factors
- **Record Types**: Owner observations
- **Text Length**: Maximum 512 tokens
- **Language**: English
- **Conditions**: Multiple pet health conditions
- **Data Balance**: Stratified split for class distribution
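
A stratified split keeps each condition's share the same in the training and validation partitions. The sketch below shows the idea in plain Python; the function name `stratified_split`, the seed, and the toy labels are hypothetical, and in practice scikit-learn's `train_test_split(..., stratify=labels)` is the usual tool.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[int], val_fraction: float = 0.2, seed: int = 42
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, val_indices) with per-class proportions kept."""
    by_class: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train_idx: List[int] = []
    val_idx: List[int] = []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy data: 5 classes with 200 samples each, an assumption chosen to match
# the 40-per-class validation support reported in this card.
toy_labels = [cls for cls in range(5) for _ in range(200)]
train_idx, val_idx = stratified_split(toy_labels)
print(len(train_idx), len(val_idx))
```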

#### Metrics
- **Accuracy**: Overall classification accuracy
- **Precision (weighted)**: Measure of exactness
- **Recall (weighted)**: Measure of completeness
- **F1-score (weighted)**: Harmonic mean of precision and recall
- **Confusion Matrix**: Class-wise performance visualization
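
To show what "weighted" means for the metrics above, the sketch below computes support-weighted precision, recall, and F1 in plain Python. The function name `weighted_scores` and the toy labels are hypothetical; scikit-learn's `precision_recall_fscore_support(..., average="weighted")` is the usual way to obtain these values.

```python
from typing import Dict, Sequence


def weighted_scores(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Support-weighted precision, recall, and F1 across all classes."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    classes = sorted(set(y_true) | set(y_pred))
    total = len(y_true)
    out = {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    for cls in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        weight = (tp + fn) / total  # class support as a fraction of all samples
        out["precision"] += weight * precision
        out["recall"] += weight * recall
        out["f1"] += weight * f1
    return out


# Toy labels for illustration only.
scores = weighted_scores([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0])
print({name: round(value, 3) for name, value in scores.items()})
```

Weighted recall is mathematically equal to overall accuracy, and when every class has the same support the weighted averages coincide with the macro averages.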

### Results

#### Performance Summary
- Overall Accuracy: 89%
- Average F1-Score: 0.89
- Class-wise Performance:
  - Class 0: Highest precision (0.97) and F1-score (0.95)
  - Class 1: Perfect recall (1.00)
  - Class 2: Balanced performance (0.93 across metrics)
  - Classes 3 & 4: Similar performance (~0.82-0.83 F1-score)

#### Key Metrics
- **Precision (weighted)**: 0.89
- **Recall (weighted)**: 0.89
- **F1-score (weighted)**: 0.89
- **Support**: 200 validation samples (40 per class)

#### Summary
- Model shows balanced performance across classes
- Early stopping at step 301 prevents overfitting
- Validation performed every 50 steps
- Best model selected based on eval_loss
- Confusion matrix shows class-wise performance


## Model Examination 

### Validation Results
The model's performance was examined using several evaluation methods:

1. **Classification Metrics**
- Computed using sklearn's classification_report
- Includes precision, recall, and F1-score
- Evaluated on validation dataset
- Weighted averages to handle class imbalance

2. **Confusion Matrix Analysis**
```python
# Visualization code
import torch
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import matplotlib.pyplot as plt

model.eval()
with torch.no_grad():
    # Prediction collection
    true_labels = []
    pred_labels = []
    pred_scores = []
    # ...evaluation logic
```

3. **Prediction Confidence**
- Softmax probabilities for class predictions
- Confidence scores tracked for each prediction
- Score distribution analysis for reliability

4. **Early Stopping Analysis**
- Training stopped at step 301
- Monitored eval_loss for best model selection
- Used custom StopAtStepCallback for controlled training
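
For the confusion matrix analysis in step 2, once the true and predicted labels have been collected, the matrix itself is a simple count table. The sketch below builds it in plain Python to show what the visualization encodes; the function name `build_confusion_matrix` and the toy labels are hypothetical, and scikit-learn's `confusion_matrix` produces the same counts with the same row/column convention.

```python
from typing import List, Sequence


def build_confusion_matrix(
    true_labels: Sequence[int], pred_labels: Sequence[int], num_classes: int
) -> List[List[int]]:
    """Count table where rows are true classes and columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true, pred in zip(true_labels, pred_labels):
        matrix[true][pred] += 1
    return matrix


# Toy labels for illustration only.
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0]
toy_matrix = build_confusion_matrix(toy_true, toy_pred, num_classes=3)
for row in toy_matrix:
    print(row)
```

Diagonal entries count correct predictions; off-diagonal entries show which conditions are confused with each other.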

### Model Interpretability
- Base model: VetBERTDx (domain-specific veterinary BERT)
- Fine-tuned for pet symptom classification
- Uses attention mechanisms for text understanding
- Maximum sequence length: 512 tokens

### Limitations
- CPU-only training constrained the training budget (number of steps and batch size)
- Limited to predefined condition categories
- Performance varies by symptom complexity
- Early stopping may affect final performance

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** CPU (Personal Computer)
- **Hours used:** ~2 hours (301 steps with early stopping)
- **Cloud Provider:** None (Local training)
- **Compute Region:** USA (Colorado)
- **Power Mix:** Rocky Mountain Power Grid
- **Training Configuration:**
  - 301 steps with early stopping
  - CPU-based training
  - Batch size: 8 samples
  - Epochs: 5
  - Local machine execution

Environmental considerations:
- Used CPU instead of GPU for lower power consumption
- Implemented early stopping at step 301
- Leveraged pre-trained model (VetBERTDx)
- Local training to minimize data center impact
- Efficient batch size selection
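
The estimate referenced above comes down to energy used multiplied by the carbon intensity of the grid. A rough sketch follows; the function name `estimate_emissions_kg` is hypothetical, and the power draw and carbon-intensity values in the example are placeholders, not measurements from this training run. Only the ~2 hours figure comes from this card.

```python
def estimate_emissions_kg(
    avg_power_watts: float, hours: float, grid_kg_co2_per_kwh: float
) -> float:
    """Estimate CO2-equivalent emissions as energy (kWh) times grid intensity."""
    if min(avg_power_watts, hours, grid_kg_co2_per_kwh) < 0:
        raise ValueError("inputs must be non-negative")
    energy_kwh = avg_power_watts / 1000.0 * hours
    return energy_kwh * grid_kg_co2_per_kwh


# Placeholder inputs: 100 W average draw and 0.5 kg CO2eq/kWh are
# illustrative assumptions; substitute measured or published values.
estimate = estimate_emissions_kg(
    avg_power_watts=100.0, hours=2.0, grid_kg_co2_per_kwh=0.5
)
print(f"{estimate:.3f} kg CO2eq")
```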

## Technical Specifications

### Model Architecture and Objective

- Base model: VetBERTDx
- Task: Sequence classification
- Input: Text descriptions of pet symptoms
- Output: Classification among health conditions

### Compute Infrastructure

- Framework: PyTorch
- Training device: CPU
- Python dependencies:
  - transformers
  - torch
  - numpy
  - scikit-learn

#### Hardware

The model was trained using:
- Training Device: CPU
- Batch Configuration:
  - Training batch size: 8
  - Evaluation batch size: 20
- Training Steps: Limited to 301 (early stopping)
- Local Storage: Required for model checkpoints in './model_classification'

#### Software

Training environment specifications:
- Python 3.11
- Core Libraries:
  ```text
  torch>=2.0.0
  transformers>=4.30.0
  numpy>=1.24.0
  pandas>=1.5.0
  scikit-learn>=1.0.0
  sentence-transformers>=2.2.0
  ```
- Training Components:
  - Framework: 🤗 Transformers
  - Base Model: havocy28/VetBERTDx
  - Tokenizer: AutoTokenizer
  - Model Class: AutoModelForSequenceClassification
  - Training API: Transformers Trainer with custom callbacks
- Logging: Python's built-in logging module
  
## Citation

If you use this model in your research, please cite it using the following:

**BibTeX:**
```bibtex
@misc{dastak2024pethealthclassifier,
    title={Pet Health Symptoms Classification Model},
    author={Dastak, Fatemeh},
    year={2024},
    publisher={Hugging Face},
    howpublished={\url{https://huggingface.co/fdastak/model_classification}},
    note={Based on VetBERTDx by Havocy28},
    keywords={veterinary-nlp, text-classification, pet-health}
}
```

**APA:**
```
Dastak, F. (2025). Pet Health Symptoms Classification Model [Machine learning model]. Hugging Face Model Hub. https://huggingface.co/fdastak/model_classification
```

Please also cite the base model:
```bibtex
@misc{havocy282023vetbertdx,
    title={VetBERTDx: A Domain-Specific Language Model for Veterinary Medicine},
    author={Havocy28},
    year={2023},
    publisher={Hugging Face},
    howpublished={\url{https://huggingface.co/havocy28/VetBERTDx}}
}
```


## Model Card Contact

- **Author:** Fatemeh Dastak
- **Repository:** https://huggingface.co/fdastak/model_classification