---
tags:
- pet-health
- veterinary
license: mit
language:
- en
metrics:
- accuracy
- f1
base_model:
- havocy28/VetBERTDx
pipeline_tag: text-classification
---
# Model Card for fdastak/model_classification
This model classifies pet health symptoms from text descriptions into predefined health conditions, fine-tuned on VetBERTDx.
## Model Details
### Model Description
Fine-tuned VetBERTDx for sequence classification.
- **Developed by:** Fatemeh Dastak
- **Model type:** Fine-tuned VetBERTDx for sequence classification
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** havocy28/VetBERTDx
### Model Sources
- **Repository:** https://huggingface.co/fdastak/model_classification
- **Dataset:** [Pet Health Symptoms Dataset](https://www.kaggle.com/datasets/yyzz1010/pet-health-symptoms-dataset)
## Uses
### Direct Use
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
model = AutoModelForSequenceClassification.from_pretrained("fdastak/model_classification")
tokenizer = AutoTokenizer.from_pretrained("fdastak/model_classification")
```
### Downstream Use
This model can be integrated into:
- Veterinary triage systems
- Pet health monitoring applications
- Symptom screening tools
- Educational veterinary platforms
### Out-of-Scope Use
This model should NOT be used for:
- Direct medical diagnosis
- Emergency medical decisions
- Replacement of veterinary consultation
- Legal or insurance decisions
- Automated treatment recommendation
## Bias, Risks, and Limitations
### Technical Limitations
- Limited to 512 token input length
- CPU-only training constraints
- Early stopping at 301 steps
- Batch size limitations (8 training, 20 evaluation)
- Specific to owner-reported symptoms
### Data Biases
- Training data from owner observations only
- English language only
- Limited to common pet conditions
- Potential reporting biases in symptoms
- Class imbalance considerations
### Risks
- Misinterpretation of medical conditions
- Over-reliance on automated classification
- Delayed professional consultation
- False confidence in predictions
- Language and cultural biases
### Recommendations
#### Best Practices
- Always verify predictions with professionals
- Use as screening tool only
- Monitor prediction confidence scores
- Implement user warnings
- Regular model evaluation
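As one way to apply the "monitor prediction confidence scores" and "implement user warnings" practices, here is a minimal sketch of a confidence gate. The threshold value and warning text are illustrative assumptions, not part of the model:

```python
# Minimal sketch of a confidence gate for screening-only use.
# The 0.7 threshold is an illustrative assumption; tune it on validation data.
CONFIDENCE_THRESHOLD = 0.7

def gate_prediction(predicted_class: int, confidence: float) -> dict:
    """Attach a user-facing warning and flag low-confidence predictions."""
    warning = (
        "This is a screening aid only and not a medical diagnosis. "
        "Always consult a veterinarian."
    )
    if confidence < CONFIDENCE_THRESHOLD:
        return {
            "class": predicted_class,
            "confidence": confidence,
            "reliable": False,
            "warning": warning + " Prediction confidence is low.",
        }
    return {
        "class": predicted_class,
        "confidence": confidence,
        "reliable": True,
        "warning": warning,
    }
```

A downstream application can then surface the warning and route low-confidence cases straight to a professional rather than showing a label.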
## How to Get Started with the Model
```python
# Load required libraries
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load model and tokenizer
repo_id = "fdastak/model_classification"
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model.eval()

# Example usage
def classify_symptoms(text: str):
    # Preprocess and tokenize
    inputs = tokenizer(
        text,
        truncation=True,
        padding=True,
        max_length=512,
        return_tensors="pt",
    )
    # Run inference and convert logits to class probabilities
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = F.softmax(logits, dim=-1)
    confidence, predicted_class = probs.max(dim=-1)
    return predicted_class.item(), confidence.item()
```
## Training Details
### Training Data
- Source: Pet Health Symptoms Dataset (Kaggle)
- Split: 80% training, 20% validation
- Preprocessing: Text lowercasing, label encoding
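The preprocessing steps above can be sketched as follows. Toy data stands in for the Kaggle CSV, and the column contents are illustrative:

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for the Kaggle dataset; the examples are illustrative
texts = [
    "My cat is sneezing a lot",
    "Dog limping on front leg",
    "Rabbit not eating hay",
    "Parrot losing feathers",
]
conditions = ["respiratory", "musculoskeletal", "digestive", "dermatological"]

# Text lowercasing
texts = [t.lower() for t in texts]

# Label encoding: map condition names to integer class ids
encoder = LabelEncoder()
labels = encoder.fit_transform(conditions)

# 80% training / 20% validation split
train_texts, val_texts, train_labels, val_labels = train_test_split(
    texts, labels, test_size=0.2, random_state=42
)
```

`LabelEncoder` also provides `inverse_transform` to map predicted class ids back to human-readable condition names at inference time.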
### Training Procedure
#### Training Hyperparameters
- Epochs: 5
- Train batch size: 8
- Eval batch size: 20
- Learning rate: 2e-5
- Scheduler: Linear with warmup
- Warmup ratio: 0.1
- Early stopping: At step 301
- Maximum sequence length: 512
#### Speeds, Sizes, Times
- **Training Duration**: ~1 hour
- **Steps**: 301 (with early stopping)
- **Checkpoint Frequency**: Every 50 steps
- **Batch Processing**:
- Training: 8 samples/batch
- Evaluation: 20 samples/batch
- **Model Storage**: Local checkpoints in './model_classification'
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
- **Source**: [Pet Health Symptoms Dataset](https://www.kaggle.com/datasets/yyzz1010/pet-health-symptoms-dataset)
- **Split**: 20% of data (validation set)
- **Format**: Text descriptions with condition labels
- **Preprocessing**: Text lowercasing, label encoding
#### Factors
- **Record Types**: Owner observations
- **Text Length**: Maximum 512 tokens
- **Language**: English
- **Conditions**: Multiple pet health conditions
- **Data Balance**: Stratified split for class distribution
#### Metrics
- **Accuracy**: Overall classification accuracy
- **Precision (weighted)**: Measure of exactness
- **Recall (weighted)**: Measure of completeness
- **F1-score (weighted)**: Harmonic mean of precision and recall
- **Confusion Matrix**: Class-wise performance visualization
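On toy labels, the weighted metrics above can be computed with scikit-learn like so (the true/predicted ids below are illustrative, not the model's actual outputs):

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_recall_fscore_support,
)

# Toy true/predicted class ids standing in for validation outputs
y_true = [0, 0, 1, 1, 2, 2, 3, 4]
y_pred = [0, 0, 1, 2, 2, 2, 3, 3]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)
cm = confusion_matrix(y_true, y_pred)  # class-wise performance
```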
### Results
#### Performance Summary
- Overall Accuracy: 89%
- Average F1-Score: 0.89
- Class-wise Performance:
- Class 0: Highest precision (0.97) and F1-score (0.95)
- Class 1: Perfect recall (1.00)
- Class 2: Balanced performance (0.93 across metrics)
- Classes 3 & 4: Similar performance (~0.82-0.83 F1-score)
#### Key Metrics
- **Precision (weighted)**: 0.89
- **Recall (weighted)**: 0.89
- **F1-score (weighted)**: 0.89
- **Support**: 200 validation samples (40 per class)
#### Summary
- Model shows balanced performance across classes
- Early stopping at step 301 prevents overfitting
- Validation performed every 50 steps
- Best model selected based on eval_loss
- Confusion matrix shows class-wise performance
## Model Examination
### Validation Results
The model's performance was examined using several evaluation methods:
1. **Classification Metrics**
- Computed using sklearn's classification_report
- Includes precision, recall, and F1-score
- Evaluated on validation dataset
- Weighted averages to handle class imbalance
2. **Confusion Matrix Analysis**
```python
# Visualization code
import torch
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

model.eval()
with torch.no_grad():
    # Prediction collection
    true_labels = []
    pred_labels = []
    pred_scores = []
    # ...evaluation logic
```
3. **Prediction Confidence**
- Softmax probabilities for class predictions
- Confidence scores tracked for each prediction
- Score distribution analysis for reliability
4. **Early Stopping Analysis**
- Training stopped at step 301
- Monitored eval_loss for best model selection
- Used custom StopAtStepCallback for controlled training
### Model Interpretability
- Base model: VetBERTDx (domain-specific veterinary BERT)
- Fine-tuned for pet symptom classification
- Uses attention mechanisms for text understanding
- Maximum sequence length: 512 tokens
### Limitations
- CPU-only training constrained batch sizes and training duration
- Limited to predefined condition categories
- Performance varies by symptom complexity
- Early stopping may affect final performance
## Environmental Impact
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** CPU (Personal Computer)
- **Hours used:** ~2 hours (301 steps with early stopping)
- **Cloud Provider:** None (Local training)
- **Compute Region:** USA (Colorado)
- **Power Mix:** Rocky Mountain Power Grid
- **Training Configuration:**
- 301 steps with early stopping
- CPU-based training
- Batch size: 8 samples
- Epochs: 5
- Local machine execution
Environmental considerations:
- Used CPU instead of GPU for lower power consumption
- Implemented early stopping at step 301
- Leveraged pre-trained model (VetBERTDx)
- Local training to minimize data center impact
- Efficient batch size selection
## Technical Specifications
### Model Architecture and Objective
- Base model: VetBERTDx
- Task: Sequence classification
- Input: Text descriptions of pet symptoms
- Output: Classification among health conditions
### Compute Infrastructure
- Framework: PyTorch
- Training device: CPU
- Python dependencies:
- transformers
- torch
- numpy
- scikit-learn
#### Hardware
The model was trained using:
- Training Device: CPU
- Batch Configuration:
- Training batch size: 8
- Evaluation batch size: 20
- Training Steps: Limited to 301 (early stopping)
- Local Storage: Required for model checkpoints in './model_classification'
#### Software
Training environment specifications:
- Python 3.11
- Core Libraries:
```python
torch>=2.0.0
transformers>=4.30.0
numpy>=1.24.0
pandas>=1.5.0
scikit-learn>=1.0.0
sentence-transformers>=2.2.0
```
- Training Components:
- Framework: 🤗 Transformers
- Base Model: havocy28/VetBERTDx
- Tokenizer: AutoTokenizer
- Model Class: AutoModelForSequenceClassification
- Training API: Transformers Trainer with custom callbacks
- Logging: Python's built-in logging module
## Citation
If you use this model in your research, please cite it using the following:
**BibTeX:**
```bibtex
@misc{dastak2024pethealthclassifier,
  title={Pet Health Symptoms Classification Model},
  author={Dastak, Fatemeh},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/fdastak/model_classification}},
  note={Based on VetBERTDx by Havocy28},
  keywords={veterinary-nlp, text-classification, pet-health}
}
```
**APA:**
```
Dastak, F. (2024). Pet Health Symptoms Classification Model [Machine learning model]. Hugging Face Model Hub. https://huggingface.co/fdastak/model_classification
```
Please also cite the base model:
```bibtex
@misc{havocy282023vetbertdx,
  title={VetBERTDx: A Domain-Specific Language Model for Veterinary Medicine},
  author={Havocy28},
  year={2023},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/havocy28/VetBERTDx}}
}
```
## Model Card Contact
Author: Fatemeh Dastak
Repository: https://huggingface.co/fdastak/model_classification