---
language: en
license: apache-2.0
tags:
- distilbert
- text-classification
- animals
- education
- final-project
datasets:
- Isamu136/big-animal-dataset
metrics:
- accuracy
- f1
---

# ZooGuide-BERT Animal Fact Assistant

ZooGuide-BERT is a DistilBERT-based text classifier for short animal-related sentences, created as a final project for INFOST 470.

## Model Details

- **Base model:** distilbert-base-uncased
- **Fine-tuned model:** cloudwoowoo/finalprojectanimal
- **Task:** Multi-class text classification
- **Dataset:** Isamu136/big-animal-dataset
- **Dataset link:** https://huggingface.co/datasets/Isamu136/big-animal-dataset

## Dataset

This project uses the public Hugging Face dataset `Isamu136/big-animal-dataset`. The original dataset contains animal image examples and caption/class labels. Because DistilBERT is a text model, the public caption/class labels were converted into short animal clue and fact sentences for text classification.
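The label-to-sentence conversion described above could look roughly like the following sketch. The template list and the `make_examples` helper are hypothetical, since the project's actual sentence wording is not documented here; only the first template echoes the prompt used in the loading example on this card.

```python
import random

# Hypothetical templates; the project's real clue/fact sentences may differ.
TEMPLATES = [
    "Tell me a fact about a {label}.",
    "I saw a {label} at the zoo today.",
    "What does a {label} usually eat?",
]

def make_examples(labels, per_label, seed=0):
    """Build text/label pairs by filling templates with each class name."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    examples = []
    for label in labels:
        for _ in range(per_label):
            template = rng.choice(TEMPLATES)
            examples.append({"text": template.format(label=label), "label": label})
    return examples

# Two placeholder class names; the project uses 10 focused classes.
examples = make_examples(["dog", "cat"], per_label=3)
for example in examples:
    print(example)
```

Each generated record carries the text fed to DistilBERT and the class name used as the target.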

## Training Details

- **Model type:** DistilBERT sequence classifier
- **Epochs:** 5
- **Learning rate:** 2e-5
- **Batch size:** 16
- **Train/validation/test split:** 70/15/15
- **Focused classes:** 10
- **Training examples:** 2100
- **Validation examples:** 450
- **Test examples:** 450
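As a rough illustration of how the example counts follow from the 70/15/15 split, the sketch below shuffles 3000 placeholder items and slices them. The `split_70_15_15` helper and the seed value are assumptions, and the sketch does not stratify by class; the notebook may split differently.

```python
import random

def split_70_15_15(examples, seed=42):
    """Shuffle, then slice into train/validation/test with integer arithmetic."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # hypothetical seed for reproducibility
    n = len(shuffled)
    n_train = n * 70 // 100
    n_val = n * 15 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test

# Placeholder data: 3000 integer IDs standing in for the real text examples.
train, val, test = split_70_15_15(range(3000))
print(len(train), len(val), len(test))  # 2100 450 450
```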

## Evaluation

Final evaluation results after running the notebook:

- **Accuracy:** 1.0000
- **Macro F1:** 1.0000
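For readers unfamiliar with the two metrics, here is a minimal pure-Python sketch of how accuracy and macro F1 are defined. It is not the notebook's evaluation code, which is not shown on this card, and the toy labels below are made up purely for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy labels for illustration only; these are not the project's predictions.
y_true = ["dog", "dog", "cat", "cat"]
y_pred = ["dog", "cat", "cat", "cat"]
print(f"accuracy={accuracy(y_true, y_pred):.4f}")  # accuracy=0.7500
print(f"macro_f1={macro_f1(y_true, y_pred):.4f}")  # macro_f1=0.7333
```

Macro F1 weights every class equally, so a class with few examples counts as much as a common one.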

## Intended Uses

This model is intended for educational demonstrations, animal learning activities, classroom examples, and small text-classification projects.

## Limitations

This model is not a general animal expert. It predicts only among the focused animal classes selected from the public dataset, so any input is assigned to one of those classes, including text about animals outside the selected training labels. It may also perform poorly on vague clues, misspellings, or prompts outside the project domain.

## How to Load the Model

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model_id = "cloudwoowoo/finalprojectanimal"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

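# Build a text-classification pipeline from the loaded model and tokenizer.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# The pipeline returns a list of dicts with "label" and "score" keys.
# Print it so the result is visible when run as a script.
print(classifier("Tell me a fact about a dog."))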
```