	---
	license: mit
	datasets:
	- ArtifactClfDurham/OrientalMuseum-white
	language:
	- en
	base_model:
	- google/efficientnet-b0
	tags:
	- artifact
	- museum
	---

	# Artifact Classification Model v2 - Best Model Usage Guide

	This directory contains the v2 artifact classification model, an improved successor to v1 that classifies museum artifacts by both object type and material.

	## Hosted Model

	The best model is available on Hugging Face at: [SpyC0der77/artifact-efficientnet](https://huggingface.co/SpyC0der77/artifact-efficientnet)

	You can load the model directly from the Hugging Face Hub without manually downloading the files first.

	## Model Overview

	The v2 model is an advanced multi-output neural network that predicts two attributes simultaneously:
	- Object Name: The type/category of the artifact (e.g., "vase", "statue", "pottery")
	- Material: The material composition (e.g., "ceramic", "bronze", "stone")

	### Key Improvements Over v1
	- EfficientNet Backbone: Uses EfficientNet-B0 instead of ResNet-50 for better feature extraction
	- Attention Mechanism: Includes an attention layer to focus on relevant features
	- Advanced Training: Incorporates CutMix augmentation, Focal Loss, and mixed precision training
	- Better Regularization: Uses dropout and batch normalization for improved generalization

	## Architecture & Usage

	The v2 model uses an EfficientNet-B0 backbone with an attention mechanism for multi-output classification. It processes RGB images of artifacts and outputs predictions for both object type and material composition.

	### Input
	- Format: RGB images (224×224 pixels after preprocessing)
	- Preprocessing: Resize to 256×256, center crop to 224×224, normalize with ImageNet statistics

	### Output
	- Object Classification: Predicts artifact type (e.g., "vase", "statue", "pottery")
	- Material Classification: Predicts material composition (e.g., "ceramic", "bronze", "stone")
	- Confidence Scores: Per-class probabilities, obtained by applying softmax to each head's logits
	- Format: Dictionary with `'object_name'` and `'material'` logits

	## Model Architecture

	```text
	ImprovedMultiOutputModel(
	  backbone: EfficientNet-B0 (pretrained)
	  attention: Linear(1280 → 512 → 1280) with Sigmoid
	  object_classifier: Linear(1280 → 1024 → 512 → num_object_classes)
	  material_classifier: Linear(1280 → 1024 → 512 → num_material_classes)
	)
	```
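The summary above can be read as the following PyTorch sketch. It covers only the attention gate and the two classification heads; the EfficientNet-B0 backbone is replaced by a random 1280-dimensional feature tensor. The class name `AttentionHeads`, the ReLU activations, the dropout rate, the placement of batch normalization, and the class counts are assumptions made for illustration, not the released implementation.

```python
import torch
from torch import nn


class AttentionHeads(nn.Module):
    """Attention gate plus two classification heads over 1280-d backbone features."""

    def __init__(self, num_object_classes: int, num_material_classes: int,
                 feat_dim: int = 1280, dropout: float = 0.3):
        super().__init__()
        # Gate: 1280 -> 512 -> 1280, squashed to (0, 1) with a sigmoid.
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, feat_dim), nn.Sigmoid(),
        )
        self.object_classifier = self._head(feat_dim, num_object_classes, dropout)
        self.material_classifier = self._head(feat_dim, num_material_classes, dropout)

    @staticmethod
    def _head(feat_dim: int, num_classes: int, dropout: float) -> nn.Sequential:
        # 1280 -> 1024 -> 512 -> num_classes; BN/ReLU/dropout placement is assumed.
        return nn.Sequential(
            nn.Linear(feat_dim, 1024), nn.BatchNorm1d(1024), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(1024, 512), nn.BatchNorm1d(512), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(512, num_classes),
        )

    def forward(self, features: torch.Tensor) -> dict:
        gated = features * self.attention(features)  # element-wise re-weighting
        return {
            "object_name": self.object_classifier(gated),
            "material": self.material_classifier(gated),
        }


# Placeholder class counts; eval() disables dropout and uses BN running statistics.
model = AttentionHeads(num_object_classes=10, num_material_classes=5).eval()
with torch.no_grad():
    out = model(torch.randn(4, 1280))  # stand-in for pooled backbone features
```

Both heads read the same gated feature vector, so the two predictions share one backbone forward pass.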

	### Input Requirements
	- Image Size: 224×224 pixels (automatically resized and cropped)
	- Format: RGB images
	- Normalization: ImageNet normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
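To make the numbers concrete, here is a dependency-free sketch of the crop geometry and the per-pixel normalization. It assumes pixel values are first scaled from 0-255 to [0, 1], which is the usual convention with ImageNet statistics; the helper names are hypothetical. In practice the same steps are normally expressed with `torchvision.transforms` (`Resize`, `CenterCrop`, `ToTensor`, `Normalize`).

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def center_crop_box(size: int = 256, crop: int = 224):
    """Return (left, top, right, bottom) of a centered crop inside a size x size image."""
    offset = (size - crop) // 2
    return (offset, offset, offset + crop, offset + crop)


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then apply per-channel ImageNet normalization."""
    return tuple((c / 255.0 - m) / s for c, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))


box = center_crop_box()                   # 16-pixel margin on every side
white = normalize_pixel((255, 255, 255))  # largest value each channel can take
black = normalize_pixel((0, 0, 0))        # smallest value each channel can take
```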

	### Output Format
	Returns a dictionary with:
	- `'object_name'`: Logits for object classification
	- `'material'`: Logits for material classification
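Since the model returns raw logits, turning them into a label and a confidence score takes one softmax per head. The sketch below shows that step in plain Python; the logits and label lists are made-up examples, and the real class lists come from the training data.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(outputs, labels):
    """Map each head's logits to a (predicted label, confidence) pair."""
    result = {}
    for head, logits in outputs.items():
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        result[head] = (labels[head][best], probs[best])
    return result


# Hypothetical logits and label lists, for illustration only.
outputs = {"object_name": [2.0, 0.5, -1.0], "material": [0.1, 3.0, 0.2]}
labels = {
    "object_name": ["vase", "statue", "pottery"],
    "material": ["ceramic", "bronze", "stone"],
}
prediction = decode(outputs, labels)
```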

	## Training Details

	The model was trained with the following configuration:

	- Dataset: ArtifactClfDurham/OrientalMuseum-white
	- Training Split: 85% of data
	- Validation Split: 15% of data
	- Batch Size: 32
	- Epochs: 20
	- Optimizer: AdamW with differential learning rates
	  - Backbone: 2e-4 (0.1× base LR)
	  - Heads: 2e-3 (base LR)
	- Augmentation: Advanced (CutMix, rotation, color jitter, Gaussian blur)
	- Loss Function: Cross-Entropy (or Focal Loss if enabled)
	- Scheduler: Cosine annealing with warmup
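The interaction between the two learning rates and the scheduler can be sketched as follows. The configuration above does not state the warmup length or shape, so linear warmup, decay to zero, and the step counts used here are assumptions; `lr_at` is a hypothetical helper, not part of the training code.

```python
import math


def lr_at(step: int, total_steps: int, warmup_steps: int, base_lr: float) -> float:
    """Linear warmup to base_lr, then cosine decay towards zero."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


BASE_LR = 2e-3                 # classification heads
BACKBONE_LR = 0.1 * BASE_LR    # pretrained backbone, 2e-4

total, warmup = 1000, 100      # hypothetical step counts
head_lrs = [lr_at(s, total, warmup, BASE_LR) for s in range(total)]
backbone_lrs = [lr_at(s, total, warmup, BACKBONE_LR) for s in range(total)]
```

Both parameter groups follow the same curve, so the backbone stays at one tenth of the head learning rate throughout training.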

	### Advanced Training Features

	- CutMix Augmentation: Randomly mixes image patches between samples
	- Focal Loss: Addresses class imbalance (optional)
	- Mixed Precision: Automatic mixed precision training for speed
	- Gradient Scaling: Scales the loss during mixed precision training to prevent gradient underflow
	- Early Stopping: Monitors validation accuracy and keeps the best-performing checkpoint
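Two of these features are easy to illustrate without a deep learning framework. The sketch below samples a CutMix patch and evaluates focal loss for a single sample. The focusing parameter `gamma=2.0` is a common default and an assumption here, as the value used in training is not stated; the class-weighting term of focal loss is omitted for brevity.

```python
import math
import random


def cutmix_box(height, width, lam, rng=random):
    """Sample a patch covering roughly (1 - lam) of the image: (top, left, bottom, right)."""
    ratio = math.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * ratio), int(width * ratio)
    cy, cx = rng.randrange(height), rng.randrange(width)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    return top, left, bottom, right


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t)."""
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


rng = random.Random(0)
top, left, bottom, right = cutmix_box(224, 224, lam=0.7, rng=rng)
# The patch may be clipped at the border, so the mixing weight is
# recomputed from the area that was actually pasted.
lam_adjusted = 1.0 - (bottom - top) * (right - left) / (224 * 224)

easy = focal_loss([0.9, 0.05, 0.05], target=0)  # confident and correct
hard = focal_loss([0.2, 0.5, 0.3], target=0)    # uncertain
```

With `gamma=0` the focal term disappears and the loss reduces to ordinary cross-entropy, which is why it can be treated as an optional switch.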

	## Troubleshooting

	### Common Issues

	1. CUDA Out of Memory
	   - Reduce batch size: `--batch_size 8`
	   - Use CPU: Set device to "cpu"

	2. Import Errors
	   - Ensure all dependencies are installed
	   - Check Python path includes project root

	3. Model Loading Errors
	   - Verify the model file path is correct
	   - Ensure PyTorch version compatibility

	4. Low Confidence Scores
	   - Model may not be trained on similar artifacts
	   - Check image preprocessing matches training setup

	### Performance Tips

	- Use GPU for faster inference
	- Process images in batches for efficiency
	- Consider model quantization for deployment
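For the batching tip, a small helper that groups inputs into fixed-size chunks is usually enough. This is a generic sketch with hypothetical file names; each yielded batch would then be preprocessed, stacked into one tensor, and passed through the model in a single forward call.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


# Hypothetical file names, for illustration only.
paths = [f"artifact_{i:03d}.jpg" for i in range(70)]
batches = list(batched(paths, batch_size=32))
```

Lowering `batch_size` in the same helper is also the simplest response to an out-of-memory error during inference.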

	## Model Limitations

	- Trained specifically on Oriental Museum artifacts
	- May not generalize well to artifacts from other cultures/regions
	- Performance depends on image quality and lighting
	- Both heads share one backbone, so improving accuracy on one task (object or material) may come at the cost of the other