---
license: mit
datasets:
- ArtifactClfDurham/OrientalMuseum-white
language:
- en
base_model:
- google/efficientnet-b0
tags:
- artifact
- museum
---
# Artifact Classification Model v2 - Best Model Usage Guide
This directory contains the improved v2 artifact classification model, which classifies museum artifacts by both object type and material.
## Hosted Model
The best model is available on Hugging Face at: **[SpyC0der77/artifact-efficientnet](https://huggingface.co/SpyC0der77/artifact-efficientnet)**
You can use the model directly from Hugging Face without downloading it locally.
## Model Overview
The v2 model is an advanced multi-output neural network that predicts two attributes simultaneously:
- **Object Name**: The type/category of the artifact (e.g., "vase", "statue", "pottery")
- **Material**: The material composition (e.g., "ceramic", "bronze", "stone")
### Key Improvements Over v1
- **EfficientNet Backbone**: Uses EfficientNet-B0 instead of ResNet-50 for better feature extraction
- **Attention Mechanism**: Includes an attention layer to focus on relevant features
- **Advanced Training**: Incorporates CutMix augmentation, Focal Loss, and mixed precision training
- **Better Regularization**: Uses dropout and batch normalization for improved generalization
## Architecture & Usage
The v2 model uses an EfficientNet-B0 backbone with an attention mechanism for multi-output classification. It processes RGB images of artifacts and outputs predictions for both object type and material composition.
### Input
- **Format**: RGB images (224×224 pixels after preprocessing)
- **Preprocessing**: Resize to 256×256, center crop to 224×224, normalize with ImageNet statistics
### Output
- **Object Classification**: Predicts artifact type (e.g., "vase", "statue", "pottery")
- **Material Classification**: Predicts material composition (e.g., "ceramic", "bronze", "stone")
- **Confidence Scores**: Probability scores for each prediction
- **Format**: Dictionary with 'object_name' and 'material' logits
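
The logits in the output dictionary become confidence scores once a softmax is applied to each head. The following is a minimal sketch in plain Python, with lists standing in for tensors; the label lists and logit values are hypothetical placeholders, not the model's real classes. In PyTorch the same step would typically be `torch.softmax(logits, dim=1)`.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_prediction(logits, labels):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical label lists and logits; the real ones come from the trained model.
object_labels = ["vase", "statue", "pottery"]
material_labels = ["ceramic", "bronze", "stone"]
outputs = {
    "object_name": [2.0, 0.5, -1.0],
    "material": [0.1, 3.0, 0.2],
}

object_pred = top_prediction(outputs["object_name"], object_labels)
material_pred = top_prediction(outputs["material"], material_labels)
print(f"object: {object_pred[0]} ({object_pred[1]:.2f})")
print(f"material: {material_pred[0]} ({material_pred[1]:.2f})")
```

Each head is decoded independently, so the two confidence scores are not required to agree or to sum to anything jointly.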
## Model Architecture
```python
ImprovedMultiOutputModel(
backbone: EfficientNet-B0 (pretrained)
attention: Linear(1280 → 512 → 1280) with Sigmoid
object_classifier: Linear(1280 → 1024 → 512 → num_object_classes)
material_classifier: Linear(1280 → 1024 → 512 → num_material_classes)
)
```
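
To get a feel for how much the attention block and the two heads add on top of the backbone, the layer widths in the summary can be turned into a rough parameter count. This is a sketch under stated assumptions: every linear layer is assumed to have a bias, batch normalization parameters are ignored, and the class counts are hypothetical placeholders.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of one fully connected layer (bias assumed)."""
    return in_features * out_features + out_features

def mlp_params(widths):
    """Total parameters for a chain of linear layers with the given widths."""
    return sum(linear_params(a, b) for a, b in zip(widths, widths[1:]))

# Hypothetical class counts; the real values depend on the dataset labels.
num_object_classes = 100
num_material_classes = 20

attention = mlp_params([1280, 512, 1280])
object_head = mlp_params([1280, 1024, 512, num_object_classes])
material_head = mlp_params([1280, 1024, 512, num_material_classes])

print(f"attention:     {attention:,}")
print(f"object head:   {object_head:,}")
print(f"material head: {material_head:,}")
```

Only the final layer of each head depends on the number of classes, so the two heads differ in size only by that last projection.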
### Input Requirements
- **Image Size**: 224×224 pixels (automatically resized and cropped)
- **Format**: RGB images
- **Normalization**: ImageNet normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
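
The arithmetic behind these requirements can be sketched in plain Python: where the 224×224 center crop sits inside a 256×256 image, and how one RGB pixel is normalized. The helper names are illustrative only. In a torchvision pipeline the equivalent steps would usually be `transforms.Resize`, `transforms.CenterCrop`, `transforms.ToTensor`, and `transforms.Normalize`.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def center_crop_box(size, crop):
    """Return (left, top, right, bottom) of a centered square crop."""
    offset = (size - crop) // 2
    return (offset, offset, offset + crop, offset + crop)

def normalize_pixel(rgb):
    """Scale 0-255 channel values to [0, 1], then standardize per channel."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )

box = center_crop_box(256, 224)
white = normalize_pixel((255, 255, 255))
print("crop box:", box)
print("normalized white pixel:", tuple(round(c, 3) for c in white))
```

Skipping the scaling to [0, 1] before normalization is a common source of poor predictions, because the mean and standard deviation are defined on that range.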
### Output Format
Returns a dictionary with:
- `'object_name'`: Logits for object classification
- `'material'`: Logits for material classification
## Training Details
The model was trained with the following configuration:
- **Dataset**: ArtifactClfDurham/OrientalMuseum-white
- **Training Split**: 85% of data
- **Validation Split**: 15% of data
- **Batch Size**: 32
- **Epochs**: 20
- **Optimizer**: AdamW with differential learning rates
  - Backbone: 2e-4 (0.1× base LR)
  - Heads: 2e-3 (base LR)
- **Augmentation**: Advanced (CutMix, rotation, color jitter, Gaussian blur)
- **Loss Function**: Cross-Entropy (or Focal Loss if enabled)
- **Scheduler**: Cosine annealing with warmup
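
The differential learning rates and the cosine schedule with warmup can be illustrated with a small function. This is a sketch, not the training script: the linear warmup shape, the warmup length, the number of steps per epoch, and the minimum learning rate of zero are all assumptions made for the example.

```python
import math

def lr_at(step, total_steps, base_lr, warmup_steps, min_lr=0.0):
    """Linear warmup to base_lr, then cosine decay towards min_lr."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

HEAD_LR = 2e-3               # base LR from the configuration above
BACKBONE_LR = 0.1 * HEAD_LR  # 2e-4, i.e. 0.1x the base LR

# Hypothetical schedule length: 20 epochs of 100 steps, one warmup epoch.
total_steps, warmup_steps = 2000, 100
for step in (0, 99, 100, 1050, 1999):
    head = lr_at(step, total_steps, HEAD_LR, warmup_steps)
    backbone = lr_at(step, total_steps, BACKBONE_LR, warmup_steps)
    print(f"step {step:4d}: head {head:.6f}, backbone {backbone:.6f}")
```

In PyTorch, the two rates would normally be passed to `torch.optim.AdamW` as separate parameter groups, each with its own `lr`, so that the pretrained backbone changes more slowly than the freshly initialized heads.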
### Advanced Training Features
- **CutMix Augmentation**: Randomly mixes image patches between samples
- **Focal Loss**: Addresses class imbalance (optional)
- **Mixed Precision**: Automatic mixed precision training for speed
- **Gradient Scaling**: Scales the loss during mixed precision training so that small gradients do not underflow in float16
- **Early Stopping**: Saves best model based on validation accuracy
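
Two of these features reduce to short formulas, sketched below in plain Python. The CutMix helper follows the commonly used formulation, in which the patch side scales with `sqrt(1 - lam)` and the label mix ratio is corrected for clipping at the image border. The focal loss helper uses `gamma=2.0`, which is a widely used default and an assumption here, since the value used in training is not stated. In practice `lam` is usually drawn from a Beta distribution and the patch center is sampled uniformly; both are passed in explicitly here to keep the example deterministic.

```python
import math

def cutmix_box(width, height, lam, cx, cy):
    """Patch to paste from a second image, given mix ratio lam and a center.

    Returns the clipped box (x1, y1, x2, y2) and the area-corrected
    mix ratio to use when blending the two label sets.
    """
    cut_ratio = math.sqrt(1.0 - lam)
    cut_w, cut_h = int(width * cut_ratio), int(height * cut_ratio)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    adjusted_lam = 1.0 - (x2 - x1) * (y2 - y1) / (width * height)
    return (x1, y1, x2, y2), adjusted_lam

def focal_loss(p_true, gamma=2.0):
    """Focal loss for one sample; p_true is the true-class probability."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)

patch, mix = cutmix_box(224, 224, lam=0.75, cx=112, cy=112)
print("patch:", patch, "label mix ratio:", mix)
print("easy sample:", focal_loss(0.9), "hard sample:", focal_loss(0.1))
```

With `gamma=0` the focal loss reduces to ordinary cross-entropy. Larger values shrink the loss of well-classified samples, which is why it helps with class imbalance.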
## Troubleshooting
### Common Issues
1. **CUDA Out of Memory**
   - Reduce batch size: `--batch_size 8`
   - Use CPU: Set device to "cpu"
2. **Import Errors**
   - Ensure all dependencies are installed
   - Check Python path includes project root
3. **Model Loading Errors**
   - Verify the model file path is correct
   - Ensure PyTorch version compatibility
4. **Low Confidence Scores**
   - Model may not be trained on similar artifacts
   - Check image preprocessing matches training setup
### Performance Tips
- Use GPU for faster inference
- Process images in batches for efficiency
- Consider model quantization for deployment
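
Batched processing can be as simple as grouping image paths before they are loaded and passed to the model. The helper below is a generic sketch, and the file names are hypothetical placeholders. Each yielded batch would then be preprocessed, stacked into one tensor, and run through the model in a single forward pass.

```python
from itertools import islice

def batched(items, batch_size):
    """Yield consecutive lists of up to batch_size items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

# Hypothetical file names standing in for real image paths.
image_paths = [f"artifact_{i:03d}.jpg" for i in range(10)]
batches = list(batched(image_paths, batch_size=4))
for index, batch in enumerate(batches):
    print(f"batch {index}: {len(batch)} images")
```

If memory is tight, lowering `batch_size` trades throughput for a smaller peak footprint.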
## Model Limitations
- Trained specifically on Oriental Museum artifacts
- May not generalize well to artifacts from other cultures/regions
- Performance depends on image quality and lighting
- Multi-output nature may have trade-offs between object and material accuracy